Feature Schema

UniMorph annotates morphological forms with a standardized feature schema. Features are separated by semicolons, and their order follows a convention that is specific to each language.

For the complete official specification, see the UniMorph Schema documentation (PDF).

Feature Format

FEATURE1;FEATURE2;FEATURE3;...

Example: V;IND;PRS;1;SG means:

  • V = Verb
  • IND = Indicative mood
  • PRS = Present tense
  • 1 = First person
  • SG = Singular number
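As a rough illustration of the format, the following standard-library-only sketch splits a tag into its features and joins them back. The names `parse_tag` and `format_tag` are hypothetical helpers made up for this example; they are not part of the unimorph crates.

```rust
/// Split a UniMorph tag such as "V;IND;PRS;1;SG" into its individual features.
/// Surrounding whitespace is trimmed and empty segments (for example from a
/// trailing semicolon) are dropped.
fn parse_tag(tag: &str) -> Vec<&str> {
    tag.split(';')
        .map(str::trim)
        .filter(|f| !f.is_empty())
        .collect()
}

/// Join features back into the semicolon-separated form.
fn format_tag(features: &[&str]) -> String {
    features.join(";")
}
```

Calling `parse_tag("V;IND;PRS;1;SG")` yields the five features listed above, in the same order.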

Feature Dimensions

Part of Speech

Feature  Description
V        Verb
N        Noun
ADJ      Adjective
ADV      Adverb
PRO      Pronoun
DET      Determiner
ADP      Adposition
NUM      Numeral
CONJ     Conjunction
PART     Particle
INTJ     Interjection
V.MSDR   Verbal noun / Masdar
V.PTCP   Participle
V.CVB    Converb

Person

Feature  Description
1        First person
2        Second person
3        Third person
4        Fourth person (obviative)
INCL     Inclusive
EXCL     Exclusive

Number

Feature  Description
SG       Singular
PL       Plural
DU       Dual
TRI      Trial
PAUC     Paucal
GRPL     Greater plural

Gender

Feature  Description
MASC     Masculine
FEM      Feminine
NEUT     Neuter
NAKH1-8  Nakh-Daghestanian noun classes

Case

Feature  Description
NOM      Nominative
ACC      Accusative
GEN      Genitive
DAT      Dative
INS      Instrumental
LOC      Locative
ABL      Ablative
VOC      Vocative
ESS      Essive
TRANS    Translative
COM      Comitative
PRIV     Privative
PRT      Partitive

Many more case features exist beyond those listed here.

Tense

Feature  Description
PRS      Present
PST      Past
FUT      Future
IPFV     Imperfective
PFV      Perfective
PRF      Perfect
PLPRF    Pluperfect
PROSP    Prospective

Aspect

Feature  Description
IPFV     Imperfective
PFV      Perfective
HAB      Habitual
PROG     Progressive
ITER     Iterative

Mood

Feature  Description
IND      Indicative
SBJV     Subjunctive
IMP      Imperative
COND     Conditional
OPT      Optative
POT      Potential
PURP     Purposive

Voice

Feature  Description
ACT      Active
PASS     Passive
MID      Middle
ANTIP    Antipassive
CAUS     Causative

Finiteness

Feature  Description
FIN      Finite
NFIN     Non-finite

Definiteness

Feature  Description
DEF      Definite
NDEF     Indefinite
SPEC     Specific
NSPEC    Non-specific

Comparison

Feature  Description
CMPR     Comparative
SPRL     Superlative

Polarity

Feature  Description
POS      Positive
NEG      Negative

Possession

Feature  Description
PSS1S    1st person singular possessor
PSS2S    2nd person singular possessor
PSS3S    3rd person singular possessor
PSS1P    1st person plural possessor
PSS2P    2nd person plural possessor
PSS3P    3rd person plural possessor
PSSD     Possessed form

Language-Specific Features

Some languages have additional features not listed above. Use unimorph features -l <lang> --list to see all features used in a specific language.
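To make the relationship between features and dimensions concrete, here is a minimal sketch of a lookup that covers only a subset of the tables in this section. `dimension_of` and `describe` are hypothetical names for illustration, not functions of the unimorph crates. IPFV and PFV are listed under both Tense and Aspect above; this sketch arbitrarily files them under Aspect. Anything it does not recognize, such as a language-specific feature, comes back as `None`.

```rust
/// Return the dimension a feature belongs to, for a subset of the features
/// documented in this section. Unknown (e.g. language-specific) features
/// yield `None`.
fn dimension_of(feature: &str) -> Option<&'static str> {
    let dimension = match feature {
        "V" | "N" | "ADJ" | "ADV" | "PRO" | "DET" | "ADP" | "NUM" => "Part of Speech",
        "1" | "2" | "3" | "4" | "INCL" | "EXCL" => "Person",
        "SG" | "PL" | "DU" | "TRI" | "PAUC" | "GRPL" => "Number",
        "MASC" | "FEM" | "NEUT" => "Gender",
        "PRS" | "PST" | "FUT" => "Tense",
        // Assumption: IPFV/PFV are treated as aspect here.
        "IPFV" | "PFV" | "HAB" | "PROG" | "ITER" => "Aspect",
        "IND" | "SBJV" | "IMP" | "COND" | "OPT" | "POT" | "PURP" => "Mood",
        _ => return None,
    };
    Some(dimension)
}

/// Pair every feature of a semicolon-separated tag with its dimension, if known.
fn describe(tag: &str) -> Vec<(&str, Option<&'static str>)> {
    tag.split(';').map(|f| (f, dimension_of(f))).collect()
}
```

A lookup like this depends only on the feature itself, so it gives the same answer regardless of where the feature sits in the tag.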

Feature Position

Feature positions vary by language. For example:

Hebrew verbs: V;PERSON;NUMBER;TENSE;GENDER

V;1;SG;PST     (1st person singular past)
V;3;PL;FUT;MASC (3rd person plural future masculine)

Spanish verbs: V;MOOD;TENSE;PERSON;NUMBER

V;IND;PRS;1;SG  (indicative present 1st singular)
V;SBJV;PST;3;PL (subjunctive past 3rd plural)
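One way to picture a per-language ordering is to pair each feature with the slot at the same position in the language's template. The sketch below does exactly that with the standard library; `label_by_position` is a hypothetical helper, and the assumption that slots are filled strictly left to right (with only trailing slots omitted) is taken from the two examples above rather than from the unimorph crates.

```rust
/// Pair each feature in `tag` with the slot name at the same position in
/// `template`. Pairing stops at the shorter of the two, so trailing slots
/// that the tag does not fill are simply left out.
fn label_by_position<'a>(template: &'a str, tag: &'a str) -> Vec<(&'a str, &'a str)> {
    template.split(';').zip(tag.split(';')).collect()
}
```

With the Spanish template `V;MOOD;TENSE;PERSON;NUMBER`, the tag `V;IND;PRS;1;SG` pairs MOOD with IND, TENSE with PRS, PERSON with 1, and NUMBER with SG. With the Hebrew template, `V;1;SG;PST` fills four slots and leaves GENDER unfilled.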

Working with Features

CLI

# List all features in a language
unimorph features -l heb --list

# See feature statistics
unimorph features -l heb --stats

# Find entries with a feature
unimorph features -l heb --search FUT

# Search by feature pattern
unimorph search -l heb -f "V;1;SG;*"

# Search by contained features
unimorph search -l heb --contains PL,MASC

Library

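use unimorph_core::FeatureBundle;

// Assumes the error returned by `parse` implements `std::error::Error`.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse a semicolon-separated tag into a feature bundle
    let features: FeatureBundle = "V;1;SG;PST".parse()?;

    // Check for a specific feature
    if features.contains("PST") {
        println!("Past tense");
    }

    // Match against a pattern; `*` acts as a wildcard
    if features.matches("V;*;SG;*") {
        println!("Singular verb");
    }

    Ok(())
}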
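For readers who want to see what the two checks amount to, here is a standard-library-only sketch that works on plain strings. It is not the implementation of `FeatureBundle`: `contains_feature` and `matches_pattern` are hypothetical helpers, and the wildcard semantics (each `*` stands for exactly one feature, and pattern and tag must have the same number of positions) are an assumption that the library may not share.

```rust
/// True if `feature` appears as a whole feature anywhere in `tag`.
fn contains_feature(tag: &str, feature: &str) -> bool {
    tag.split(';').any(|f| f == feature)
}

/// True if `tag` matches `pattern` position by position.
/// Assumption: `*` matches exactly one feature, and the number of
/// positions in the pattern must equal the number of features in the tag.
fn matches_pattern(tag: &str, pattern: &str) -> bool {
    let features: Vec<&str> = tag.split(';').collect();
    let slots: Vec<&str> = pattern.split(';').collect();
    features.len() == slots.len()
        && features
            .iter()
            .zip(&slots)
            .all(|(feature, slot)| *slot == "*" || feature == slot)
}
```

Under these assumptions, `matches_pattern("V;1;SG;PST", "V;*;SG;*")` holds, while a five-feature tag such as `V;3;PL;FUT;MASC` does not match a four-position pattern.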

References