Introduction

unimorph-rs is a Rust toolkit for working with UniMorph morphological data. It provides a command-line interface and a Rust library for downloading, querying, and analyzing morphological inflection data across 180+ languages.

What is UniMorph?

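UniMorph is a collaborative project providing morphological paradigms for the world's languages. Each language dataset contains entries that map a lemma (dictionary form) to one of its inflected forms, together with morphological feature annotations written as semicolon-separated tags.

For example, in Spanish, the first entry below tags hablo as a verb (V), indicative (IND), present (PRS), first person (1), singular (SG):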

Lemma	Form	Features
hablar	hablo	V;IND;PRS;1;SG
hablar	hablas	V;IND;PRS;2;SG
hablar	habla	V;IND;PRS;3;SG
hablar	hablamos	V;IND;PRS;1;PL
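
The entry layout above can be read with a few lines of Rust. The following is a minimal sketch, assuming each entry is one tab-separated line of lemma, form, and semicolon-separated features, as in the table. The `Entry` struct and `parse_line` function are hypothetical names used for illustration and are not part of the unimorph-rs API.

```rust
/// One UniMorph entry: lemma, inflected form, and feature tags.
/// (Hypothetical type for illustration; not the unimorph-rs API.)
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    lemma: String,
    form: String,
    features: Vec<String>,
}

/// Parse one tab-separated line into an Entry.
/// Returns None when the line does not have three non-empty columns.
fn parse_line(line: &str) -> Option<Entry> {
    let line = line.trim_end_matches(|c| c == '\r' || c == '\n');
    let mut cols = line.split('\t');
    let lemma = cols.next()?;
    let form = cols.next()?;
    let features = cols.next()?;
    if lemma.is_empty() || form.is_empty() || features.is_empty() {
        return None;
    }
    Some(Entry {
        lemma: lemma.to_string(),
        form: form.to_string(),
        // Feature tags are separated by semicolons, e.g. "V;IND;PRS;1;SG".
        features: features.split(';').map(str::to_string).collect(),
    })
}

fn main() {
    let data = "hablar\thablo\tV;IND;PRS;1;SG\nhablar\thablamos\tV;IND;PRS;1;PL\n";
    // Lines that fail to parse are skipped rather than treated as errors.
    for entry in data.lines().filter_map(parse_line) {
        println!("{} -> {} {:?}", entry.lemma, entry.form, entry.features);
    }
}
```

Splitting the feature string into separate tags up front makes later filtering by individual features (for example, all PL forms) a simple membership check.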

Features

Fast lookups: SQLite-backed storage with indexed queries
180+ languages: Access to all UniMorph language datasets
Transparent decompression: Handles .xz, .gz, and .zip compressed datasets automatically
Flexible querying: Search by lemma, form, features, or part of speech
Multiple output formats: Table, JSON, and TSV, suitable for scripting
Pipe-friendly: Output designed for Unix pipelines
Offline-first: Data cached locally after download
Library + CLI: Use as a Rust library or command-line tool

Use Cases

Language learners: Look up conjugations and declensions
NLP researchers: Obtain training data for morphological models
Lexicographers: Verify inflection paradigms
Educators: Build conjugation practice tools
Linguists: Perform cross-linguistic morphological analysis

Quick Example

# Download the Hebrew dataset
unimorph download heb

# Look up all forms of a verb
unimorph inflect -l heb כתב

# Analyze a surface form
unimorph analyze -l heb כתבתי

# Search for plural masculine forms
unimorph search -l heb --contains PL,MASC --limit 10
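
The three query commands above differ in which column they match on. The sketch below illustrates that distinction over an in-memory list, using the Spanish entries shown earlier. It is not the unimorph-rs library API: the `Entry` type and the `inflect`, `analyze`, and `search` functions are hypothetical, and the sketch assumes that a feature search matches entries carrying every requested tag.

```rust
/// Hypothetical in-memory entry; features kept as the raw tag string.
struct Entry {
    lemma: &'static str,
    form: &'static str,
    features: &'static str,
}

/// Lemma -> all of its entries (the direction `inflect` goes).
fn inflect<'a>(entries: &'a [Entry], lemma: &str) -> Vec<&'a Entry> {
    entries.iter().filter(|e| e.lemma == lemma).collect()
}

/// Surface form -> entries that produce it (the direction `analyze` goes).
fn analyze<'a>(entries: &'a [Entry], form: &str) -> Vec<&'a Entry> {
    entries.iter().filter(|e| e.form == form).collect()
}

/// True when every wanted tag appears among the semicolon-separated features.
fn has_all(features: &str, wanted: &[&str]) -> bool {
    wanted.iter().all(|w| features.split(';').any(|f| f == *w))
}

/// Feature search: assumed here to require all wanted tags.
fn search<'a>(entries: &'a [Entry], wanted: &[&str]) -> Vec<&'a Entry> {
    entries.iter().filter(|e| has_all(e.features, wanted)).collect()
}

/// The Spanish sample entries from the table earlier in this page.
fn sample() -> Vec<Entry> {
    vec![
        Entry { lemma: "hablar", form: "hablo", features: "V;IND;PRS;1;SG" },
        Entry { lemma: "hablar", form: "hablas", features: "V;IND;PRS;2;SG" },
        Entry { lemma: "hablar", form: "habla", features: "V;IND;PRS;3;SG" },
        Entry { lemma: "hablar", form: "hablamos", features: "V;IND;PRS;1;PL" },
    ]
}

fn main() {
    let data = sample();
    for e in inflect(&data, "hablar") {
        println!("inflect: {} {}", e.form, e.features);
    }
    for e in analyze(&data, "hablas") {
        println!("analyze: {} {}", e.lemma, e.features);
    }
    for e in search(&data, &["PRS", "PL"]) {
        println!("search: {} {}", e.lemma, e.form);
    }
}
```

A linear scan is enough for a handful of entries. The indexed SQLite storage mentioned under Features serves the same lookups without scanning the whole dataset.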

Getting Started

Head to the Installation guide to get started, or jump straight to the Quick Start for a hands-on introduction.