Fuzzy Functions

Fuzzy matching and string similarity functions.

Summary

| Function | Signature | Description |
|---|---|---|
| `damerau_levenshtein` | `string, string -> number` | Damerau-Levenshtein distance |
| `hamming` | `string, string -> number\|null` | Hamming distance (number of differing positions). Returns null if the strings have different lengths |
| `jaro` | `string, string -> number` | Jaro similarity (0-1) |
| `jaro_winkler` | `string, string -> number` | Jaro-Winkler similarity (0-1) |
| `levenshtein` | `string, string -> number` | Levenshtein edit distance |
| `normalized_damerau_levenshtein` | `string, string -> number` | Normalized Damerau-Levenshtein similarity (0-1) |
| `normalized_levenshtein` | `string, string -> number` | Normalized Levenshtein similarity (0-1) |
| `osa_distance` | `string, string -> number` | Optimal String Alignment distance (like Levenshtein but allows adjacent transpositions) |
| `sorensen_dice` | `string, string -> number` | Sorensen-Dice coefficient (0-1) |

Functions

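damerau_levenshtein

Damerau-Levenshtein distance: the minimum number of insertions, deletions, substitutions, and transpositions of adjacent characters needed to turn one string into the other.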

Signature: string, string -> number

Examples:

# Single transposition
damerau_levenshtein('ab', 'ba') -> 1
# Identical strings
damerau_levenshtein('hello', 'hello') -> 0
# Multiple edits
damerau_levenshtein('ca', 'abc') -> 2

CLI Usage:

echo '{}' | jpx 'damerau_levenshtein(`"ab"`, `"ba"`)'
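
To show where the example values come from, here is a minimal Python sketch of the textbook unrestricted Damerau-Levenshtein algorithm. It is an illustration only and is not taken from the jpx source; the function and variable names are hypothetical.

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Textbook unrestricted Damerau-Levenshtein distance (illustrative sketch)."""
    inf = len(a) + len(b)  # sentinel larger than any real distance
    # The table is shifted by one row and one column; row 0 and column 0 hold `inf`.
    d = [[inf] * (len(b) + 2) for _ in range(len(a) + 2)]
    for i in range(len(a) + 1):
        d[i + 1][1] = i
    for j in range(len(b) + 1):
        d[1][j + 1] = j

    last_row = {}  # last row (1-based) in which each character of `a` was seen
    for i in range(1, len(a) + 1):
        last_col = 0  # last column in this row where the characters matched
        for j in range(1, len(b) + 1):
            k = last_row.get(b[j - 1], 0)
            l = last_col
            cost = 1
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            d[i + 1][j + 1] = min(
                d[i][j] + cost,                           # substitution or match
                d[i + 1][j] + 1,                          # insertion
                d[i][j + 1] + 1,                          # deletion
                d[k][l] + (i - k - 1) + 1 + (j - l - 1),  # transposition
            )
        last_row[a[i - 1]] = i
    return d[len(a) + 1][len(b) + 1]


print(damerau_levenshtein("ab", "ba"))   # 1
print(damerau_levenshtein("ca", "abc"))  # 2
```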

hamming

Hamming distance: the number of positions at which two strings of equal length differ. Returns null if the strings have different lengths.

Signature: string, string -> number|null

Examples:

# Three differing positions
hamming('karolin', 'kathrin') -> 3
# Identical strings
hamming('hello', 'hello') -> 0
# Different lengths
hamming('hello', 'hi') -> null

CLI Usage:

echo '{}' | jpx 'hamming(`"karolin"`, `"kathrin"`)'
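
The behavior described above can be sketched in a few lines of Python. This is an illustration rather than the jpx implementation, and it uses `None` to stand in for JSON null.

```python
from typing import Optional


def hamming(a: str, b: str) -> Optional[int]:
    """Count differing positions; None when the lengths differ (illustrative sketch)."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


print(hamming("karolin", "kathrin"))  # 3
print(hamming("hello", "hi"))         # None
```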

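jaro

Jaro similarity, from 0 (no similarity) to 1 (identical). The score depends on the number of matching characters and on how many of them appear in a different order.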

Signature: string, string -> number

Examples:

# Similar words
jaro('hello', 'hallo') -> 0.866...
# Identical strings
jaro('hello', 'hello') -> 1.0
# Completely different
jaro('abc', 'xyz') -> 0.0

CLI Usage:

echo '{}' | jpx 'jaro(`"hello"`, `"hallo"`)'
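
A minimal Python sketch of the standard Jaro formula shows how 0.866... arises for 'hello' and 'hallo': four characters match with no transpositions, giving (4/5 + 4/5 + 4/4) / 3. The code is an illustration under the usual definition, not the jpx source.

```python
def jaro(a: str, b: str) -> float:
    """Standard Jaro similarity (illustrative sketch)."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters only match if they are at most `window` positions apart.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo = max(0, i - window)
        hi = min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up with a different character count as half a transposition.
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (
        matches / len(a)
        + matches / len(b)
        + (matches - transpositions) / matches
    ) / 3


print(round(jaro("hello", "hallo"), 4))  # 0.8667
```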

jaro_winkler

Jaro-Winkler similarity (0-1): the Jaro similarity with a bonus for strings that share a common prefix.

Signature: string, string -> number

Examples:

# Similar words
jaro_winkler('hello', 'hallo') -> 0.88
# Identical strings
jaro_winkler('hello', 'hello') -> 1.0
# Completely different
jaro_winkler('abc', 'xyz') -> 0.0

CLI Usage:

echo '{}' | jpx 'jaro_winkler(`"hello"`, `"hallo"`)'
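
The sketch below applies the conventional Winkler adjustment on top of a Jaro score. The scaling factor of 0.1 and the prefix cap of 4 characters are the customary defaults and are assumptions here; the page does not state which values jpx uses. With those values, 'hello' and 'hallo' share a one-character prefix, so the score is 0.8667 + 1 × 0.1 × (1 − 0.8667) = 0.88.

```python
def jaro(a: str, b: str) -> float:
    """Standard Jaro similarity (illustrative sketch)."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (
        matches / len(a)
        + matches / len(b)
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(a: str, b: str, scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro score boosted by the shared prefix length.

    `scale` and `max_prefix` are assumed conventional defaults.
    """
    base = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:max_prefix], b[:max_prefix]):
        if x != y:
            break
        prefix += 1
    return base + prefix * scale * (1.0 - base)


print(round(jaro_winkler("hello", "hallo"), 4))  # 0.88
```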

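levenshtein

Levenshtein edit distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other.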

Signature: string, string -> number

Examples:

# Classic example
levenshtein('kitten', 'sitting') -> 3
# Identical strings
levenshtein('hello', 'hello') -> 0
# All different
levenshtein('abc', 'def') -> 3

CLI Usage:

echo '{}' | jpx 'levenshtein(`"kitten"`, `"sitting"`)'
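
For reference, the classic dynamic-programming formulation can be sketched in Python as follows. It keeps only two rows of the table and is meant as an illustration, not as the jpx implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Two-row dynamic-programming Levenshtein distance (illustrative sketch)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,               # deletion
                    curr[j - 1] + 1,           # insertion
                    prev[j - 1] + (ca != cb),  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```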

normalized_damerau_levenshtein

Normalized Damerau-Levenshtein similarity, from 0 (completely different) to 1 (identical).

Signature: string, string -> number

Examples:

# Identical strings
normalized_damerau_levenshtein('hello', 'hello') -> 1.0
# Transposition
normalized_damerau_levenshtein('ab', 'ba') -> 0.5
# Completely different
normalized_damerau_levenshtein('abc', 'xyz') -> 0.0

CLI Usage:

echo '{}' | jpx 'normalized_damerau_levenshtein(`"hello"`, `"hello"`)'
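
The examples are consistent with normalizing the distance by the length of the longer string and subtracting from 1. The sketch below assumes that formula, which the page does not state, and takes an already computed distance as input; `normalized_similarity` is a hypothetical helper name.

```python
def normalized_similarity(distance: int, a: str, b: str) -> float:
    """Map an edit distance to a 0-1 similarity.

    Assumes normalization by the longer string's length.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - distance / longest


# Damerau-Levenshtein distance between 'ab' and 'ba' is 1 (one transposition).
print(normalized_similarity(1, "ab", "ba"))    # 0.5
# Distance between 'abc' and 'xyz' is 3 (three substitutions).
print(normalized_similarity(3, "abc", "xyz"))  # 0.0
```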

normalized_levenshtein

Normalized Levenshtein similarity, from 0 (completely different) to 1 (identical).

Signature: string, string -> number

Examples:

# One edit
normalized_levenshtein('ab', 'abc') -> 0.666...
# Identical
normalized_levenshtein('hello', 'hello') -> 1.0
# All different
normalized_levenshtein('abc', 'xyz') -> 0.0

CLI Usage:

echo '{}' | jpx 'normalized_levenshtein(`"ab"`, `"abc"`)'
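
The value 0.666... for 'ab' and 'abc' equals 1 − 1/3: one edit divided by the longer length, subtracted from 1. The Python sketch below assumes that similarity formula; it is inferred from the example and is not confirmed against the jpx source.

```python
def levenshtein(a: str, b: str) -> int:
    """Two-row dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Similarity 1 - distance / longer length (assumed formula)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


print(round(normalized_levenshtein("ab", "abc"), 4))  # 0.6667
```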

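osa_distance

Optimal String Alignment distance: like Levenshtein, but a swap of two adjacent characters also counts as a single edit. Unlike Damerau-Levenshtein, no substring may be edited more than once, which is why 'ca' and 'abc' are 3 apart here but 2 apart under `damerau_levenshtein`.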

Signature: string, string -> number

Examples:

# Single transposition
osa_distance('ab', 'ba') -> 1
# Identical strings
osa_distance('hello', 'hello') -> 0
# Multiple edits
osa_distance('ca', 'abc') -> 3

CLI Usage:

echo '{}' | jpx 'osa_distance(`"ab"`, `"ba"`)'
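
A minimal Python sketch of the usual OSA recurrence follows. The transposition step only looks back two cells diagonally, so a transposed pair cannot be edited again afterwards. This is an illustration, not the jpx source.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal String Alignment distance (illustrative sketch)."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition: the last two characters are swapped.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]


print(osa_distance("ab", "ba"))   # 1
print(osa_distance("ca", "abc"))  # 3
```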

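sorensen_dice

Sorensen-Dice coefficient (0-1), computed over character bigrams: twice the number of shared bigrams divided by the total number of bigrams in both strings.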

Signature: string, string -> number

Examples:

# Similar words
sorensen_dice('night', 'nacht') -> 0.25
# Identical strings
sorensen_dice('hello', 'hello') -> 1.0
# No common bigrams
sorensen_dice('abc', 'xyz') -> 0.0

CLI Usage:

echo '{}' | jpx 'sorensen_dice(`"night"`, `"nacht"`)'
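
The 0.25 result for 'night' and 'nacht' follows from the bigram form of the coefficient: each word has four bigrams and only `ht` is shared, so 2 × 1 / (4 + 4) = 0.25. The Python sketch below assumes plain character bigrams with no whitespace stripping or case folding; any such preprocessing in jpx is not described on this page.

```python
from collections import Counter


def sorensen_dice(a: str, b: str) -> float:
    """Sorensen-Dice coefficient over character bigrams (illustrative sketch)."""
    if a == b:
        return 1.0
    if len(a) < 2 or len(b) < 2:
        return 0.0  # at least one string has no bigrams
    a_bigrams = Counter(a[i:i + 2] for i in range(len(a) - 1))
    b_bigrams = Counter(b[i:i + 2] for i in range(len(b) - 1))
    # Multiset intersection counts each shared bigram as often as it occurs in both.
    overlap = sum((a_bigrams & b_bigrams).values())
    total = sum(a_bigrams.values()) + sum(b_bigrams.values())
    return 2 * overlap / total


print(sorensen_dice("night", "nacht"))  # 0.25
```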