Fuzzy Functions
Fuzzy matching and string similarity functions.
Summary
| Function | Signature | Description |
|---|---|---|
| damerau_levenshtein | string, string -> number | Damerau-Levenshtein distance |
| hamming | string, string -> number\|null | Hamming distance (number of differing positions). Returns null if strings have different lengths |
| jaro | string, string -> number | Jaro similarity (0-1) |
| jaro_winkler | string, string -> number | Jaro-Winkler similarity (0-1) |
| levenshtein | string, string -> number | Levenshtein edit distance |
| normalized_damerau_levenshtein | string, string -> number | Normalized Damerau-Levenshtein similarity (0-1) |
| normalized_levenshtein | string, string -> number | Normalized Levenshtein (0-1) |
| osa_distance | string, string -> number | Optimal String Alignment distance (like Levenshtein but allows adjacent transpositions) |
| sorensen_dice | string, string -> number | Sorensen-Dice coefficient (0-1) |
Functions
damerau_levenshtein
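Damerau-Levenshtein distance: the minimum number of insertions, deletions, substitutions, and transpositions of adjacent characters needed to turn one string into the other.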
Signature: string, string -> number
Examples:
# Single transposition
damerau_levenshtein('ab', 'ba') -> 1
# Identical strings
damerau_levenshtein('hello', 'hello') -> 0
# Multiple edits
damerau_levenshtein('ca', 'abc') -> 2
CLI Usage:
echo '{}' | jpx 'damerau_levenshtein(`"ab"`, `"ba"`)'
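To make the transposition handling concrete, here is a minimal Python sketch of the unrestricted Damerau-Levenshtein distance. It is a textbook dynamic-programming version, not jpx's implementation; the function and variable names are chosen for illustration only.

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Unrestricted Damerau-Levenshtein distance (illustrative sketch)."""
    inf = len(a) + len(b)  # larger than any real distance
    # The table has one extra sentinel row and column filled with `inf`,
    # so table[i + 1][j + 1] holds the distance between a[:i] and b[:j].
    table = [[inf] * (len(b) + 2) for _ in range(len(a) + 2)]
    for i in range(len(a) + 1):
        table[i + 1][1] = i
    for j in range(len(b) + 1):
        table[1][j + 1] = j

    last_row = {}  # last 1-based row in which each character of `a` appeared
    for i in range(1, len(a) + 1):
        last_col = 0  # last 1-based column in this row where the characters matched
        for j in range(1, len(b) + 1):
            row = last_row.get(b[j - 1], 0)
            col = last_col
            cost = 1
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            table[i + 1][j + 1] = min(
                table[i][j] + cost,   # match or substitution
                table[i + 1][j] + 1,  # insertion
                table[i][j + 1] + 1,  # deletion
                # transposition, plus the edits between the swapped characters
                table[row][col] + (i - row - 1) + 1 + (j - col - 1),
            )
        last_row[a[i - 1]] = i
    return table[len(a) + 1][len(b) + 1]


print(damerau_levenshtein("ab", "ba"))
print(damerau_levenshtein("ca", "abc"))
```

The transposition term is what lets 'ca' -> 'abc' cost 2: swap to 'ac', then insert 'b' between the swapped characters.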
hamming
Hamming distance (number of positions at which the two strings differ). Returns null if the strings have different lengths.
Signature: string, string -> number|null
Examples:
# Three differing positions
hamming('karolin', 'kathrin') -> 3
# Identical strings
hamming('hello', 'hello') -> 0
# Different lengths
hamming('hello', 'hi') -> null
CLI Usage:
echo '{}' | jpx 'hamming(`"karolin"`, `"kathrin"`)'
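The behavior described above can be sketched in a few lines of Python. This is an illustration of the definition, not jpx's code; Python's `None` stands in for JSON `null`.

```python
from typing import Optional


def hamming(a: str, b: str) -> Optional[int]:
    """Count differing positions, or return None when lengths differ."""
    if len(a) != len(b):
        return None
    # Compare the strings position by position and count the mismatches.
    return sum(x != y for x, y in zip(a, b))


print(hamming("karolin", "kathrin"))
print(hamming("hello", "hi"))
```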
jaro
Jaro similarity (0-1), where 0 means no matching characters and 1 means identical strings
Signature: string, string -> number
Examples:
# Similar words
jaro('hello', 'hallo') -> 0.866...
# Identical strings
jaro('hello', 'hello') -> 1.0
# Completely different
jaro('abc', 'xyz') -> 0.0
CLI Usage:
echo '{}' | jpx 'jaro(`"hello"`, `"hallo"`)'
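For readers who want to see where a value like 0.866... comes from, the following Python sketch implements the standard Jaro definition: count characters that match within a sliding window, count transpositions among the matched characters, then average three ratios. It is an assumption that jpx follows this definition exactly; treating two empty strings as identical is also a choice made here.

```python
def jaro(a: str, b: str) -> float:
    """Jaro similarity per the standard definition (illustrative sketch)."""
    if a == b:
        return 1.0  # assumption: two empty strings count as identical
    if not a or not b:
        return 0.0
    # Characters match only if they are equal and at most `window` positions apart.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo = max(0, i - window)
        hi = min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as half a transposition each.
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (
        matches / len(a)
        + matches / len(b)
        + (matches - transpositions) / matches
    ) / 3


print(jaro("hello", "hallo"))
```

For 'hello' and 'hallo', four characters match with no transpositions, giving (4/5 + 4/5 + 4/4) / 3 = 13/15.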
jaro_winkler
Jaro-Winkler similarity (0-1): Jaro similarity with a bonus for strings that share a common prefix
Signature: string, string -> number
Examples:
# Similar words
jaro_winkler('hello', 'hallo') -> 0.88
# Identical strings
jaro_winkler('hello', 'hello') -> 1.0
# Completely different
jaro_winkler('abc', 'xyz') -> 0.0
CLI Usage:
echo '{}' | jpx 'jaro_winkler(`"hello"`, `"hallo"`)'
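The relationship between the Jaro and Jaro-Winkler scores can be sketched as a prefix bonus applied to an already computed Jaro score. The helper below is hypothetical: its name, the `scaling` factor of 0.1, and the 4-character prefix cap are the conventional Winkler values, assumed here rather than taken from jpx. Some implementations also apply the bonus only above a threshold such as 0.7, which this sketch omits.

```python
def jaro_winkler_from_jaro(
    a: str,
    b: str,
    jaro_score: float,
    scaling: float = 0.1,   # assumed: conventional Winkler scaling factor
    max_prefix: int = 4,    # assumed: conventional cap on the prefix length
) -> float:
    """Apply the Winkler prefix bonus to a precomputed Jaro score (sketch)."""
    prefix = 0
    for x, y in zip(a, b):
        if x != y or prefix == max_prefix:
            break
        prefix += 1
    # The bonus closes part of the gap between the Jaro score and 1.0.
    return jaro_score + prefix * scaling * (1.0 - jaro_score)


# 'hello' and 'hallo' share the one-character prefix 'h'; their Jaro score is 13/15.
print(jaro_winkler_from_jaro("hello", "hallo", 13 / 15))
```

With a Jaro score of 13/15 and a prefix length of 1, the result is 13/15 + 0.1 * (2/15) = 0.88, which matches the example above.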
levenshtein
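Levenshtein edit distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other.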
Signature: string, string -> number
Examples:
# Classic example
levenshtein('kitten', 'sitting') -> 3
# Identical strings
levenshtein('hello', 'hello') -> 0
# All different
levenshtein('abc', 'def') -> 3
CLI Usage:
echo '{}' | jpx 'levenshtein(`"kitten"`, `"sitting"`)'
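As a reference for how the edit count is derived, here is a minimal Python sketch of the classic two-row dynamic-programming algorithm. It illustrates the definition and is not taken from jpx.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance using two rolling rows (illustrative sketch)."""
    # prev[j] is the distance between the processed prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))
```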
normalized_damerau_levenshtein
Normalized Damerau-Levenshtein similarity (0-1)
Signature: string, string -> number
Examples:
# Identical strings
normalized_damerau_levenshtein('hello', 'hello') -> 1.0
# Transposition
normalized_damerau_levenshtein('ab', 'ba') -> 0.5
# Completely different
normalized_damerau_levenshtein('abc', 'xyz') -> 0.0
CLI Usage:
echo '{}' | jpx 'normalized_damerau_levenshtein(`"hello"`, `"hello"`)'
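The example values above are consistent with the common normalization `1 - distance / max(len(a), len(b))`. The sketch below assumes that formula; the helper name is hypothetical, and returning 1.0 for two empty strings is a choice made here, not documented behavior.

```python
def normalized_similarity(distance: int, len_a: int, len_b: int) -> float:
    """Turn an edit distance into a 0-1 similarity (assumed formula, sketch)."""
    longest = max(len_a, len_b)
    if longest == 0:
        return 1.0  # assumption: two empty strings are fully similar
    return 1.0 - distance / longest


# 'ab' vs 'ba': Damerau-Levenshtein distance 1, both strings have length 2.
print(normalized_similarity(1, 2, 2))
```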
normalized_levenshtein
Normalized Levenshtein similarity (0-1), where 1.0 means the strings are identical
Signature: string, string -> number
Examples:
# One edit
normalized_levenshtein('ab', 'abc') -> 0.666...
# Identical
normalized_levenshtein('hello', 'hello') -> 1.0
# All different
normalized_levenshtein('abc', 'xyz') -> 0.0
CLI Usage:
echo '{}' | jpx 'normalized_levenshtein(`"ab"`, `"abc"`)'
osa_distance
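Optimal String Alignment distance: like Levenshtein, but a swap of two adjacent characters also counts as a single edit. Unlike damerau_levenshtein, no substring may be edited more than once, so the result can be larger for the same inputs.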
Signature: string, string -> number
Examples:
# Single transposition
osa_distance('ab', 'ba') -> 1
# Identical strings
osa_distance('hello', 'hello') -> 0
# Multiple edits
osa_distance('ca', 'abc') -> 3
CLI Usage:
echo '{}' | jpx 'osa_distance(`"ab"`, `"ba"`)'
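To show why 'ca' -> 'abc' costs 3 here, the following Python sketch implements the textbook Optimal String Alignment recurrence. It is an illustration, not jpx's implementation.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal String Alignment distance (illustrative sketch)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # A transposition is allowed only for two directly adjacent characters,
            # and the swapped pair cannot be edited again afterwards.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("ab", "ba"))
print(osa_distance("ca", "abc"))
```

Because the swapped pair cannot be touched again, the route "swap to 'ac', then insert 'b' in the middle" is unavailable, and three plain edits are needed.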
sorensen_dice
Sorensen-Dice coefficient (0-1), computed over the character bigrams of the two strings
Signature: string, string -> number
Examples:
# Similar words
sorensen_dice('night', 'nacht') -> 0.25
# Identical strings
sorensen_dice('hello', 'hello') -> 1.0
# No common bigrams
sorensen_dice('abc', 'xyz') -> 0.0
CLI Usage:
echo '{}' | jpx 'sorensen_dice(`"night"`, `"nacht"`)'
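The 0.25 result above can be reproduced with a short Python sketch that compares character bigrams. Counting repeated bigrams as a multiset and returning 0.0 for strings shorter than two characters are assumptions made here; jpx may treat whitespace or short inputs differently.

```python
from collections import Counter


def sorensen_dice(a: str, b: str) -> float:
    """Sorensen-Dice coefficient over character bigrams (illustrative sketch)."""
    if a == b:
        return 1.0
    if len(a) < 2 or len(b) < 2:
        return 0.0  # assumption: no bigrams means no measurable overlap
    bigrams_a = Counter(a[i:i + 2] for i in range(len(a) - 1))
    bigrams_b = Counter(b[i:i + 2] for i in range(len(b) - 1))
    # Counter intersection keeps the minimum count of each shared bigram.
    overlap = sum((bigrams_a & bigrams_b).values())
    total = sum(bigrams_a.values()) + sum(bigrams_b.values())
    return 2.0 * overlap / total


print(sorensen_dice("night", "nacht"))
```

'night' yields ni, ig, gh, ht and 'nacht' yields na, ac, ch, ht. Only 'ht' is shared, so the coefficient is 2 * 1 / (4 + 4) = 0.25.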