Query Builder

polars-redis provides a Polars-like query builder that generates RediSearch queries. This gives you a familiar, composable API without learning RediSearch query syntax.

Setup

Examples on this page use the test data from RediSearch Overview.

Basic Operations

Comparisons

from polars_redis import col

# Numeric comparisons
query = col("age") > 30       # @age:[(30 +inf]
query = col("age") >= 30      # @age:[30 +inf]
query = col("age") < 30       # @age:[-inf (30]
query = col("age") <= 30      # @age:[-inf 30]

# Equality (tags)
query = col("status") == "active"    # @status:{active}
query = col("status") != "inactive"  # -@status:{inactive}

Combining Conditions

# AND (both must match)
query = (col("age") > 30) & (col("status") == "active")
# (@age:[(30 +inf]) (@status:{active})

# OR (either matches)
query = (col("department") == "engineering") | (col("department") == "product")
# (@department:{engineering}) | (@department:{product})

# Negation
query = ~(col("status") == "inactive")
# -(@status:{inactive})
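
To make the operator mapping concrete, here is a minimal sketch of how operator overloading can turn Python expressions into the query strings shown above. `Expr` and `Col` are hypothetical toy classes written for this illustration; they are not the polars-redis implementation, which may build and parenthesize queries differently.

```python
class Expr:
    """Toy stand-in for a query expression; holds RediSearch query text."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __and__(self, other: "Expr") -> "Expr":
        # RediSearch intersects clauses that are separated by a space.
        return Expr(f"({self.text}) ({other.text})")

    def __or__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.text}) | ({other.text})")

    def __invert__(self) -> "Expr":
        return Expr(f"-({self.text})")


class Col:
    """Toy stand-in for col(); each comparison returns an Expr."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __gt__(self, value) -> Expr:
        # A leading "(" makes the bound exclusive.
        return Expr(f"@{self.name}:[({value} +inf]")

    def __ge__(self, value) -> Expr:
        return Expr(f"@{self.name}:[{value} +inf]")

    def __lt__(self, value) -> Expr:
        return Expr(f"@{self.name}:[-inf ({value}]")

    def __le__(self, value) -> Expr:
        return Expr(f"@{self.name}:[-inf {value}]")

    def __eq__(self, value) -> Expr:  # type: ignore[override]
        return Expr(f"@{self.name}:{{{value}}}")

    def __ne__(self, value) -> Expr:  # type: ignore[override]
        return Expr(f"-@{self.name}:{{{value}}}")


query = (Col("age") > 30) & (Col("status") == "active")
print(query.text)  # (@age:[(30 +inf]) (@status:{active})
```

The parentheses around each comparison matter in any builder of this style: Python's `&` and `|` bind more tightly than `>` and `==`, so `col("age") > 30 & col("status") == "active"` would not parse as two separate conditions.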

Raw Queries

For RediSearch features not covered by the builder, use raw():

from polars_redis import raw

# Full-text search
query = raw("@name:alice")

# Prefix search
query = raw("@name:ali*")

# Combine raw with builder
query = raw("@name:alice") & (col("age") > 25)

Text Search

Full-text Search

# Basic text search (with stemming)
query = col("title").contains("python")
# @title:python

# Prefix matching
query = col("name").starts_with("jo")
# @name:jo*

# Suffix matching
query = col("name").ends_with("son")
# @name:*son

# Substring/infix matching
query = col("description").contains_substring("data")
# @description:*data*

Fuzzy Matching

Match terms despite typos. The distance argument is the maximum Levenshtein (edit) distance allowed, from 1 to 3:

# Allow 1 character difference
query = col("title").fuzzy("python", distance=1)
# @title:%python%

# Allow 2 character differences
query = col("title").fuzzy("algorithm", distance=2)
# @title:%%algorithm%%
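
For intuition about what a given distance allows, the sketch below computes the Levenshtein distance between two words and builds the matching `%`-wrapped clause. Both `levenshtein` and `fuzzy_clause` are hypothetical helpers written for this page, not part of polars-redis.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep when equal)
            ))
        prev = curr
    return prev[-1]


def fuzzy_clause(field: str, term: str, distance: int = 1) -> str:
    """Wrap the term in one '%' pair per allowed edit."""
    if not 1 <= distance <= 3:
        raise ValueError("distance must be between 1 and 3")
    marks = "%" * distance
    return f"@{field}:{marks}{term}{marks}"


print(levenshtein("python", "pythn"))   # 1 (one deletion)
print(levenshtein("python", "pyhton"))  # 2 (a swap counts as two edits)
print(fuzzy_clause("title", "algorithm", distance=2))  # @title:%%algorithm%%
```

Note that a transposed pair of letters costs two edits under plain Levenshtein distance, so catching "pyhton" requires `distance=2`.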

Phrase Search

Search for phrases with optional slop and order control:

# Exact phrase (words must appear consecutively)
query = col("title").phrase("hello", "world")
# @title:(hello world)

# Allow words between (slop = max intervening terms)
query = col("title").phrase("machine", "learning", slop=2)
# @title:(machine learning) => { $slop: 2; }

# Require in-order with slop
query = col("title").phrase("data", "science", slop=3, inorder=True)
# @title:(data science) => { $slop: 3; $inorder: true; }

Wildcard Matching

# Simple wildcard
query = col("name").matches("j*n")
# @name:j*n

# Exact wildcard matching
query = col("code").matches_exact("FOO*BAR?")
# @code:"w'FOO*BAR?'"

Multi-field Search

Search across multiple fields simultaneously:

from polars_redis import cols

# Search title and body together
query = cols("title", "body").contains("python")
# @title|body:python

# Prefix search across fields
query = cols("first_name", "last_name").starts_with("john")
# @first_name|last_name:john*

Tag Operations

# Single tag match
query = col("category").has_tag("electronics")
# @category:{electronics}

# Match any of multiple tags
query = col("tags").has_any_tag(["urgent", "important"])
# @tags:{urgent|important}
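
Tag values that contain spaces or punctuation generally need backslash escaping in RediSearch tag queries. This page does not say whether the builder escapes values for you, so the sketch below shows one conservative way to do it by hand. `escape_tag` and `any_tag_clause` are hypothetical helpers, and the exact set of characters that needs escaping depends on the RediSearch version and dialect.

```python
# Conservative set of characters to escape inside a tag query (an assumption;
# newer RediSearch dialects require fewer of these).
_SPECIAL = set(",.<>{}[]\"':;!@#$%^&*()-+=~|/\\ ")


def escape_tag(value: str) -> str:
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in _SPECIAL else ch for ch in value)


def any_tag_clause(field: str, values: list[str]) -> str:
    """Build @field:{a|b|...} with each value escaped."""
    joined = "|".join(escape_tag(v) for v in values)
    return f"@{field}:{{{joined}}}"


print(escape_tag("sci-fi"))                                 # sci\-fi
print(any_tag_clause("tags", ["urgent", "high priority"]))  # @tags:{urgent|high\ priority}
```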

Geo Search

Radius Search

# Find locations within 10km of a point
query = col("location").within_radius(-122.4194, 37.7749, 10, "km")
# @location:[-122.4194 37.7749 10 km]

# Supported units: m, km, mi, ft
query = col("location").within_radius(-73.9857, 40.7484, 5, "mi")
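
To sanity-check a radius before running a query, you can compute great-circle distances locally. The sketch below uses the haversine formula with a mean Earth radius; `haversine_km` and `within_radius` are hypothetical helpers, and RediSearch's own distance calculation may differ slightly at the edge of the radius.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in kilometres; arguments are in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(point, center, radius_km: float) -> bool:
    """True if point lies within radius_km of center; both are (lon, lat)."""
    return haversine_km(*point, *center) <= radius_km


san_francisco = (-122.4194, 37.7749)
oakland = (-122.2711, 37.8044)
print(within_radius(oakland, san_francisco, 10))  # False: Oakland is roughly 13 km away
print(within_radius(oakland, san_francisco, 15))  # True
```

As in the query examples, coordinates are given longitude first, then latitude.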

Polygon Search

# Define polygon as list of (lon, lat) points
polygon = [
    (-122.5, 37.7),
    (-122.5, 37.8),
    (-122.3, 37.8),
    (-122.3, 37.7),
    (-122.5, 37.7),  # Close the polygon
]
query = col("location").within_polygon(polygon)
# @location:[WITHIN $poly]

Note

The query builder generates a parameterized query, so the polygon itself is not embedded in the query string. It must be passed as the poly query parameter when the search runs.
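
RediSearch shape queries take the polygon as a WKT (well-known text) string. How polars-redis expects the `poly` parameter to be supplied is not shown on this page, so treat the following as a sketch of just the conversion step. `polygon_to_wkt` is a hypothetical helper.

```python
def polygon_to_wkt(points: list[tuple[float, float]]) -> str:
    """Convert a closed ring of (lon, lat) points to a WKT POLYGON string."""
    if len(points) < 4:
        raise ValueError("a closed polygon needs at least 4 points")
    if points[0] != points[-1]:
        raise ValueError("first and last point must be identical")
    coords = ", ".join(f"{lon} {lat}" for lon, lat in points)
    return f"POLYGON(({coords}))"


polygon = [
    (-122.5, 37.7),
    (-122.5, 37.8),
    (-122.3, 37.8),
    (-122.3, 37.7),
    (-122.5, 37.7),  # Close the polygon
]
print(polygon_to_wkt(polygon))
# POLYGON((-122.5 37.7, -122.5 37.8, -122.3 37.8, -122.3 37.7, -122.5 37.7))
```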

Vector Search

For semantic similarity search with vector embeddings:

K-Nearest Neighbors (KNN)

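import polars as pl
from polars_redis import col, search_hashes

# Find 10 most similar documents
query = col("embedding").knn(10, "query_vec")
# *=>[KNN 10 @embedding $query_vec]

# Use with search_hashes: `url` is your Redis connection URL, and
# `embedding_bytes` is the query vector serialized to bytes (passed as a parameter)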
df = search_hashes(
    url,
    index="docs_idx",
    query=query,
    schema={"title": pl.Utf8, "embedding": pl.List(pl.Float32)},
    params={"query_vec": embedding_bytes},
)
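
The example above leaves `embedding_bytes` undefined. RediSearch expects a query vector as a raw binary blob whose element type matches the index, so a minimal sketch of the serialization, assuming the index was created with `TYPE FLOAT32`, could look like this. `to_float32_bytes` is a hypothetical helper; with NumPy available, `np.asarray(v, dtype=np.float32).tobytes()` does the same job.

```python
import struct


def to_float32_bytes(vector: list[float]) -> bytes:
    """Pack floats as consecutive little-endian 32-bit values."""
    return struct.pack(f"<{len(vector)}f", *vector)


embedding_bytes = to_float32_bytes([0.1, 0.2, 0.3, 0.4])
print(len(embedding_bytes))  # 16 (4 values x 4 bytes each)
```

The vector length must equal the dimension declared on the index field, otherwise the search is rejected.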

Vector Range Search

# Find all documents within distance 0.5
query = col("embedding").vector_range(0.5, "query_vec")
# @embedding:[VECTOR_RANGE 0.5 $query_vec]
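
What a threshold such as 0.5 means depends on the distance metric configured on the index. The sketch below assumes the COSINE metric, where distance is one minus cosine similarity; with L2 or inner product the same number means something different. `cosine_distance` is a hypothetical helper for experimenting with thresholds locally.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity: 0 for same direction, 1 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


query_vec = [1.0, 0.0]
print(cosine_distance(query_vec, [1.0, 1.0]) <= 0.5)  # True: 45 degrees apart
print(cosine_distance(query_vec, [0.0, 1.0]) <= 0.5)  # False: orthogonal
```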

Relevance Tuning

Boosting

Increase the relevance score contribution of specific terms:

# Boost title matches 2x
query = col("title").contains("python").boost(2.0)
# (@title:python) => { $weight: 2.0; }

# Combine with other terms
title_query = col("title").contains("python").boost(2.0)
body_query = col("body").contains("python")
query = title_query | body_query

Optional Terms

Mark terms as optional so that documents containing them rank higher, without excluding documents that lack them:

# Documents with "python" required, "tutorial" optional but preferred
required = col("title").contains("python")
optional = col("title").contains("tutorial").optional()
query = required & optional
# @title:python ~@title:tutorial

Null Checks

# Find documents missing a field
query = col("email").is_null()
# ismissing(@email)

# Find documents with a field present
query = col("email").is_not_null()
# -ismissing(@email)

Debugging

Use to_redis() to inspect the generated RediSearch query:

query = (col("type") == "eBikes") & (col("price") < 1000)
print(query.to_redis())
# @type:{eBikes} @price:[-inf (1000]

Operations Reference

| Operation | Example | RediSearch Output |
|---|---|---|
| Equal | `col("status") == "active"` | `@status:{active}` |
| Not Equal | `col("status") != "active"` | `-@status:{active}` |
| Greater Than | `col("age") > 30` | `@age:[(30 +inf]` |
| Greater or Equal | `col("age") >= 30` | `@age:[30 +inf]` |
| Less Than | `col("age") < 30` | `@age:[-inf (30]` |
| Less or Equal | `col("age") <= 30` | `@age:[-inf 30]` |
| And | `expr1 & expr2` | `(expr1) (expr2)` |
| Or | `expr1 \| expr2` | `(expr1) \| (expr2)` |
| Negate | `~expr` or `expr.negate()` | `-(expr)` |
| Contains | `col("title").contains("word")` | `@title:word` |
| Starts With | `col("name").starts_with("jo")` | `@name:jo*` |
| Ends With | `col("name").ends_with("son")` | `@name:*son` |
| Fuzzy | `col("name").fuzzy("john", 1)` | `@name:%john%` |
| Tag Match | `col("cat").has_tag("x")` | `@cat:{x}` |
| Tag Any | `col("cat").has_any_tag(["x","y"])` | `@cat:{x\|y}` |
| Geo Radius | `col("loc").within_radius(lon, lat, r, "km")` | `@loc:[lon lat r km]` |
| KNN | `col("vec").knn(10, "param")` | `*=>[KNN 10 @vec $param]` |
| Boost | `expr.boost(2.0)` | `(expr) => { $weight: 2.0; }` |
| Optional | `expr.optional()` | `~(expr)` |
| Is Null | `col("field").is_null()` | `ismissing(@field)` |

Client-Side Filters

Some operations can't be pushed to RediSearch. These fetch data first, then filter in Polars. Use them when RediSearch doesn't support the operation you need.

Performance

Client-side filters run in Polars after the data has been fetched, so every document that matches the server-side part of the query is transferred first. Combine them with server-side filters whenever possible to reduce data transfer.

Regex Matching

# Match email patterns (runs in Polars)
query = col("email").matches_regex(r".*@gmail\.com$")

# Combine with server-side filter for efficiency
query = (col("status") == "active") & col("email").matches_regex(r".*@company\.com")
# Server filters by status first, then Polars filters by regex

Case-Insensitive Matching

# Case-insensitive contains
query = col("name").icontains("john")

# Case-insensitive equality
query = col("department").iequals("ENGINEERING")

Multiple Substring Matching

# Match any of multiple substrings
query = col("description").contains_any(["python", "rust", "go"])

String Similarity

# Find names similar to "john" (80% similarity threshold)
query = col("name").similar_to("john", threshold=0.8)
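
This page does not state which similarity metric `similar_to` uses. To get a feel for what a 0.8 threshold admits, the sketch below uses the standard library's `difflib.SequenceMatcher` ratio as a stand-in metric. That choice is an assumption, and the library's scores may differ. `is_similar` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def is_similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Compare two strings with difflib's ratio (a stand-in metric)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


for candidate in ["john", "jon", "jane"]:
    print(candidate, is_similar("john", candidate))
# john True
# jon True
# jane False
```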

Date/Time Operations

# Date comparisons
query = col("created_at").as_date() > "2024-01-01"
query = col("updated_at").as_datetime() >= "2024-06-15T12:00:00"

# Date part extraction
query = col("birth_date").as_date().year() == 1990
query = col("created_at").as_date().month().is_in([1, 2, 3])  # Q1
query = col("event_time").as_datetime().hour() >= 9

# Available date parts: year(), month(), day(), weekday(), hour(), minute()
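
Because these filters run client-side, they behave like ordinary date logic applied to the fetched rows. The sketch below mimics the comparisons above with the standard library, assuming the fields hold ISO 8601 strings. The parsing rules polars-redis actually applies are not described on this page, and `rows` is made-up sample data.

```python
from datetime import date, datetime

# Hypothetical rows as they might arrive from Redis (all values are strings).
rows = [
    {"id": 1, "created_at": "2024-02-10"},
    {"id": 2, "created_at": "2023-11-05"},
    {"id": 3, "created_at": "2024-07-01"},
]

# Equivalent of: col("created_at").as_date() > "2024-01-01"
after_new_year = [
    r["id"] for r in rows
    if date.fromisoformat(r["created_at"]) > date(2024, 1, 1)
]

# Equivalent of: col("created_at").as_date().month().is_in([1, 2, 3])
first_quarter = [
    r["id"] for r in rows
    if date.fromisoformat(r["created_at"]).month in (1, 2, 3)
]

print(after_new_year)  # [1, 3]
print(first_quarter)   # [1]
print(datetime.fromisoformat("2024-06-15T12:00:00").hour)  # 12
```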

Array/JSON Operations

# Check if array contains value
query = col("tags").array_contains("python")

# Filter by array length
query = col("items").array_len() > 5

# Extract nested JSON value
query = col("metadata").json_path("$.user.role") == "admin"

Hybrid Execution

When queries combine server-side and client-side operations, polars-redis automatically splits execution:

# This query has both server-pushable and client-side parts
query = (col("age") > 30) & col("email").matches_regex(r".*@gmail\.com")

# Check what runs where
print(query.is_client_side)  # True

# View execution plan
print(query.explain())
# RediSearch: @age:[(30 +inf]
# Polars filter: pl.col("email").str.contains(r".*@gmail\.com")

Because the query builder pushes every clause it can to RediSearch, only documents that pass the server-side clauses are transferred before Polars applies the remaining filters.
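
The idea behind the split can be sketched in a few lines. The toy model below treats a query as a conjunction of predicates, each of which either has a RediSearch clause or falls back to a Python function. `Predicate` and `split_conjunction` are hypothetical names for this illustration and do not reflect how polars-redis is implemented internally.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Predicate:
    redis: Optional[str] = None                      # server-side clause, if pushable
    client: Optional[Callable[[dict], bool]] = None  # fallback filter on a fetched row


def split_conjunction(preds: list[Predicate]):
    """Split AND-ed predicates into a server query and client-side filters."""
    pushable = [p.redis for p in preds if p.redis is not None]
    server_query = " ".join(pushable) or "*"  # "*" matches every document
    client_filters = [p.client for p in preds if p.redis is None]
    return server_query, client_filters


preds = [
    Predicate(redis="@age:[(30 +inf]"),
    Predicate(client=lambda row: re.search(r".*@gmail\.com", row["email"]) is not None),
]
server_query, client_filters = split_conjunction(preds)
print(server_query)  # @age:[(30 +inf]

# Stand-in for the rows the server would return for server_query.
fetched = [{"email": "ann@gmail.com"}, {"email": "bob@example.org"}]
print([r for r in fetched if all(f(r) for f in client_filters)])
# [{'email': 'ann@gmail.com'}]
```

This kind of split is only safe for AND. If a client-side predicate appears on one side of an OR, the server cannot pre-filter on the other side alone without dropping valid matches, so the whole OR has to be evaluated after fetching.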