Advanced Query Examples¶

This page demonstrates real-world use cases for the polars-redis query builder.

Vector Similarity Search¶

Find similar items using vector embeddings with KNN search.

Setup¶

# Create index with vector field
FT.CREATE products_idx ON HASH PREFIX 1 product: SCHEMA \
    name TEXT SORTABLE \
    description TEXT \
    category TAG \
    price NUMERIC SORTABLE \
    embedding VECTOR FLAT 6 TYPE FLOAT32 DIM 384 DISTANCE_METRIC COSINE

Find Similar Products¶

import polars as pl
import polars_redis as redis
from polars_redis import col

# Assume you have a function to generate embeddings
# embedding = get_embedding("wireless bluetooth headphones")

# Find 10 most similar products
query = col("embedding").knn(10, "query_vec")

df = redis.search_hashes(
    "redis://localhost:6379",
    index="products_idx",
    query=query,
    schema={
        "name": pl.Utf8,
        "description": pl.Utf8,
        "category": pl.Utf8,
        "price": pl.Float64,
    },
    params={"query_vec": embedding_bytes},
).collect()

print(df)

Hybrid Search: Vector + Filters¶

Combine vector similarity with traditional filters:

from polars_redis import col, raw

# Find similar products under $100 in electronics category
filter_query = (col("category") == "electronics") & (col("price") < 100)
vector_query = col("embedding").knn(10, "query_vec")

# Combine: filter first, then vector search
query = raw(f"({filter_query.to_redis()})=>[KNN 10 @embedding $query_vec]")

df = redis.search_hashes(
    "redis://localhost:6379",
    index="products_idx",
    query=query,
    schema={"name": pl.Utf8, "price": pl.Float64},
    params={"query_vec": embedding_bytes},
).collect()

Geo-Location Filtering¶

Filter data by geographic boundaries.

Setup¶

# Create index with geo field
FT.CREATE stores_idx ON HASH PREFIX 1 store: SCHEMA \
    name TEXT SORTABLE \
    address TEXT \
    location GEO \
    type TAG

Find Stores Within Radius¶

from polars_redis import col

# Find stores within 5km of downtown
query = col("location").within_radius(-122.4194, 37.7749, 5, "km")

df = redis.search_hashes(
    "redis://localhost:6379",
    index="stores_idx",
    query=query,
    schema={"name": pl.Utf8, "address": pl.Utf8, "type": pl.Utf8},
).collect()

print(f"Found {len(df)} stores nearby")

Filter by Delivery Zone (Polygon)¶

from polars_redis import col

# Define delivery zone polygon (lon, lat pairs)
delivery_zone = [
    (-122.45, 37.75),
    (-122.45, 37.80),
    (-122.38, 37.80),
    (-122.38, 37.75),
    (-122.45, 37.75),  # Close the polygon
]

query = col("location").within_polygon(delivery_zone)

# Combine with other filters
query = query & (col("type") == "restaurant")

df = redis.search_hashes(
    "redis://localhost:6379",
    index="stores_idx",
    query=query,
    schema={"name": pl.Utf8, "address": pl.Utf8},
).collect()

Fuzzy Name Matching¶

Handle typos and spelling variations in searches.

Setup¶

FT.CREATE customers_idx ON HASH PREFIX 1 customer: SCHEMA \
    name TEXT SORTABLE \
    email TAG \
    company TEXT

Search with Typo Tolerance¶

from polars_redis import col

# Find "Johnson" even with typos like "Jonson" or "Johnsen"
query = col("name").fuzzy("johnson", distance=1)

df = redis.search_hashes(
    "redis://localhost:6379",
    index="customers_idx",
    query=query,
    schema={"name": pl.Utf8, "email": pl.Utf8, "company": pl.Utf8},
).collect()

Deduplicate Records¶

Find potential duplicates using fuzzy matching:

from polars_redis import col

def find_duplicates(name: str, distance: int = 2):
    """Find records with similar names."""
    query = col("name").fuzzy(name, distance=distance)

    return redis.search_hashes(
        "redis://localhost:6379",
        index="customers_idx",
        query=query,
        schema={"name": pl.Utf8, "email": pl.Utf8},
    ).collect()

# Check for duplicates of a new entry
duplicates = find_duplicates("Micheal Smith")  # Will find "Michael Smith"
if len(duplicates) > 0:
    print("Potential duplicates found:")
    print(duplicates)

Full-Text Document Search¶

Search documents with phrase matching and relevance tuning.

Setup¶

FT.CREATE articles_idx ON HASH PREFIX 1 article: SCHEMA \
    title TEXT WEIGHT 2.0 SORTABLE \
    body TEXT \
    author TAG \
    tags TAG \
    published NUMERIC SORTABLE

Phrase Search with Proximity¶

from polars_redis import col

# Exact phrase match
query = col("body").phrase("machine", "learning")

# Allow words between (slop = max intervening words)
query = col("body").phrase("data", "science", slop=3)

# Require in-order with slop
query = col("title").phrase("getting", "started", slop=2, inorder=True)

df = redis.search_hashes(
    "redis://localhost:6379",
    index="articles_idx",
    query=query,
    schema={"title": pl.Utf8, "body": pl.Utf8, "author": pl.Utf8},
).collect()

Multi-Field Search¶

Search across multiple fields simultaneously:

from polars_redis import cols

# Search both title and body
query = cols("title", "body").contains("python")

df = redis.search_hashes(
    "redis://localhost:6379",
    index="articles_idx",
    query=query,
    schema={"title": pl.Utf8, "body": pl.Utf8},
).collect()

Relevance Boosting¶

Prioritize matches in important fields:

from polars_redis import col

# Title matches are more important than body matches
title_query = col("title").contains("python").boost(2.0)
body_query = col("body").contains("python")

query = title_query | body_query

df = redis.search_hashes(
    "redis://localhost:6379",
    index="articles_idx",
    query=query,
    schema={"title": pl.Utf8, "body": pl.Utf8},
    sort_by="__score",  # Sort by relevance score
).collect()

Optional Terms for Soft Preferences¶

Include optional terms that improve ranking but aren't required:

from polars_redis import col

# Must have "python", prefer "tutorial" or "beginner"
required = col("title").contains("python")
optional1 = col("title").contains("tutorial").optional()
optional2 = col("tags").has_tag("beginner").optional()

query = required & optional1 & optional2

# Results: All have "python", but those with "tutorial" 
# or "beginner" tag rank higher

Complex Filter Combinations¶

Build sophisticated queries combining multiple conditions.

E-commerce Product Search¶

from polars_redis import col

# Find electronics under $500, in stock, with good ratings
query = (
    (col("category") == "electronics")
    & (col("price").is_between(50, 500))
    & (col("in_stock") == True)
    & (col("rating") >= 4.0)
)

# Add optional preference for prime shipping
query = query & col("prime_eligible").has_tag("yes").optional()

df = redis.search_hashes(
    "redis://localhost:6379",
    index="products_idx",
    query=query,
    schema={
        "name": pl.Utf8,
        "price": pl.Float64,
        "rating": pl.Float64,
    },
    sort_by="rating",
    sort_ascending=False,
).collect()

Dynamic Query Building¶

Build queries from user input:

from polars_redis import col, match_all

def build_search_query(
    text: str | None = None,
    category: str | None = None,
    min_price: float | None = None,
    max_price: float | None = None,
    tags: list[str] | None = None,
):
    """Build a search query from user parameters."""
    conditions = []

    if text:
        conditions.append(col("title").contains(text))

    if category:
        conditions.append(col("category") == category)

    if min_price is not None and max_price is not None:
        conditions.append(col("price").is_between(min_price, max_price))
    elif min_price is not None:
        conditions.append(col("price") >= min_price)
    elif max_price is not None:
        conditions.append(col("price") <= max_price)

    if tags:
        conditions.append(col("tags").has_any_tag(tags))

    # Combine all conditions with AND
    if not conditions:
        return match_all()

    query = conditions[0]
    for condition in conditions[1:]:
        query = query & condition

    return query

# Example usage
query = build_search_query(
    text="laptop",
    category="electronics",
    max_price=1000,
    tags=["gaming", "portable"],
)

print(f"Generated query: {query.to_redis()}")

Combining Numeric, Tag, Text, and Geo¶

from polars_redis import col

# Find restaurants:
# - Within 2km of current location
# - Open now (hour check)
# - Rating >= 4.0
# - Cuisine matches preference
# - Has outdoor seating (optional bonus)

query = (
    col("location").within_radius(-122.4, 37.7, 2, "km")
    & col("rating") >= 4.0
    & col("cuisine").has_any_tag(["italian", "mediterranean"])
    & col("hours").contains("dinner")
)

# Prefer outdoor seating but don't require it
query = query & col("features").has_tag("outdoor").optional()

df = redis.search_hashes(
    "redis://localhost:6379",
    index="restaurants_idx",
    query=query,
    schema={
        "name": pl.Utf8,
        "rating": pl.Float64,
        "cuisine": pl.Utf8,
        "address": pl.Utf8,
    },
    sort_by="rating",
    sort_ascending=False,
).collect()

Debugging Queries¶

Use to_redis() to inspect generated queries:

from polars_redis import col

query = (
    (col("category") == "electronics")
    & (col("price") < 100)
    & col("title").fuzzy("wireless", distance=1)
)

# See the generated RediSearch query
print(query.to_redis())
# Output: @category:{electronics} @price:[-inf (100] @title:%wireless%

# Use this to:
# - Debug unexpected results
# - Copy to redis-cli for testing
# - Verify query syntax