Python API Reference¶
Scan Functions¶
scan_hashes¶
def scan_hashes(
url: str,
pattern: str = "*",
schema: dict | None = None,
*,
include_key: bool = True,
key_column_name: str = "_key",
include_ttl: bool = False,
ttl_column_name: str = "_ttl",
include_row_index: bool = False,
row_index_column_name: str = "_index",
batch_size: int = 1000,
count_hint: int = 100,
) -> pl.LazyFrame
Scan Redis hashes matching a pattern and return a LazyFrame.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match (e.g., `"user:*"`)
- `schema`: Dictionary mapping field names to Polars dtypes
- `include_key`: Include the Redis key as a column
- `key_column_name`: Name of the key column
- `include_ttl`: Include the TTL as a column
- `ttl_column_name`: Name of the TTL column
- `include_row_index`: Include a row index column
- `row_index_column_name`: Name of the index column
- `batch_size`: Keys per batch
- `count_hint`: Redis `SCAN` `COUNT` hint
Returns: pl.LazyFrame
scan_json¶
def scan_json(
url: str,
pattern: str = "*",
schema: dict | None = None,
*,
include_key: bool = True,
key_column_name: str = "_key",
include_ttl: bool = False,
ttl_column_name: str = "_ttl",
include_row_index: bool = False,
row_index_column_name: str = "_index",
batch_size: int = 1000,
count_hint: int = 100,
) -> pl.LazyFrame
Scan RedisJSON documents matching a pattern and return a LazyFrame.
Parameters are identical to scan_hashes.
scan_strings¶
def scan_strings(
url: str,
pattern: str = "*",
*,
value_type: type[pl.DataType] = pl.Utf8,
include_key: bool = True,
key_column_name: str = "_key",
value_column_name: str = "value",
batch_size: int = 1000,
count_hint: int = 100,
) -> pl.LazyFrame
Scan Redis string values matching a pattern and return a LazyFrame.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match
- `value_type`: Polars dtype for the value column (default: `pl.Utf8`)
- `include_key`: Include the Redis key as a column
- `key_column_name`: Name of the key column
- `value_column_name`: Name of the value column
- `batch_size`: Keys per batch
- `count_hint`: Redis `SCAN` `COUNT` hint
Returns: pl.LazyFrame
scan_sets¶
def scan_sets(
url: str,
pattern: str = "*",
*,
include_key: bool = True,
key_column_name: str = "_key",
member_column_name: str = "member",
include_row_index: bool = False,
row_index_column_name: str = "_index",
batch_size: int = 1000,
count_hint: int = 100,
) -> pl.LazyFrame
Scan Redis sets matching a pattern and return a LazyFrame with one row per member.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match
- `include_key`: Include the Redis key as a column
- `key_column_name`: Name of the key column
- `member_column_name`: Name of the member column
- `include_row_index`: Include a row index column
- `row_index_column_name`: Name of the index column
- `batch_size`: Keys per batch
- `count_hint`: Redis `SCAN` `COUNT` hint
Returns: pl.LazyFrame
scan_lists¶
def scan_lists(
url: str,
pattern: str = "*",
*,
include_key: bool = True,
key_column_name: str = "_key",
element_column_name: str = "element",
include_position: bool = False,
position_column_name: str = "position",
include_row_index: bool = False,
row_index_column_name: str = "_index",
batch_size: int = 1000,
count_hint: int = 100,
) -> pl.LazyFrame
Scan Redis lists matching a pattern and return a LazyFrame with one row per element.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match
- `include_key`: Include the Redis key as a column
- `key_column_name`: Name of the key column
- `element_column_name`: Name of the element column
- `include_position`: Include the position index
- `position_column_name`: Name of the position column
- `include_row_index`: Include a row index column
- `row_index_column_name`: Name of the index column
- `batch_size`: Keys per batch
- `count_hint`: Redis `SCAN` `COUNT` hint
Returns: pl.LazyFrame
scan_zsets¶
def scan_zsets(
url: str,
pattern: str = "*",
*,
include_key: bool = True,
key_column_name: str = "_key",
member_column_name: str = "member",
score_column_name: str = "score",
include_rank: bool = False,
rank_column_name: str = "rank",
include_row_index: bool = False,
row_index_column_name: str = "_index",
batch_size: int = 1000,
count_hint: int = 100,
) -> pl.LazyFrame
Scan Redis sorted sets matching a pattern and return a LazyFrame with one row per member.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match
- `include_key`: Include the Redis key as a column
- `key_column_name`: Name of the key column
- `member_column_name`: Name of the member column
- `score_column_name`: Name of the score column
- `include_rank`: Include the rank index
- `rank_column_name`: Name of the rank column
- `include_row_index`: Include a row index column
- `row_index_column_name`: Name of the index column
- `batch_size`: Keys per batch
- `count_hint`: Redis `SCAN` `COUNT` hint
Returns: pl.LazyFrame
scan_streams¶
def scan_streams(
url: str,
pattern: str = "*",
fields: list[str] = [],
*,
start_id: str = "-",
end_id: str = "+",
count_per_stream: int | None = None,
include_key: bool = True,
key_column_name: str = "_key",
include_id: bool = True,
id_column_name: str = "_id",
include_timestamp: bool = True,
timestamp_column_name: str = "_ts",
include_sequence: bool = False,
sequence_column_name: str = "_seq",
include_row_index: bool = False,
row_index_column_name: str = "_index",
batch_size: int = 1000,
count_hint: int = 100,
) -> pl.LazyFrame
Scan Redis Streams matching a pattern and return a LazyFrame with one row per entry.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match
- `fields`: Field names to extract from entries
- `start_id`: Start entry ID (default: `"-"` for oldest)
- `end_id`: End entry ID (default: `"+"` for newest)
- `count_per_stream`: Maximum entries per stream (optional)
- `include_key`: Include the Redis key as a column
- `key_column_name`: Name of the key column
- `include_id`: Include the entry ID as a column
- `id_column_name`: Name of the entry ID column
- `include_timestamp`: Include the timestamp as a column
- `timestamp_column_name`: Name of the timestamp column
- `include_sequence`: Include the sequence number as a column
- `sequence_column_name`: Name of the sequence column
- `include_row_index`: Include a row index column
- `row_index_column_name`: Name of the index column
- `batch_size`: Keys per batch
- `count_hint`: Redis `SCAN` `COUNT` hint
Returns: pl.LazyFrame
scan_timeseries¶
def scan_timeseries(
url: str,
pattern: str = "*",
*,
start: str = "-",
end: str = "+",
count_per_series: int | None = None,
aggregation: str | None = None,
bucket_size_ms: int | None = None,
include_key: bool = True,
key_column_name: str = "_key",
include_timestamp: bool = True,
timestamp_column_name: str = "_ts",
value_column_name: str = "value",
include_row_index: bool = False,
row_index_column_name: str = "_index",
label_columns: list[str] = [],
batch_size: int = 1000,
count_hint: int = 100,
) -> pl.LazyFrame
Scan RedisTimeSeries matching a pattern and return a LazyFrame with one row per sample.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match
- `start`: Start timestamp (default: `"-"` for oldest)
- `end`: End timestamp (default: `"+"` for newest)
- `count_per_series`: Maximum samples per time series (optional)
- `aggregation`: Aggregation type (`avg`, `sum`, `min`, `max`, `range`, `count`, `first`, `last`, `std.p`, `std.s`, `var.p`, `var.s`)
- `bucket_size_ms`: Bucket size in milliseconds (required with `aggregation`)
- `include_key`: Include the Redis key as a column
- `key_column_name`: Name of the key column
- `include_timestamp`: Include the timestamp as a column
- `timestamp_column_name`: Name of the timestamp column
- `value_column_name`: Name of the value column
- `include_row_index`: Include a row index column
- `row_index_column_name`: Name of the index column
- `label_columns`: Label names to include as columns
- `batch_size`: Keys per batch
- `count_hint`: Redis `SCAN` `COUNT` hint
Returns: pl.LazyFrame
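A server-side downsampling sketch (the `polars_redis` import name, the `sensor:*` keys, and the `location` label are assumptions):

```python
HOUR_MS = 60 * 60 * 1000  # one-hour buckets, in milliseconds

def hourly_avg(url: str = "redis://localhost:6379"):
    from polars_redis import scan_timeseries  # assumed import path

    # Server-side averaging: bucket_size_ms is required whenever
    # aggregation is set; labels become extra columns.
    lf = scan_timeseries(
        url, "sensor:*",
        aggregation="avg", bucket_size_ms=HOUR_MS,
        label_columns=["location"],
    )
    return lf.collect()
```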
Read Functions (Eager)¶
read_hashes¶
Eager version of scan_hashes. Parameters are identical.
read_json¶
Eager version of scan_json. Parameters are identical.
read_strings¶
Eager version of scan_strings. Parameters are identical.
read_sets¶
Eager version of scan_sets. Parameters are identical.
read_lists¶
Eager version of scan_lists. Parameters are identical.
read_zsets¶
Eager version of scan_zsets. Parameters are identical.
read_streams¶
Eager version of scan_streams. Parameters are identical.
read_timeseries¶
Eager version of scan_timeseries. Parameters are identical.
Write Functions¶
write_hashes¶
def write_hashes(
df: pl.DataFrame,
url: str,
key_column: str | None = "_key",
ttl: int | None = None,
key_prefix: str = "",
if_exists: str = "replace",
) -> int
Write a DataFrame to Redis as hashes.
Parameters:
- `df`: DataFrame to write
- `url`: Redis connection URL
- `key_column`: Column with Redis keys, or `None` for auto-generated keys
- `ttl`: TTL in seconds (optional)
- `key_prefix`: Prefix for all keys
- `if_exists`: `"fail"`, `"replace"`, or `"append"`
Returns: Number of keys written
write_json¶
def write_json(
df: pl.DataFrame,
url: str,
key_column: str | None = "_key",
ttl: int | None = None,
key_prefix: str = "",
if_exists: str = "replace",
) -> int
Write a DataFrame to Redis as JSON documents.
Parameters are identical to write_hashes.
write_strings¶
def write_strings(
df: pl.DataFrame,
url: str,
key_column: str | None = "_key",
value_column: str = "value",
ttl: int | None = None,
key_prefix: str = "",
if_exists: str = "replace",
) -> int
Write a DataFrame to Redis as string values.
Parameters:
- `df`: DataFrame to write
- `url`: Redis connection URL
- `key_column`: Column with Redis keys, or `None` for auto-generated keys
- `value_column`: Column with values to write
- `ttl`: TTL in seconds (optional)
- `key_prefix`: Prefix for all keys
- `if_exists`: `"fail"`, `"replace"`, or `"append"`
Returns: Number of keys written
write_sets¶
def write_sets(
df: pl.DataFrame,
url: str,
key_column: str | None = "_key",
member_column: str = "member",
ttl: int | None = None,
key_prefix: str = "",
if_exists: str = "replace",
) -> int
Write a DataFrame to Redis as sets.
Parameters:
- `df`: DataFrame to write
- `url`: Redis connection URL
- `key_column`: Column with Redis keys, or `None` for auto-generated keys
- `member_column`: Column with member values
- `ttl`: TTL in seconds (optional)
- `key_prefix`: Prefix for all keys
- `if_exists`: `"fail"`, `"replace"`, or `"append"`
Returns: Number of keys written
write_lists¶
def write_lists(
df: pl.DataFrame,
url: str,
key_column: str | None = "_key",
element_column: str = "element",
ttl: int | None = None,
key_prefix: str = "",
if_exists: str = "replace",
) -> int
Write a DataFrame to Redis as lists.
Parameters:
- `df`: DataFrame to write
- `url`: Redis connection URL
- `key_column`: Column with Redis keys, or `None` for auto-generated keys
- `element_column`: Column with element values
- `ttl`: TTL in seconds (optional)
- `key_prefix`: Prefix for all keys
- `if_exists`: `"fail"`, `"replace"`, or `"append"`
Returns: Number of keys written
write_zsets¶
def write_zsets(
df: pl.DataFrame,
url: str,
key_column: str | None = "_key",
member_column: str = "member",
score_column: str = "score",
ttl: int | None = None,
key_prefix: str = "",
if_exists: str = "replace",
) -> int
Write a DataFrame to Redis as sorted sets.
Parameters:
- `df`: DataFrame to write
- `url`: Redis connection URL
- `key_column`: Column with Redis keys, or `None` for auto-generated keys
- `member_column`: Column with member values
- `score_column`: Column with score values
- `ttl`: TTL in seconds (optional)
- `key_prefix`: Prefix for all keys
- `if_exists`: `"fail"`, `"replace"`, or `"append"`
Returns: Number of keys written
Detailed Write Functions¶
write_hashes_detailed¶
def write_hashes_detailed(
df: pl.DataFrame,
url: str,
key_column: str | None = "_key",
ttl: int | None = None,
key_prefix: str = "",
if_exists: str = "replace",
) -> WriteResult
Write a DataFrame to Redis as hashes with detailed error reporting.
Parameters: Same as write_hashes.
Returns: WriteResult object with per-key success/failure information.
WriteResult¶
class WriteResult:
keys_written: int # Number of keys successfully written
keys_failed: int # Number of keys that failed
keys_skipped: int # Number of keys skipped (if_exists="fail")
succeeded_keys: list[str] # List of successfully written keys
failed_keys: list[str] # List of keys that failed
errors: dict[str, str] # Map of failed keys to error messages
def is_complete_success(self) -> bool:
"""Check if all keys were written successfully."""
Query Builder¶
col¶
Create a column expression for building RediSearch queries.
Parameters:
- `name`: Field name to query
Returns: Expr object
cols¶
Create a multi-field expression for searching across multiple fields.
Parameters:
- `*names`: Field names to search across
Returns: MultiFieldExpr object
raw¶
Create a raw RediSearch query expression.
Parameters:
- `query`: Raw RediSearch query string
Returns: Expr object
Expr Methods¶
Comparison Operators¶
| Operator | Description | Example |
|---|---|---|
| `>` | Greater than | `col("age") > 30` |
| `>=` | Greater or equal | `col("age") >= 30` |
| `<` | Less than | `col("age") < 30` |
| `<=` | Less or equal | `col("age") <= 30` |
| `==` | Equal | `col("status") == "active"` |
| `!=` | Not equal | `col("status") != "deleted"` |
Logical Operators¶
| Operator | Description | Example |
|---|---|---|
| `&` | AND | `(col("a") > 1) & (col("b") < 2)` |
| `\|` | OR | `(col("a") == 1) \| (col("a") == 2)` |
| `~` | NOT | `~(col("status") == "deleted")` |
Range and Membership¶
Text Search¶
Expr.contains(text: str) -> Expr # Full-text search
Expr.starts_with(prefix: str) -> Expr # Prefix match
Expr.ends_with(suffix: str) -> Expr # Suffix match
Expr.contains_substring(text: str) -> Expr # Infix/substring match
Expr.matches(pattern: str) -> Expr # Wildcard match
Expr.matches_exact(pattern: str) -> Expr # Exact wildcard match
Expr.fuzzy(term: str, distance: int = 1) -> Expr # Fuzzy match (distance 1-3)
Expr.phrase(*words: str, slop: int | None = None, inorder: bool | None = None) -> Expr
Tag Operations¶
Geo Operations¶
Expr.within_radius(lon: float, lat: float, radius: float, unit: str = "km") -> Expr
Expr.within_polygon(points: list[tuple[float, float]]) -> Expr
Vector Search¶
Expr.knn(k: int, vector_param: str = "query_vec") -> Expr
Expr.vector_range(radius: float, vector_param: str = "query_vec") -> Expr
Relevance¶
Expr.boost(weight: float) -> Expr # Increase relevance weight
Expr.optional() -> Expr # Mark as optional (prefer but don't require)
Null Checks¶
Query Output¶
MultiFieldExpr Methods¶
MultiFieldExpr.contains(text: str) -> Expr # Search across all fields
MultiFieldExpr.starts_with(prefix: str) -> Expr # Prefix match across fields
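Combining the pieces above into one query expression — a sketch assuming `col` and `cols` are importable from the same package (`polars_redis` is an assumed name):

```python
def active_mid_age_query():
    from polars_redis import col, cols  # assumed import path

    # Active users aged 30-50 whose name or bio mentions "redis".
    return (
        (col("status") == "active")
        & (col("age") >= 30)
        & (col("age") <= 50)
        & cols("name", "bio").contains("redis")
    )
```

Each comparison is parenthesized because Python's `&` binds tighter than `==` and `>=`.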
DataFrame Caching¶
cache_dataframe¶
def cache_dataframe(
df: pl.DataFrame,
url: str,
key: str,
*,
format: Literal["ipc", "parquet"] = "ipc",
compression: str | None = None,
compression_level: int | None = None,
ttl: int | None = None,
) -> int
Cache a DataFrame in Redis using Arrow IPC or Parquet format.
Parameters:
- `df`: DataFrame to cache
- `url`: Redis connection URL
- `key`: Redis key for storage
- `format`: Serialization format (`"ipc"` or `"parquet"`)
- `compression`: Compression codec (IPC: `lz4`, `zstd`; Parquet: `snappy`, `gzip`, `lz4`, `zstd`)
- `compression_level`: Compression level (codec-specific)
- `ttl`: TTL in seconds (optional)
Returns: Number of bytes written
get_cached_dataframe¶
def get_cached_dataframe(
url: str,
key: str,
*,
format: Literal["ipc", "parquet"] = "ipc",
columns: list[str] | None = None,
n_rows: int | None = None,
) -> pl.DataFrame | None
Retrieve a cached DataFrame from Redis.
Parameters:
- `url`: Redis connection URL
- `key`: Redis key to retrieve
- `format`: Serialization format (must match the format used with `cache_dataframe`)
- `columns`: Columns to read (Parquet only)
- `n_rows`: Maximum rows to read (Parquet only)
Returns: DataFrame or None if key doesn't exist
scan_cached¶
def scan_cached(
url: str,
key: str,
*,
format: Literal["ipc", "parquet"] = "ipc",
) -> pl.LazyFrame | None
Retrieve cached data as a LazyFrame.
Returns: LazyFrame or None if key doesn't exist
delete_cached¶
Delete a cached DataFrame from Redis.
Returns: True if deleted, False if key didn't exist
cache_exists¶
Check if a cached DataFrame exists in Redis.
cache_ttl¶
Get the remaining TTL of a cached DataFrame.
Returns: TTL in seconds, or None if no TTL or key doesn't exist
Index Management¶
Index¶
class Index:
def __init__(
self,
name: str,
prefix: str | list[str] = "",
schema: list[Field] = [],
on: Literal["HASH", "JSON"] = "HASH",
stopwords: list[str] | None = None,
language: str | None = None,
language_field: str | None = None,
score: float | None = None,
score_field: str | None = None,
payload_field: str | None = None,
maxtextfields: bool = False,
nooffsets: bool = False,
nohl: bool = False,
nofields: bool = False,
nofreqs: bool = False,
skipinitialscan: bool = False,
)
RediSearch index definition.
Parameters:
- `name`: Index name
- `prefix`: Key prefix(es) to index
- `schema`: List of field definitions
- `on`: Data type (`"HASH"` or `"JSON"`)
- `stopwords`: Custom stopwords list (an empty list disables stopwords)
- `language`: Default language for stemming
- `language_field`: Field containing the per-document language
- `score`: Default document score
- `score_field`: Field containing the per-document score
- `payload_field`: Field to use as the document payload
- `maxtextfields`: Optimize for many TEXT fields
- `nooffsets`: Don't store term offsets (saves memory)
- `nohl`: Don't store data for highlighting
- `nofields`: Don't store field names
- `nofreqs`: Don't store term frequencies
- `skipinitialscan`: Don't scan existing keys when creating
Methods:
Index.create(url: str, *, if_not_exists: bool = False) -> None
Index.drop(url: str, *, delete_docs: bool = False) -> None
Index.exists(url: str) -> bool
Index.ensure_exists(url: str, *, recreate: bool = False) -> Index
Index.info(url: str) -> IndexInfo | None
Index.diff(url: str) -> IndexDiff
Index.migrate(url: str, *, drop_existing: bool = False) -> bool
Index.validate_schema(schema: dict) -> list[str]
# Class methods
Index.from_frame(df, name, prefix, *, text_fields=None, sortable=None, on="HASH") -> Index
Index.from_schema(schema, name, prefix, *, text_fields=None, sortable=None, on="HASH") -> Index
Index.from_redis(url: str, name: str) -> Index | None
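A sketch of defining and creating an index with the field types documented below (the `polars_redis` import name, the index name, and the field choices are assumptions):

```python
def ensure_user_index(url: str = "redis://localhost:6379"):
    from polars_redis import Index, NumericField, TagField, TextField  # assumed

    idx = Index(
        "idx:users",
        prefix="user:",
        schema=[
            TextField("name", sortable=True),
            NumericField("age", sortable=True),
            TagField("city"),
        ],
        on="HASH",
    )
    idx.create(url, if_not_exists=True)  # no-op if it already exists
    return idx
```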
Field Types¶
TextField¶
class TextField(Field):
def __init__(
self,
name: str,
sortable: bool = False,
nostem: bool = False,
weight: float = 1.0,
phonetic: str | None = None,
noindex: bool = False,
withsuffixtrie: bool = False,
)
Full-text search field with stemming and scoring.
NumericField¶
class NumericField(Field):
def __init__(
self,
name: str,
sortable: bool = False,
noindex: bool = False,
)
Numeric field for range queries.
TagField¶
class TagField(Field):
def __init__(
self,
name: str,
separator: str = ",",
casesensitive: bool = False,
sortable: bool = False,
noindex: bool = False,
withsuffixtrie: bool = False,
)
Exact-match field for categories and tags.
GeoField¶
Geographic field for radius queries.
VectorField¶
class VectorField(Field):
def __init__(
self,
name: str,
algorithm: Literal["FLAT", "HNSW"] = "HNSW",
dim: int = 384,
distance_metric: Literal["COSINE", "L2", "IP"] = "COSINE",
initial_cap: int | None = None,
m: int | None = None,
ef_construction: int | None = None,
ef_runtime: int | None = None,
block_size: int | None = None,
)
Vector field for similarity search.
GeoShapeField¶
class GeoShapeField(Field):
def __init__(
self,
name: str,
coord_system: Literal["SPHERICAL", "FLAT"] = "SPHERICAL",
)
Polygon/geometry field for complex geo queries.
IndexInfo¶
class IndexInfo:
name: str
num_docs: int
max_doc_id: int
num_terms: int
num_records: int
inverted_sz_mb: float
total_inverted_index_blocks: int
offset_vectors_sz_mb: float
doc_table_size_mb: float
sortable_values_size_mb: float
key_table_size_mb: float
records_per_doc_avg: float
bytes_per_record_avg: float
offsets_per_term_avg: float
offset_bits_per_record_avg: float
hash_indexing_failures: int
indexing: bool
percent_indexed: float
fields: list[dict]
prefixes: list[str]
on_type: str
Information about an existing RediSearch index.
IndexDiff¶
class IndexDiff:
added: list[Field] # Fields to be added
removed: list[str] # Fields to be removed
changed: dict[str, tuple] # Fields with type changes
unchanged: list[str] # Unchanged fields
@property
def has_changes(self) -> bool
Differences between desired and existing index schemas.
Smart Scan (Auto-detection)¶
smart_scan¶
def smart_scan(
url: str,
pattern: str = "*",
schema: dict | None = None,
*,
index: str | Index | None = None,
include_key: bool = True,
key_column_name: str = "_key",
include_ttl: bool = False,
ttl_column_name: str = "_ttl",
batch_size: int = 1000,
auto_detect_index: bool = True,
) -> pl.LazyFrame
Smart scan that automatically detects and uses RediSearch indexes when available.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match (e.g., `"user:*"`)
- `schema`: Dictionary mapping field names to Polars dtypes (required)
- `index`: Force use of a specific index (name or `Index` object)
- `include_key`: Include the Redis key as a column
- `key_column_name`: Name of the key column
- `include_ttl`: Include the TTL as a column
- `ttl_column_name`: Name of the TTL column
- `batch_size`: Documents per batch
- `auto_detect_index`: Auto-detect matching indexes (default: `True`)
Returns: pl.LazyFrame
explain_scan¶
def explain_scan(
url: str,
pattern: str = "*",
schema: dict | None = None,
filter_expr: pl.Expr | None = None,
) -> QueryPlan
Explain how a scan would be executed without running it.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match
- `schema`: Schema dictionary
- `filter_expr`: Optional Polars filter expression
Returns: QueryPlan object
find_index_for_pattern¶
Find a RediSearch index that covers the given key pattern.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern (e.g., `"user:*"`)
Returns: DetectedIndex if found, None otherwise
list_indexes¶
List all RediSearch indexes.
Parameters:
- `url`: Redis connection URL
Returns: List of DetectedIndex objects
ExecutionStrategy¶
class ExecutionStrategy(Enum):
SEARCH = "search" # Use FT.SEARCH with index
SCAN = "scan" # Use SCAN without index
HYBRID = "hybrid" # Use FT.SEARCH + client-side filtering
Enum representing query execution strategies.
DetectedIndex¶
@dataclass
class DetectedIndex:
name: str # Index name
prefixes: list[str] # Key prefixes covered
on_type: str # "HASH" or "JSON"
fields: list[str] # Indexed field names
Information about an auto-detected index.
QueryPlan¶
@dataclass
class QueryPlan:
strategy: ExecutionStrategy
index: DetectedIndex | None = None
server_query: str | None = None
client_filters: list[str] = field(default_factory=list)
warnings: list[str] = field(default_factory=list)
def explain(self) -> str:
"""Return human-readable explanation of the query plan."""
Execution plan for a query.
Schema Inference¶
infer_hash_schema¶
def infer_hash_schema(
url: str,
pattern: str = "*",
*,
sample_size: int = 100,
type_inference: bool = True,
) -> dict[str, type[pl.DataType]]
Infer schema from Redis hashes by sampling keys.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to sample
- `sample_size`: Maximum keys to sample
- `type_inference`: Infer types (otherwise all fields are `Utf8`)
Returns: Dictionary mapping field names to Polars dtypes
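A sketch chaining inference into a scan (the `polars_redis` import name and the sample size are assumptions):

```python
def discover(url: str = "redis://localhost:6379"):
    from polars_redis import infer_hash_schema, scan_hashes  # assumed

    # Sample 200 keys to guess field dtypes, then scan with the result.
    schema = infer_hash_schema(url, "user:*", sample_size=200)
    return scan_hashes(url, "user:*", schema=schema).collect()
```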
infer_json_schema¶
def infer_json_schema(
url: str,
pattern: str = "*",
*,
sample_size: int = 100,
) -> dict[str, type[pl.DataType]]
Infer schema from RedisJSON documents by sampling keys.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to sample
- `sample_size`: Maximum keys to sample
Returns: Dictionary mapping field names to Polars dtypes
Utility Functions¶
scan_keys¶
Scan Redis keys matching a pattern.
Parameters:
- `url`: Redis connection URL
- `pattern`: Key pattern to match
- `count`: Maximum keys to return (optional)
Returns: List of matching keys