4 releases (1 stable)
Uses new Rust 2024
| Version | Released |
|---|---|
| 1.0.0 (new) | Jun 6, 2026 |
| 0.5.0 | Jun 6, 2026 |
| 0.4.0 | Jun 6, 2026 |
| 0.3.0 | Jun 6, 2026 |
#3 in Algorithms
30 downloads per month
Used in 5 crates
(4 directly)
85KB
913 lines
iqdb-filter
iQDB HYBRID FILTERING
iqdb-filter is the metadata-filtering layer for the iQDB vector-database spine. It is the one place that decides what a Filter means: every index that honours metadata filters delegates here, so the semantics can never drift between implementations.
It evaluates the Filter expression language defined in iqdb-types against a record's metadata, with strict, predictable closed-world rules and validation that bounds every filter before it runs.
MSRV is 1.87+ (Rust 2024 edition). Validate once, evaluate per-row. No panics on hostile input. ~19 ns to evaluate a compound predicate.
Status: stable (1.0). The public API is committed under SemVer for the 1.x series — no breaking changes until 2.0. See CHANGELOG.md.
What it does
- Canonical evaluator: one implementation of `Filter` semantics, shared by every metadata-aware index so query results never diverge
- Validate once, evaluate many: `FilterEvaluator::new` checks the filter (depth, `In` cardinality) a single time; `evaluate` is then infallible and runs per-row inside the search loop
- Closed-world semantics: a leaf over an absent field is `false`, type mismatches are `false`, `NaN` orderings are `false`, and `Not` of a `false` leaf is `true` (the "records without this field" idiom)
- DoS-hardened: iterative validation that can't stack-overflow, with bounded depth and `In` width; the library never panics on adversarial input
- Scan helpers: `prefilter` / `postfilter` apply the evaluator as lazy, allocation-free iterator adapters over a stream of `(key, metadata)` pairs
- Strategy selection: a selectivity estimate drives an automatic `PreFilter` / `PostFilter` choice, with a tunable threshold
- Inverted index: an opt-in, per-field `MetadataIndex` resolves selective `Eq` / `In` predicates to a candidate set (a superset of true matches) and backs a sharper, count-based selectivity estimate
- First-party only: depends solely on `iqdb-types`, so it is unblocked today
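The closed-world rules listed above can be illustrated with a small, self-contained sketch. It does not use the real `iqdb-types` definitions: the `Value`, `Metadata`, and `Filter` types below are simplified, hypothetical stand-ins, and `evaluate` here is a plain recursive function rather than the crate's evaluator. It only aims to show why absent fields, type mismatches, and `NaN` orderings all come out `false`, and why `Not` over such a leaf is `true`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the value type defined in `iqdb-types`.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    Int(i64),
    Float(f64),
    Text(String),
}

/// Hypothetical stand-in for a record's metadata map.
pub type Metadata = HashMap<String, Value>;

/// Hypothetical, reduced filter language (assumed shape, not the real one).
#[derive(Debug, Clone)]
pub enum Filter {
    Eq(String, Value),
    Gt(String, Value),
    And(Vec<Filter>),
    Not(Box<Filter>),
}

/// Look a field up; `None` covers both "no metadata" and "field absent".
fn lookup<'a>(meta: Option<&'a Metadata>, field: &str) -> Option<&'a Value> {
    meta.and_then(|m| m.get(field))
}

/// Closed-world evaluation: a leaf is true only when it can be shown true.
pub fn evaluate(filter: &Filter, meta: Option<&Metadata>) -> bool {
    match filter {
        // Absent field -> false; differing variants (type mismatch) -> false.
        Filter::Eq(field, want) => match lookup(meta, field) {
            Some(have) => have == want,
            None => false,
        },
        // Only same-typed numeric comparisons can succeed.
        Filter::Gt(field, bound) => match (lookup(meta, field), bound) {
            (Some(Value::Int(a)), Value::Int(b)) => a > b,
            // Any comparison involving NaN is false under IEEE 754.
            (Some(Value::Float(a)), Value::Float(b)) => a > b,
            _ => false,
        },
        Filter::And(children) => children.iter().all(|c| evaluate(c, meta)),
        // `Not` of a false leaf is true: the "records without this field" idiom.
        Filter::Not(inner) => !evaluate(inner, meta),
    }
}
```

Under these assumptions, `Not(Eq("author", ...))` matches a record with no `author` field because the inner leaf is `false`, which mirrors the idiom described in the list above.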
Installation
```toml
[dependencies]
iqdb-filter = "1.0"
```
Quick start
Build an evaluator once, then test it against each record's metadata:
```rust
use iqdb_filter::FilterEvaluator;
use iqdb_types::{Filter, Metadata, Value};

// published == true AND year > 2000
let filter = Filter::and(vec![
    Filter::eq("published", Value::Bool(true)),
    Filter::gt("year", Value::Int(2000)),
]);
let evaluator = FilterEvaluator::new(filter).expect("valid filter");

let meta: Metadata = [
    ("published".to_string(), Value::Bool(true)),
    ("year".to_string(), Value::Int(2026)),
]
.into_iter()
.collect();

assert!(evaluator.evaluate(Some(&meta)));
assert!(!evaluator.evaluate(None)); // no metadata -> every leaf is false
```
The Not / absent-field idiom selects records that lack a field, or carry it with a non-matching value:
```rust
use iqdb_filter::FilterEvaluator;
use iqdb_types::{Filter, Value};

// "records that are not authored by ada", including records with no author.
let evaluator =
    FilterEvaluator::new(Filter::not(Filter::eq("author", Value::String("ada".into()))))
        .expect("valid filter");

assert!(evaluator.evaluate(None));
```
Validation rejects pathological filters up front — bounded by the public caps:
```rust
use iqdb_filter::{FilterEvaluator, MAX_IN_VALUES};
use iqdb_types::{Filter, IqdbError, Value};

// An `In` set wider than the cap is refused before it can slow a query.
let huge = vec![Value::Int(0); MAX_IN_VALUES + 1];
let err = FilterEvaluator::new(Filter::is_in("tag", huge)).unwrap_err();
assert_eq!(err, IqdbError::InvalidFilter);
```
Apply a strategy with the scan helpers, or let the selectivity estimate pick one:
```rust
use iqdb_filter::{FilterEvaluator, FilterStrategy, choose_strategy};
use iqdb_types::{Filter, Metadata, Value};

let evaluator = FilterEvaluator::new(Filter::eq("lang", Value::String("rust".into())))
    .expect("valid filter");

// `prefilter` keeps the keys of matching candidates, lazily, before scoring.
let rust: Metadata = [("lang".to_string(), Value::String("rust".into()))]
    .into_iter()
    .collect();
let go: Metadata = [("lang".to_string(), Value::String("go".into()))]
    .into_iter()
    .collect();
let rows = [(0_usize, Some(&rust)), (1, Some(&go))];
let kept: Vec<usize> = evaluator.prefilter(rows).collect();
assert_eq!(kept, [0]);

// An equality predicate is narrow, so the selector recommends pre-filtering.
assert_eq!(choose_strategy(&evaluator), FilterStrategy::PreFilter);
```
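How a selectivity estimate might drive the pre/post choice can be sketched in isolation. Everything below is an assumption made for illustration: the `Predicate` shape, the per-leaf constants (0.1 for equality, 0.3 for a range), the independence assumption used to combine children, and the `estimate` / `pick_strategy` names are all hypothetical and are not taken from the crate. The only idea carried over from the text is that a narrow predicate favours pre-filtering and that the cut-off is a tunable threshold.

```rust
/// Hypothetical stand-in for the crate's strategy enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strategy {
    PreFilter,
    PostFilter,
}

/// Hypothetical predicate shape carrying just enough to estimate selectivity.
pub enum Predicate {
    Eq,
    /// `In` with the given number of values.
    In(usize),
    Range,
    And(Vec<Predicate>),
    Or(Vec<Predicate>),
    Not(Box<Predicate>),
}

/// Estimated fraction of rows that match, in `0.0..=1.0`.
/// The leaf constants are illustrative guesses, not measured values.
pub fn estimate(p: &Predicate) -> f64 {
    match p {
        Predicate::Eq => 0.1,
        Predicate::In(n) => (0.1 * *n as f64).min(1.0),
        Predicate::Range => 0.3,
        // Treat children as independent: probabilities multiply.
        Predicate::And(children) => children.iter().map(estimate).product::<f64>(),
        // Complement of "no child matches".
        Predicate::Or(children) => {
            1.0 - children.iter().map(|c| 1.0 - estimate(c)).product::<f64>()
        }
        Predicate::Not(inner) => 1.0 - estimate(inner),
    }
}

/// Narrow filters (at or below `threshold`) are applied before scoring;
/// broad filters are applied to the scored results afterwards.
pub fn pick_strategy(p: &Predicate, threshold: f64) -> Strategy {
    if estimate(p) <= threshold {
        Strategy::PreFilter
    } else {
        Strategy::PostFilter
    }
}
```

With a threshold of 0.25 in this sketch, a single equality leaf selects `PreFilter`, while its negation (estimated at 0.9) selects `PostFilter`.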
For repeated queries, build an opt-in MetadataIndex so a selective predicate resolves to a candidate set instead of scanning every row:
```rust
use iqdb_filter::{FilterEvaluator, MetadataIndex};
use iqdb_types::{Filter, Metadata, Value};

let rows = [
    (0_usize, [("lang".to_string(), Value::String("rust".into()))].into_iter().collect::<Metadata>()),
    (1, [("lang".to_string(), Value::String("go".into()))].into_iter().collect::<Metadata>()),
    (2, [("lang".to_string(), Value::String("rust".into()))].into_iter().collect::<Metadata>()),
];

// Index only the `lang` field.
let index = MetadataIndex::build(&["lang"], rows.iter().map(|(k, m)| (*k, Some(m))));

let evaluator = FilterEvaluator::new(Filter::eq("lang", Value::String("rust".into())))
    .expect("valid filter");

// `candidates` returns a superset of true matches; confirm with `evaluate`.
let mut hits: Vec<usize> = match index.candidates(&evaluator) {
    Some(cands) => cands,
    None => (0..rows.len()).collect(), // unbounded predicate -> full scan
};
hits.sort_unstable();
assert_eq!(hits, [0, 2]);
```
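The idea behind a per-field inverted index can be sketched with the standard library alone. This is not the crate's `MetadataIndex`: the `FieldIndex` type, its string-only values, and its method names are hypothetical and chosen for brevity. It shows the two roles described above, resolving an `Eq` / `In` predicate to a candidate set via posting lists, and deriving a count-based selectivity from the same data.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical single-field inverted index: value -> keys of rows holding it.
pub struct FieldIndex {
    postings: HashMap<String, BTreeSet<usize>>,
    total: usize,
}

impl FieldIndex {
    /// Build from `(key, value)` pairs; `None` means the row lacks the field.
    pub fn build<'a>(rows: impl IntoIterator<Item = (usize, Option<&'a str>)>) -> Self {
        let mut postings: HashMap<String, BTreeSet<usize>> = HashMap::new();
        let mut total = 0;
        for (key, value) in rows {
            total += 1;
            if let Some(v) = value {
                postings.entry(v.to_string()).or_default().insert(key);
            }
        }
        Self { postings, total }
    }

    /// Candidates for `field IN values`: the union of the posting lists.
    /// A one-element slice models `Eq`.
    pub fn candidates_in(&self, values: &[&str]) -> BTreeSet<usize> {
        values
            .iter()
            .filter_map(|v| self.postings.get(*v))
            .flatten()
            .copied()
            .collect()
    }

    /// Count-based selectivity: matching rows divided by all indexed rows.
    pub fn selectivity_in(&self, values: &[&str]) -> f64 {
        if self.total == 0 {
            return 0.0;
        }
        self.candidates_in(values).len() as f64 / self.total as f64
    }
}
```

In this simplified form the candidate set is exact for a lone `Eq` or `In`. Once such a leaf is only one part of a larger compound filter, the set becomes a superset of the true matches, which is why each candidate would still need to be confirmed by the full evaluator.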
Errors
FilterEvaluator::new returns iqdb_types::Result; the only failure is
IqdbError::InvalidFilter, returned when a filter exceeds MAX_FILTER_DEPTH
nesting or carries an In node wider than MAX_IN_VALUES. After a filter is
validated, evaluate is infallible and never panics — including on records
with no metadata, type mismatches, and NaN values.
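One way to bound a filter without risking a stack overflow is to walk the tree with an explicit work stack instead of recursion. The sketch below is an assumption about how such a check could look, not the crate's code: the `Filter` shape, the `InvalidFilter` error type, and the numeric values given to `MAX_FILTER_DEPTH` and `MAX_IN_VALUES` are all hypothetical (the real caps are exported by the crate and their values are not stated here).

```rust
/// Hypothetical cap values; the crate's real constants may differ.
pub const MAX_FILTER_DEPTH: usize = 32;
pub const MAX_IN_VALUES: usize = 1024;

/// Hypothetical, reduced filter tree.
#[derive(Debug, Clone)]
pub enum Filter {
    Eq(String, i64),
    In(String, Vec<i64>),
    And(Vec<Filter>),
    Or(Vec<Filter>),
    Not(Box<Filter>),
}

/// Hypothetical stand-in for the `InvalidFilter` error.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidFilter;

/// Validate iteratively: native stack use stays constant no matter how
/// deeply the filter is nested, and the walk stops at the first violation.
pub fn validate(root: &Filter) -> Result<(), InvalidFilter> {
    // Each entry is a node plus its depth; the root sits at depth 1.
    let mut stack: Vec<(&Filter, usize)> = vec![(root, 1)];
    while let Some((node, depth)) = stack.pop() {
        if depth > MAX_FILTER_DEPTH {
            return Err(InvalidFilter);
        }
        match node {
            Filter::Eq(_, _) => {}
            Filter::In(_, values) => {
                if values.len() > MAX_IN_VALUES {
                    return Err(InvalidFilter);
                }
            }
            Filter::And(children) | Filter::Or(children) => {
                stack.extend(children.iter().map(|c| (c, depth + 1)));
            }
            Filter::Not(inner) => stack.push((inner.as_ref(), depth + 1)),
        }
    }
    Ok(())
}
```

Because the check runs once at construction, a later per-row evaluation can rely on the depth and `In` width already being bounded, which is the "validate once, evaluate per-row" split described above.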
Status
v1.0.0: stable. The full surface (the canonical `FilterEvaluator` with validate-on-construction and an infallible, allocation-free per-row `evaluate`; the `prefilter` / `postfilter` scan helpers; `estimate_selectivity` plus the selector, `choose_strategy` / `StrategySelector`; and the opt-in per-field `MetadataIndex`) is committed under SemVer for the 1.x series: no breaking changes until 2.0.

It is exercised by unit, integration, and property tests, a consumer-simulation suite that builds a filtered top-k searcher on the public API alone, and fuzz targets that drive the no-panic and superset contracts over unbounded input; all verified across the CI matrix (Linux, macOS, Windows) on stable and the 1.87 MSRV.

The one remaining feature, `InFilter` pushdown into graph traversal, is additive (`FilterStrategy` is `#[non_exhaustive]`) and will ship in a later 1.x release when an approximate-index consumer drives it (see the `ROADMAP`). The full surface is documented in `docs/API.md`.
Where It Fits
iqdb-filter sits just above the types crate and is consumed by the index layer:
- `iqdb-types`: the `Filter`, `Metadata`, and `Value` types it evaluates
- `iqdb-flat` / `iqdb-hnsw` / `iqdb-ivf`: delegate here for metadata filtering
- `iqdb`: exposes filtered search to users
Its only first-party dependency is iqdb-types, so it is unblocked today.
Standards
Built to the iQDB Rust standard. See REPS.md (Rust Efficiency & Performance Standards) and dev/DIRECTIVES.md for the engineering law and the definition of done. Before a PR: cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean.
License
Licensed under either of
- Apache License, Version 2.0 — LICENSE-APACHE
- MIT License — LICENSE-MIT
at your option.