Query language
Bindu accepts a small query language for selecting, filtering, aggregating, searching, and editing compressed files directly — without decompressing them. It’s deliberately narrower than SQL: the goal is predicate pushdown and symbol-space lookups against compressed coordinates, not a full query engine.
This is the user-facing surface of Bindu’s computable property — operating on compressed data without a decompression round-trip.
Search and edit without decompressing
The canonical demo:

    # Find every occurrence of "Alice" across the compressed archive
    bindu search archive.bindu --term "Alice"

    # Rename Alice to George everywhere, without unpacking
    bindu edit archive.bindu --replace "Alice" --with "George"

The search runs against the symbol table directly. The edit rewrites the affected symbol entries while the rest of the archive is untouched. Both operations pay coordinate-space cost, not full decompress-modify-recompress cost.
The same primitives generalize: search a compressed telemetry archive for “every left turn at 30°”, search a compressed log archive for an error template, search a compressed monorepo for an AST shape. If the thing you’re looking for has a symbolic representation, you can search for it without decompressing.
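To make the symbol-table idea concrete, here is a minimal sketch of searching and renaming in a toy dictionary-encoded archive. Everything in it is an assumption for illustration: the `ToyArchive` class, its `symbols` and `stream` fields, and the token-level granularity are hypothetical and do not describe Bindu's actual on-disk format.

```python
from dataclasses import dataclass, field


@dataclass
class ToyArchive:
    """Hypothetical dictionary-encoded archive (not Bindu's real format)."""
    symbols: list = field(default_factory=list)  # symbol table: id -> text
    stream: list = field(default_factory=list)   # the data, stored as symbol ids

    @classmethod
    def encode(cls, tokens):
        """Build the symbol table and id stream from a list of tokens."""
        archive = cls()
        ids = {}
        for token in tokens:
            if token not in ids:
                ids[token] = len(archive.symbols)
                archive.symbols.append(token)
            archive.stream.append(ids[token])
        return archive

    def search(self, term):
        """Return positions of `term` by comparing ids, never decoding text."""
        if term not in self.symbols:
            return []
        target = self.symbols.index(term)
        return [pos for pos, sym in enumerate(self.stream) if sym == target]

    def replace(self, old, new):
        """Rename a symbol by rewriting one table entry; the id stream is untouched.

        Returns the number of occurrences affected. Assumes `new` is not
        already a symbol; merging two symbols would need an id remap.
        """
        if old not in self.symbols:
            return 0
        if new in self.symbols:
            raise ValueError("merging into an existing symbol is not modeled here")
        target = self.symbols.index(old)
        self.symbols[target] = new
        return self.stream.count(target)

    def decode(self):
        """Full decompression, shown only to check the result."""
        return [self.symbols[i] for i in self.stream]


archive = ToyArchive.encode(["Alice", "met", "Bob", "and", "Alice", "left"])
print(archive.search("Alice"))             # [0, 4]
print(archive.replace("Alice", "George"))  # 2
print(archive.decode())  # ['George', 'met', 'Bob', 'and', 'George', 'left']
```

In this toy model the rename touches a single table entry regardless of how many times the symbol occurs, which is the intuition behind paying coordinate-space cost instead of decompress-modify-recompress cost.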
Expressions
Literals:

    42                      -- int
    3.14                    -- float
    "hello"                 -- string
    true                    -- bool
    null
    2026-01-23T00:00:00Z    -- timestamp (RFC 3339)

Column references:

    service
    quality.perplexity      -- dotted for nested fields
    metadata["user_id"]     -- bracketed for map keys
    tags[0]                 -- bracketed for array indices

Operators:

    == != < <= > >=
    && || !
    + - * / %
    in, not in, matches (regex)

Examples:

    service == "checkout"
    level in ("error", "fatal")
    ts >= 2026-01-01 && ts < 2026-02-01
    msg matches "timeout.*"
    amount_cents > 10000 && service != "marketing"

The --where flag
--where EXPR accepts any boolean expression. Predicate pushdown applies automatically: Bindu skips files, chunks, and columns that can't produce matches, based on Bloom filters and per-chunk min/max statistics.
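As a rough illustration of min/max pruning, the sketch below decides which chunks a half-open range predicate (in the style of `ts >= A && ts < B`) could possibly match. The `ChunkStats` structure, the chunk names, and the integer values are hypothetical, and Bloom filters are not modeled; this is not Bindu's planner, only the general technique.

```python
from dataclasses import dataclass


@dataclass
class ChunkStats:
    """Hypothetical per-chunk summary for a single column."""
    name: str
    min_value: int
    max_value: int


def may_match_range(stats, low, high):
    """True if the chunk could hold a value v with low <= v < high."""
    return stats.max_value >= low and stats.min_value < high


def plan_scan(chunks, low, high):
    """Split chunk names into (scanned, pruned) for the predicate low <= col < high."""
    scanned = [c.name for c in chunks if may_match_range(c, low, high)]
    pruned = [c.name for c in chunks if not may_match_range(c, low, high)]
    return scanned, pruned


chunks = [
    ChunkStats("chunk-0", 100, 199),
    ChunkStats("chunk-1", 200, 299),
    ChunkStats("chunk-2", 300, 399),
]
scanned, pruned = plan_scan(chunks, low=250, high=300)
print("scan:", scanned)  # scan: ['chunk-1']
print("skip:", pruned)   # skip: ['chunk-0', 'chunk-2']
```

The check is conservative: a chunk that survives pruning may still contain no matching rows, so the predicate is evaluated again on whatever is actually scanned. A predicate in the same spirit on a real archive looks like this: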
    bindu query logs.bindu --where 'service == "checkout" && level == "error"'

The --select flag
Comma-separated list of columns and expressions:

    bindu query logs.bindu --select 'ts, service, msg'
    bindu query logs.bindu --select 'service, count(*), avg(latency_ms)'

Computed columns use simple expressions:

    bindu query logs.bindu --select 'ts, amount_cents / 100.0 as amount_usd'

The --group-by flag
Group and aggregate:

    bindu query logs.bindu \
      --select 'service, count(*), p99(latency_ms)' \
      --group-by service

Supported aggregates: count, sum, avg, min, max, p50, p90, p95, p99, approx_count_distinct.
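For readers who want to sanity-check results, the following sketch reproduces what a query such as `service, count(*), p99(latency_ms)` grouped by `service` computes, on a handful of in-memory rows. It assumes the nearest-rank definition of a percentile; the page does not say which definition Bindu uses or whether its percentiles are approximate, so treat the function names and the exact values as illustrative only.

```python
import math
from collections import defaultdict


def percentile_nearest_rank(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def group_by(rows, key, column):
    """Group rows by `key`, then compute count and p99 of `column` per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {
        name: {"count": len(vals), "p99": percentile_nearest_rank(vals, 99)}
        for name, vals in groups.items()
    }


# Sample rows invented for this example.
rows = [
    {"service": "checkout", "latency_ms": 120},
    {"service": "checkout", "latency_ms": 80},
    {"service": "checkout", "latency_ms": 400},
    {"service": "search", "latency_ms": 30},
    {"service": "search", "latency_ms": 50},
]
for service, stats in group_by(rows, "service", "latency_ms").items():
    print(service, stats)
# checkout {'count': 3, 'p99': 400}
# search {'count': 2, 'p99': 50}
```

With so few rows per group, p99 collapses to the maximum, which is worth keeping in mind when reading percentile output for low-traffic services.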
Sorting & limits
    bindu query logs.bindu --order-by 'ts desc' --limit 100

--explain

    bindu query logs.bindu --where '...' --explain

Prints the query plan, including which files/chunks would be scanned and which are pruned by indexes. Use this to validate that a predicate is pushing down as expected.
What this language is not
- No joins. Join upstream of Bindu (e.g., in DuckDB reading .bindu via the adapter).
- No subqueries. Same reason.
- No window functions. Same reason.
- No writes. This is a read path. Rewrites happen through bindu compress or bindu erase.
For full SQL, use the DuckDB bridge: duckdb -c "SELECT ... FROM 'archive.bindu'".