Query language

Bindu accepts a small query language for selecting, filtering, aggregating, searching, and editing compressed files directly — without decompressing them. It’s deliberately narrower than SQL: the goal is predicate pushdown and symbol-space lookups against compressed coordinates, not a full query engine.

This is the user-facing surface of Bindu’s computable property — operating on compressed data without a decompression round-trip.

Search and edit without decompressing

The canonical demo:

# Find every occurrence of "Alice" across the compressed archive
bindu search archive.bindu --term "Alice"

# Rename Alice to George — everywhere — without unpacking
bindu edit archive.bindu --replace "Alice" --with "George"

The search runs against the symbol table directly. The edit rewrites only the affected symbol entries; the rest of the archive is untouched. Both operations pay coordinate-space cost, not the cost of a full decompress-modify-recompress cycle.
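To make the symbol-table idea concrete, here is a minimal Python sketch of a dictionary-coded archive. It is an illustration under stated assumptions, not Bindu's actual on-disk format: `ToyArchive`, its fields, and its methods are hypothetical names, and real archives will be far more elaborate. The point it shows is that a rename touches one table entry while the encoded body stays byte-for-byte the same.

```python
from dataclasses import dataclass, field


@dataclass
class ToyArchive:
    """Hypothetical dictionary-coded text: each distinct token is stored
    once in `symbols`, and `body` is a list of ids into that table."""
    symbols: list = field(default_factory=list)
    body: list = field(default_factory=list)

    @classmethod
    def compress(cls, text: str) -> "ToyArchive":
        archive = cls()
        index = {}
        for token in text.split():
            if token not in index:
                index[token] = len(archive.symbols)
                archive.symbols.append(token)
            archive.body.append(index[token])
        return archive

    def search(self, term: str) -> list:
        """Return body positions of `term` by comparing ids, not strings."""
        if term not in self.symbols:
            return []
        sym_id = self.symbols.index(term)
        return [pos for pos, s in enumerate(self.body) if s == sym_id]

    def replace(self, old: str, new: str) -> int:
        """Rename one symbol-table entry; returns the number of entries changed.
        Simplification: does not merge with an existing `new` entry."""
        if old not in self.symbols:
            return 0
        self.symbols[self.symbols.index(old)] = new
        return 1

    def decompress(self) -> str:
        return " ".join(self.symbols[s] for s in self.body)


archive = ToyArchive.compress("Alice met Bob and Alice left")
hits = archive.search("Alice")            # positions found via the symbol id
body_before = list(archive.body)
changed = archive.replace("Alice", "George")
```

After the rename, `archive.body` is identical to `body_before`; only the table changed, yet `archive.decompress()` reflects the new name everywhere.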

The same primitives generalize: search a compressed telemetry archive for “every left turn at 30°”, search a compressed log archive for an error template, search a compressed monorepo for an AST shape. If the thing you’re looking for has a symbolic representation, you can search for it without decompressing.

Expressions

Literals:

42         -- int
3.14       -- float
"hello"    -- string
true       -- bool
null
2026-01-23T00:00:00Z   -- timestamp (RFC 3339)

Column references:

service
quality.perplexity       -- dotted for nested fields
metadata["user_id"]      -- bracketed for map keys
tags[0]                  -- bracketed for array indices
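
As a rough mental model of how these three reference forms address a record, the Python sketch below walks a reference through a nested dictionary. The `resolve` helper is hypothetical and is not part of Bindu; in particular, returning `None` for a missing field is an assumption about the semantics, not documented behavior.

```python
import re

# One step of a reference: a (possibly dotted) name, ["key"], or [index].
_STEP = re.compile(r'\.?([A-Za-z_]\w*)|\["([^"]*)"\]|\[(\d+)\]')


def resolve(record, ref):
    """Walk `ref` through `record`; return None if any step is missing
    (assumed behavior, chosen for illustration)."""
    value = record
    pos = 0
    while pos < len(ref):
        m = _STEP.match(ref, pos)
        if m is None:
            raise ValueError(f"cannot parse reference at {ref[pos:]!r}")
        name, key, index = m.groups()
        try:
            if index is not None:
                value = value[int(index)]          # array index
            else:
                value = value[name if name is not None else key]
        except (KeyError, IndexError, TypeError):
            return None
        pos = m.end()
    return value


record = {
    "service": "checkout",
    "quality": {"perplexity": 12.5},
    "metadata": {"user_id": "u-17"},
    "tags": ["prod", "eu"],
}
```

With this record, `resolve(record, "quality.perplexity")` follows the dotted path, `resolve(record, 'metadata["user_id"]')` looks up a map key, and `resolve(record, "tags[0]")` indexes the array.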

Operators:

== != < <= > >=
&& || !
+ - * / %
in, not in, matches (regex)

Examples:

service == "checkout"
level in ("error", "fatal")
ts >= 2026-01-01T00:00:00Z && ts < 2026-02-01T00:00:00Z
msg matches "timeout.*"
amount_cents > 10000 && service != "marketing"

The `--where` flag

--where EXPR accepts any boolean expression. Predicate pushdown applies automatically — Bindu skips files, chunks, and columns that can’t produce matches based on Bloom filters and per-chunk min/max statistics.

bindu query logs.bindu --where 'service == "checkout" && level == "error"'
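
The pruning decision can be sketched in a few lines of Python. This is a simplified model, not Bindu's implementation: `ChunkStats` and `may_match` are hypothetical names, timestamps are plain integers, and an exact set stands in for the Bloom filter (a real Bloom filter may report false positives but never false negatives, so skipping on a miss remains safe).

```python
from dataclasses import dataclass


@dataclass
class ChunkStats:
    name: str
    min_ts: int          # smallest timestamp in the chunk
    max_ts: int          # largest timestamp in the chunk
    services: frozenset  # stand-in for a Bloom filter over `service`


def may_match(chunk, service, ts_from, ts_to):
    """True if the chunk could hold rows with this service and
    ts_from <= ts < ts_to. False means the chunk is safe to skip."""
    if service not in chunk.services:
        return False     # membership filter rules the chunk out
    if chunk.max_ts < ts_from or chunk.min_ts >= ts_to:
        return False     # min/max range does not overlap the predicate
    return True


chunks = [
    ChunkStats("c0", 100, 199, frozenset({"checkout", "search"})),
    ChunkStats("c1", 200, 299, frozenset({"marketing"})),
    ChunkStats("c2", 300, 399, frozenset({"checkout"})),
]

# Equivalent of: service == "checkout" && ts >= 150 && ts < 300
to_scan = [c.name for c in chunks if may_match(c, "checkout", 150, 300)]
```

Here only `c0` survives: `c1` is dropped by the membership check and `c2` by its minimum timestamp. Statistics can only prove that a chunk has no matches; a surviving chunk still has to be scanned row by row.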

The `--select` flag

Comma-separated list of columns and expressions:

bindu query logs.bindu --select 'ts, service, msg'
bindu query logs.bindu --select 'service, count(*), avg(latency_ms)' --group-by service

Computed columns use simple expressions:

bindu query logs.bindu --select 'ts, amount_cents / 100.0 as amount_usd'

The `--group-by` flag

Group and aggregate:

bindu query logs.bindu \
  --select 'service, count(*), p99(latency_ms)' \
  --group-by service

Supported aggregates: count, sum, avg, min, max, p50, p90, p95, p99, approx_count_distinct.
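
For readers who want to check results by hand, the following Python sketch reproduces what a `count(*)` and `p99(...)` grouping computes. It assumes the nearest-rank percentile definition; this page does not say which percentile method Bindu uses, so treat that choice, along with the helper names `percentile` and `group_count_p99`, as assumptions.

```python
import math
from collections import defaultdict


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p percent
    of the data at or below it (an assumed definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def group_count_p99(rows, key, column):
    """Group rows by `key`, then report the row count and p99 of `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {
        k: {"count": len(v), "p99": percentile(v, 99)}
        for k, v in groups.items()
    }


rows = [
    {"service": "checkout", "latency_ms": 120},
    {"service": "checkout", "latency_ms": 80},
    {"service": "checkout", "latency_ms": 950},
    {"service": "search", "latency_ms": 40},
]
result = group_count_p99(rows, "service", "latency_ms")
```

With so few rows per group, the nearest-rank p99 is simply the largest latency in each group, which is why small samples make tail percentiles look noisy.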

Sorting & limits

bindu query logs.bindu --order-by 'ts desc' --limit 100

The `--explain` flag

bindu query logs.bindu --where '...' --explain

Prints the query plan, including which files/chunks would be scanned and which are pruned by indexes. Use this to validate that a predicate is pushing down as expected.

What this language is not

No joins. Join upstream of Bindu (e.g., in DuckDB reading .bindu via the adapter).
No subqueries. Same remedy: run them upstream of Bindu.
No window functions. Same remedy.
No writes. bindu query is a read path. In-place symbol edits go through bindu edit (see above); other rewrites happen through bindu compress or bindu erase.

For full SQL, use the DuckDB bridge: duckdb -c "SELECT ... FROM 'archive.bindu'".