Query language

Bindu accepts a small query language for selecting, filtering, aggregating, searching, and editing compressed files directly — without decompressing them. It’s deliberately narrower than SQL: the goal is predicate pushdown and symbol-space lookups against compressed coordinates, not a full query engine.

This is the user-facing surface of Bindu’s computable property — operating on compressed data without a decompression round-trip.

The canonical demo:

# Find every occurrence of "Alice" across the compressed archive
bindu search archive.bindu --term "Alice"
# Rename Alice to George — everywhere — without unpacking
bindu edit archive.bindu --replace "Alice" --with "George"

The search runs against the symbol table directly. The edit rewrites the affected symbol entries while the rest of the archive is untouched. Both operations pay coordinate-space cost, not full decompress-modify-recompress cost.
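To make the cost argument concrete, here is a toy model of symbol-space search and edit. It is only a sketch under stated assumptions: `ToyArchive`, `encode`, `search`, `replace`, and `decode` are hypothetical names, and the layout (a whitespace-token symbol table plus a stream of integer IDs) stands in for Bindu's real format, which this page does not describe.

```python
from dataclasses import dataclass


@dataclass
class ToyArchive:
    """Hypothetical dictionary-encoded text: a symbol table plus a stream of IDs."""
    symbols: list[str]  # symbol ID -> token
    stream: list[int]   # the document as a sequence of symbol IDs


def encode(text: str) -> ToyArchive:
    """Assign each distinct whitespace-separated token one symbol ID."""
    symbols: list[str] = []
    ids: dict[str, int] = {}
    stream: list[int] = []
    for token in text.split():
        if token not in ids:
            ids[token] = len(symbols)
            symbols.append(token)
        stream.append(ids[token])
    return ToyArchive(symbols, stream)


def search(archive: ToyArchive, term: str) -> list[int]:
    """Return stream positions of `term`, comparing integer IDs only."""
    if term not in archive.symbols:
        return []  # the table alone proves the term is absent
    sid = archive.symbols.index(term)
    return [pos for pos, s in enumerate(archive.stream) if s == sid]


def replace(archive: ToyArchive, old: str, new: str) -> None:
    """Rename a symbol by rewriting its table entry; the stream is untouched.

    Assumes `new` is not already a symbol; merging two IDs is out of scope.
    """
    archive.symbols[archive.symbols.index(old)] = new


def decode(archive: ToyArchive) -> str:
    """Rebuild the text, used here only to check the result."""
    return " ".join(archive.symbols[s] for s in archive.stream)


archive = encode("Alice met Bob and Alice left")
hits = search(archive, "Alice")      # positions found without decoding any text
stream_before = list(archive.stream)
replace(archive, "Alice", "George")  # one table entry changes
```

In this model the edit touches a single table entry no matter how many times the term occurs, which is the property the paragraph above attributes to coordinate-space operations.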

The same primitives generalize: search a compressed telemetry archive for “every left turn at 30°”, search a compressed log archive for an error template, search a compressed monorepo for an AST shape. If the thing you’re looking for has a symbolic representation, you can search for it without decompressing.

Literals:

42 -- int
3.14 -- float
"hello" -- string
true -- bool
null
2026-01-23T00:00:00Z -- timestamp (RFC 3339)

Column references:

service
quality.perplexity -- dotted for nested fields
metadata["user_id"] -- bracketed for map keys
tags[0] -- bracketed for array indices
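As an illustration of how the three reference forms address a nested record, the following sketch resolves a reference against a plain Python dictionary. The `resolve` function and the sample record are hypothetical, and the grammar is inferred only from the examples above; Bindu's actual parser may accept more or less than this.

```python
import re
from typing import Any

# One path step: a (possibly dotted) name, ["key"], or [index].
_STEP = re.compile(r'\.?([A-Za-z_]\w*)|\["([^"]*)"\]|\[(\d+)\]')


def resolve(record: dict[str, Any], ref: str) -> Any:
    """Walk `ref` one step at a time, descending into `record`."""
    pos, value = 0, record
    while pos < len(ref):
        m = _STEP.match(ref, pos)
        if m is None:
            raise ValueError(f"bad column reference at offset {pos}: {ref!r}")
        name, key, index = m.groups()
        if name is not None:
            value = value[name]        # top-level column or nested field
        elif key is not None:
            value = value[key]         # map key
        else:
            value = value[int(index)]  # array index
        pos = m.end()
    return value


# Hypothetical record shaped to match the references shown above.
record = {
    "service": "checkout",
    "quality": {"perplexity": 12.5},
    "metadata": {"user_id": "u-1"},
    "tags": ["prod", "eu"],
}
```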

Operators:

== != < <= > >=
&& || !
+ - * / %
in, not in, matches (regex)

Examples:

service == "checkout"
level in ("error", "fatal")
ts >= 2026-01-01T00:00:00Z && ts < 2026-02-01T00:00:00Z
msg matches "timeout.*"
amount_cents > 10000 && service != "marketing"

--where EXPR accepts any boolean expression. Predicate pushdown applies automatically — Bindu skips files, chunks, and columns that can’t produce matches based on Bloom filters and per-chunk min/max statistics.
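The pruning described here can be sketched in a few lines. This is a simplified model, not Bindu's implementation: `Bloom`, `Chunk`, and `query` are hypothetical names, timestamps are plain integers, and the filter sizes are arbitrary. It evaluates the equivalent of `service == X && ts >= A && ts < B` and counts how many chunks actually had to be read.

```python
import hashlib
from dataclasses import dataclass, field


class Bloom:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, bits: int = 256, hashes: int = 3) -> None:
        self.bits, self.hashes, self.mask = bits, hashes, 0

    def _positions(self, item: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.mask |= 1 << p

    def might_contain(self, item: str) -> bool:
        return all((self.mask >> p) & 1 for p in self._positions(item))


@dataclass
class Chunk:
    rows: list[dict]
    min_ts: int = field(init=False)
    max_ts: int = field(init=False)
    services: Bloom = field(init=False)

    def __post_init__(self) -> None:
        # Statistics are computed once, when the chunk is written.
        self.min_ts = min(r["ts"] for r in self.rows)
        self.max_ts = max(r["ts"] for r in self.rows)
        self.services = Bloom()
        for r in self.rows:
            self.services.add(r["service"])


def query(chunks: list[Chunk], service: str, ts_from: int, ts_to: int):
    """Return (matching rows, number of chunks scanned)."""
    matches, scanned = [], 0
    for c in chunks:
        if c.max_ts < ts_from or c.min_ts >= ts_to:
            continue  # min/max proves no row falls in [ts_from, ts_to)
        if not c.services.might_contain(service):
            continue  # Bloom filter proves the service never appears
        scanned += 1
        matches += [r for r in c.rows
                    if r["service"] == service and ts_from <= r["ts"] < ts_to]
    return matches, scanned


chunks = [
    Chunk([{"ts": t, "service": "checkout"} for t in range(0, 10)]),
    Chunk([{"ts": t, "service": "checkout"} for t in range(100, 110)]),
    Chunk([{"ts": t, "service": "search"} for t in range(100, 110)]),
]
matches, scanned = query(chunks, "checkout", 100, 200)
```

The first chunk is always skipped by its min/max range. The third is skipped unless the Bloom filter yields a false positive, in which case it is scanned and contributes no rows. Pruning therefore changes only the cost of a query, never its result.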

bindu query logs.bindu --where 'service == "checkout" && level == "error"'

--select takes a comma-separated list of columns and expressions:

bindu query logs.bindu --select 'ts, service, msg'
bindu query logs.bindu --select 'service, count(*), avg(latency_ms)'

Computed columns use simple expressions and can be named with as:

bindu query logs.bindu --select 'ts, amount_cents / 100.0 as amount_usd'

Group rows with --group-by and aggregate within each group:

bindu query logs.bindu \
--select 'service, count(*), p99(latency_ms)' \
--group-by service

Supported aggregates: count, sum, avg, min, max, p50, p90, p95, p99, approx_count_distinct.
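For readers who want to check results by hand, this sketch computes what `--select 'service, count(*), p99(latency_ms)' --group-by service` asks for, over rows held in memory. It assumes the nearest-rank definition of a percentile; this page does not say which definition Bindu uses or whether its percentiles are exact or approximate. `percentile` and `group_count_p99` are hypothetical helper names.

```python
from collections import defaultdict


def percentile(values: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest value with at least p% of the
    data at or below it. `p` is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def group_count_p99(rows: list[dict], key: str, column: str) -> dict:
    """Group rows by `key`, then compute count and p99 of `column` per group."""
    groups: dict = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {
        k: {"count": len(v), "p99": percentile(v, 99)}
        for k, v in groups.items()
    }


# Hypothetical sample data: 100 checkout rows and 3 search rows.
rows = [{"service": "checkout", "latency_ms": i} for i in range(1, 101)]
rows += [{"service": "search", "latency_ms": x} for x in (5, 7, 6)]
result = group_count_p99(rows, "service", "latency_ms")
```

With small groups the nearest-rank p99 is simply the maximum, so tail percentiles only become informative once a group holds a few hundred rows.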

Sort and limit results with --order-by and --limit:

bindu query logs.bindu --order-by 'ts desc' --limit 100

Add --explain to inspect how a query would run:

bindu query logs.bindu --where '...' --explain

--explain prints the query plan, including which files and chunks would be scanned and which are pruned by indexes. Use it to check that a predicate is pushing down as expected.

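What the query language leaves out:

  • No joins. Join upstream of Bindu (e.g., in DuckDB reading .bindu via the adapter).
  • No subqueries. Run them upstream as well.
  • No window functions. Run them upstream as well.
  • No writes. bindu query is a read path. Rewrites happen through bindu compress or bindu erase, and in-place term replacement through bindu edit.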

For full SQL, use the DuckDB bridge: duckdb -c "SELECT ... FROM 'archive.bindu'".