vs gzip

gzip is the universal default: available everywhere, well understood, and good enough for most text. Here’s how Bindu differs.

Model

gzip: DEFLATE (LZ77 + Huffman) over a byte stream. 32 KB sliding window. No awareness of file structure.
Bindu: symbolic pipeline producing coordinates and deltas. The compressed form is the working representation: it can be read, searched, and edited without a decode step.
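
To make the 32 KB window concrete, the following Python sketch uses the standard-library `zlib` module (a DEFLATE implementation) to compress a random block followed by an exact copy of itself. When the copy starts within the window, DEFLATE can point back at the first occurrence; when it starts more than 32 KB later, it cannot. The function name and block sizes are illustrative choices, not part of any benchmark, and no Bindu code is shown because its API is not described here.

```python
import os
import zlib


def repeated_block_ratio(block_size: int) -> float:
    """Compress one random block followed by an exact copy; return original/compressed."""
    block = os.urandom(block_size)      # incompressible on its own
    data = block + block                # the second copy is pure redundancy
    compressed = zlib.compress(data, 9)
    return len(data) / len(compressed)


# Copy starts 16 KiB back: inside the 32 KiB window, so it can be matched.
near = repeated_block_ratio(16 * 1024)
# Copy starts 40 KiB back: outside the window, so the redundancy is invisible.
far = repeated_block_ratio(40 * 1024)

print(f"16 KiB block repeated: {near:.2f}x")
print(f"40 KiB block repeated: {far:.2f}x")
```

The first ratio should land close to 2× and the second close to 1×, though the exact figures vary with the random input.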

Ratio (measured)

From the industry benchmark (full corpus, 30 files, SHA-256 round-trip verified):

Aggregate	gzip	Bindu
All files compressed (% reduction)	64.51%	77.95%
Per-file wins	0/30	19/30
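
For readers who want to reproduce the gzip side of a measurement like this, here is a minimal sketch of a SHA-256 round-trip check combined with a percent-reduction calculation. It assumes "% reduction" means `1 - compressed/original`, which the table does not state explicitly. The helper name and the sample data are invented for illustration and are not drawn from the benchmark corpus.

```python
import gzip
import hashlib
from typing import Tuple


def gzip_round_trip(data: bytes) -> Tuple[float, bool]:
    """Return (percent size reduction, True if the decompressed bytes hash identically)."""
    compressed = gzip.compress(data, compresslevel=9)
    restored = gzip.decompress(compressed)
    same = hashlib.sha256(restored).digest() == hashlib.sha256(data).digest()
    reduction = 100.0 * (1 - len(compressed) / len(data))
    return reduction, same


# Hypothetical, highly repetitive sample; real corpora will compress far less.
sample = b"timestamp,sensor,value\n" + b"2024-01-01T00:00:00Z,a1,20.5\n" * 2000
reduction, verified = gzip_round_trip(sample)
print(f"reduction: {reduction:.2f}%  round-trip verified: {verified}")
```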

On individual workloads:

Workload	gzip -9 ratio	Bindu ratio
Silesia `webster` (text)	3.44×	5.75×
Silesia `xml`	8.07×	12.63×
Silesia `nci` (structured)	11.23×	24.79×
GOES-16 weather telemetry	12.30×	21.98×
MMS mission status flags	1,025×	263,314×
Hutter Prize enwik9 (1 GB Wikipedia)	3.10×	5.43×
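
The aggregate table reports percent reduction while the workload table reports ratios, so a small conversion helps when comparing the two. The sketch below assumes the usual definitions (ratio = original size / compressed size, reduction = 1 − 1/ratio); the source does not spell these out, and the function names are illustrative.

```python
def ratio_to_reduction(ratio: float) -> float:
    """Convert a compression ratio (original/compressed) to percent size reduction."""
    return 100.0 * (1 - 1 / ratio)


def reduction_to_ratio(percent: float) -> float:
    """Convert percent size reduction to a compression ratio (original/compressed)."""
    return 1 / (1 - percent / 100.0)


# Aggregate figures from the first table, expressed as ratios.
print(f"gzip  64.51% -> {reduction_to_ratio(64.51):.2f}x")   # 2.82x
print(f"Bindu 77.95% -> {reduction_to_ratio(77.95):.2f}x")   # 4.54x
# A workload ratio from the second table, expressed as a reduction.
print(f"webster 3.44x -> {ratio_to_reduction(3.44):.2f}%")   # 70.93%
```

Under these definitions, large ratios compress into a narrow band of percentages: 100× is already a 99% reduction, which is why the ratio form is more informative for the telemetry rows.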

Bindu wins decisively on structured and sequential data. On general-purpose mixed corpora, the gap is smaller; on already-compressed media or random bytes, both produce ~1×.

Capability differences

Capability	gzip	Bindu
Read compressed form without decode	No	Yes
Search the compressed file directly	No	Yes
Edit a region in place	No	Yes
Cross-file comparison without decompress	No	Yes
Tunable per workload	No	Yes
Format-agnostic (works everywhere)	Yes	Partial

When gzip is still the right choice

You need a format every tool understands today, with no install dependency.
You’re compressing transient transport payloads (HTTP responses) where decode latency dominates.
The data is already-compressed media, encrypted, or otherwise high-entropy: both tools will produce ~1×, so gzip’s tooling reach is the deciding factor.
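
The ~1× behaviour on high-entropy input is easy to confirm for gzip with the standard library. The sketch below uses random bytes as a stand-in for encrypted or already-compressed data; the payload size is an arbitrary choice.

```python
import gzip
import os

payload = os.urandom(1_000_000)   # stand-in for encrypted or already-compressed data
compressed = gzip.compress(payload, compresslevel=9)
ratio = len(payload) / len(compressed)
print(f"{len(payload)} -> {len(compressed)} bytes ({ratio:.3f}x)")
```

The output is typically a few bytes larger than the input, because gzip adds header and block overhead without finding any redundancy to remove.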

When to prefer Bindu

Sequential, structured, telemetry-style data — see the satellite & telemetry use case.
Workloads where the read path matters: search, query, or operate on the compressed form rather than just store and retrieve.
Long-retention archives where the storage and downstream-compute savings amortize.