vs xz / LZMA
xz (using LZMA2) optimizes for ratio above all else. It’s slow — both to compress and, notably, to decompress — but on text-like data it historically achieved among the best ratios of any widely deployed byte compressor. It’s the closest competitor for “best ratio on a general-purpose corpus.”
- xz: LZMA2. Large sliding window (up to 4 GB), range coding, heavy context modeling.
- Bindu: symbolic pipeline; coordinates and deltas rather than byte references.
Ratio (measured)
From the industry benchmark:
| Aggregate | xz | Bindu |
|---|---|---|
| All files compressed (% reduction) | 76.15% | 77.95% |
| Per-file wins | 7/30 | 19/30 |
xz is the strongest competitor on this corpus: it wins more files than any other competing codec we measured. Bindu still wins more files overall, but xz’s wins are decisive on a handful of file types: x86 binaries, certain hyperspectral cubes, and AIS records.
Selected workloads:
| Workload | xz -9e | Bindu |
|---|---|---|
| Hutter enwik9 (1 GB Wikipedia) | 4.69× | 5.43× |
| Silesia webster | 4.95× | 5.75× |
| Silesia nci | 23.15× | 24.79× |
| GOES-16 weather telemetry | 17.79× | 21.98× |
| OMNI solar wind timestamps | 3.70× | 2,349× |
| Silesia mozilla (x86 binary) | 3.83× | 2.88× |
| Silesia samba (source) | 5.78× | 5.03× |
| Silesia sao (sparse astronomical) | 1.64× | 1.44× |
The headline: on telemetry, sequential, and structured data Bindu pulls ahead, often substantially. On x86 binaries and a few specific encodings xz holds the edge.
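The aggregate table reports percent reduction, while the workload table reports a ratio (×). A minimal sketch of the conversion between the two, assuming "% reduction" means 1 − compressed/original and "×" means original/compressed. The function names are illustrative and not part of any Bindu or xz tooling:

```python
def reduction_to_ratio(percent_reduction: float) -> float:
    """Convert a size reduction in percent to a ratio (original / compressed)."""
    if not 0 <= percent_reduction < 100:
        raise ValueError("percent_reduction must be in [0, 100)")
    return 100.0 / (100.0 - percent_reduction)


def ratio_to_reduction(ratio: float) -> float:
    """Convert a ratio (original / compressed) to a size reduction in percent."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 100.0 * (1.0 - 1.0 / ratio)


if __name__ == "__main__":
    # Aggregate figures from the table above, re-expressed as ratios.
    print(f"xz    76.15% -> {reduction_to_ratio(76.15):.2f}x")   # 4.19x
    print(f"Bindu 77.95% -> {reduction_to_ratio(77.95):.2f}x")   # 4.54x
    # One workload figure re-expressed as percent reduction.
    print(f"Bindu enwik9 5.43x -> {ratio_to_reduction(5.43):.2f}%")  # 81.58%
```

The converted aggregate is only a unit change. It should not be compared directly with a single workload's ratio, because an aggregate over 30 files depends on how the files are weighted.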
xz’s speed profile is its main drawback. xz -9e compresses at ~1–2 MB/s and decompresses at ~75 MB/s on a single core. Bindu encode is in the same general range as xz; decode is faster. Crucially, for many Bindu workloads you don’t decompress at all — you operate on the compressed form, which sidesteps xz’s slow-decode penalty entirely.
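To check the xz side of these figures on your own data, a rough single-core measurement can be sketched with Python's standard `lzma` module, where `preset=9 | lzma.PRESET_EXTREME` corresponds to `xz -9e`. The helper name `measure_xz` and the synthetic sample are assumptions for illustration. This is not the harness behind the numbers above, and it does not measure Bindu:

```python
import lzma
import time


def measure_xz(data: bytes, preset: int = 9 | lzma.PRESET_EXTREME) -> dict:
    """Time one compress/decompress round trip and report ratio and MB/s."""
    t0 = time.perf_counter()
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
    t1 = time.perf_counter()
    restored = lzma.decompress(compressed)
    t2 = time.perf_counter()

    if restored != data:
        raise RuntimeError("round trip did not reproduce the input")

    megabytes = len(data) / 1e6
    return {
        "ratio": len(data) / len(compressed),
        # Guard against a zero interval on very small inputs.
        "compress_mb_s": megabytes / max(t1 - t0, 1e-9),
        "decompress_mb_s": megabytes / max(t2 - t1, 1e-9),
    }


if __name__ == "__main__":
    # Synthetic CSV-like sample; replace with a real file for meaningful numbers.
    sample = b"timestamp,value\n" + b"".join(
        b"%d,%d\n" % (1700000000 + i, i % 97) for i in range(20000)
    )
    stats = measure_xz(sample)
    print(f"ratio {stats['ratio']:.2f}x, "
          f"compress {stats['compress_mb_s']:.1f} MB/s, "
          f"decompress {stats['decompress_mb_s']:.1f} MB/s")
```

Throughput on a small in-memory sample is noisy and includes Python call overhead, so treat the output as an order-of-magnitude check rather than a benchmark. Note that preset 9 uses a 64 MiB dictionary and needs several hundred MiB of memory to encode.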
When xz wins
- One-off cold archives where neither compression time nor read time matters.
- Source tarballs and OS packages, where xz has entrenched ecosystem support.
- Workloads dominated by x86 binaries, where xz’s executable filters are tuned specifically for that case.
- General-purpose corpora where you want a single fixed codec rather than a tunable system.
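The executable filter mentioned in the list above is the x86 BCJ filter, which is applied ahead of LZMA2 in the filter chain. A minimal sketch using Python's standard `lzma` module; the helper name `xz_with_x86_filter` is hypothetical, and whether the filter helps on a given binary has to be measured:

```python
import lzma


def xz_with_x86_filter(data: bytes,
                       preset: int = 9 | lzma.PRESET_EXTREME) -> bytes:
    """Compress to .xz with the x86 BCJ filter followed by LZMA2."""
    filters = [
        # BCJ filter: rewrites relative x86 branch targets as absolute
        # addresses so that repeated call targets become repeated bytes.
        {"id": lzma.FILTER_X86},
        # LZMA2 must be the last filter in the chain.
        {"id": lzma.FILTER_LZMA2, "preset": preset},
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py /path/to/x86/binary
    with open(sys.argv[1], "rb") as fh:
        payload = fh.read()

    plain = lzma.compress(payload, preset=9 | lzma.PRESET_EXTREME)
    filtered = xz_with_x86_filter(payload)

    # The filter chain is stored in the .xz stream, so decoding needs no options.
    assert lzma.decompress(filtered) == payload
    print(f"LZMA2 only: {len(plain)} bytes, x86 + LZMA2: {len(filtered)} bytes")
```

On the command line the corresponding chain is `xz --x86 --lzma2=preset=9e`. The filter is only meaningful for x86 machine code; on other data it typically gives no benefit.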
When Bindu wins
- Sequential telemetry and satellite data (see the flagship use case).
- Workloads where you read or query the compressed form repeatedly.
- Anywhere the symbolic representation pays off downstream of compression — search, edit, cross-file comparison.