Compressing data
This page walks through the standard Bindu workflow on a recognizable file: Alice’s Adventures in Wonderland by Lewis Carroll, taken from the Project Gutenberg public-domain corpus. The same flow applies to any input — telemetry, logs, source code, scientific data — but Alice is small enough to follow along on a laptop and well-known enough to make the search/edit demo on the next page legible.
1. Install Bindu
    # macOS
    brew install bindu-labs/tap/bindu

    # Linux
    curl -fsSL https://get.bindu.dev | sh

    # Verify
    bindu --version

For other platforms see the release page.
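The later steps also call `curl`, `gzip`, `zstd`, and `xz`, so it can help to confirm up front that everything is on your PATH. The helper below is a minimal sketch, not part of Bindu: the name `missing_tools` is made up for illustration, and it relies only on the POSIX `command -v` builtin.

```shell
#!/bin/sh
# Print the name of each listed tool that is not found on PATH.
# Returns 0 when every tool is present, 1 if at least one is missing.
# Note: missing_tools is an illustrative helper, not a Bindu command.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}
```

Running `missing_tools bindu curl gzip zstd xz` lists whatever still needs installing and prints nothing when the environment is ready.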
2. Get Alice in Wonderland
    curl -O https://www.gutenberg.org/files/11/11-0.txt
    mv 11-0.txt alice.txt

    ls -lh alice.txt
    # -rw-r--r-- alice.txt 174K

3. Compress with Bindu
    bindu compress alice.txt

This produces alice.txt.bindu alongside the original. Bindu auto-detected English narrative prose and routed the input through the corresponding sub-pipeline (see Overview). Output:

    alice.txt        174 KB
    alice.txt.bindu  ~46 KB (~3.8× ratio)

The exact ratio depends on the configuration; Bindu lands in roughly the same range as the strongest classical codecs on prose-style English text.
4. Compare with gzip and zstd
    gzip --keep alice.txt
    zstd --keep alice.txt -19 -o alice.txt.zst
    xz --keep -9e alice.txt

    ls -lh alice.txt*

Approximate sizes you’ll see (they vary slightly by version):
| Codec | Size | Ratio |
|---|---|---|
| alice.txt | 174 KB | 1.0× |
| gzip -6 | ~62 KB | 2.8× |
| zstd -19 | ~52 KB | 3.3× |
| xz -9e | ~48 KB | 3.6× |
| Bindu | ~46 KB | 3.8× |
This is the “out of the box on prose” story: Bindu is competitive with — and on this file slightly ahead of — the strongest classical codecs. The bigger wins are on sequential, structured data, not prose. See the satellite & telemetry use case for the workloads where Bindu pulls decisively ahead, and the industry benchmark for the full measured picture.
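The ratio column is simply original size divided by compressed size. If you want to recompute it on your own machine, a small helper along these lines would do it. This is a sketch: `compression_ratio` is a hypothetical function name, not a Bindu command, and it uses only `wc` and `awk`.

```shell
#!/bin/sh
# Print original_size / compressed_size with one decimal place.
# $1 = original file, $2 = compressed file.
# Note: compression_ratio is an illustrative helper, not a Bindu command.
compression_ratio() {
    original_bytes=$(wc -c < "$1")
    compressed_bytes=$(wc -c < "$2")
    # Adding 0 forces numeric conversion (some wc builds pad with spaces).
    # A zero-byte compressed file makes awk exit non-zero instead of dividing.
    awk -v o="$original_bytes" -v c="$compressed_bytes" \
        'BEGIN { if (c + 0 == 0) exit 1; printf "%.1f\n", (o + 0) / (c + 0) }'
}
```

For example, `compression_ratio alice.txt alice.txt.gz` prints the gzip ratio, which you can check against the table above. The same call works for the `.zst`, `.xz`, and `.bindu` outputs.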
5. Decompress to verify
    bindu decompress alice.txt.bindu --output alice-roundtrip.txt
    diff alice.txt alice-roundtrip.txt
    # (no output: files are identical)

    shasum alice.txt alice-roundtrip.txt
    # matching SHA-1 hashes

Bindu is lossless. The decompressed file is byte-identical to the original.
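For scripted checks, or for binary inputs such as telemetry where a line-oriented `diff` is less natural, a byte-level comparison with `cmp` may be a better fit. The sketch below wraps it in a function; `verify_roundtrip` is a hypothetical name chosen for this example, and it assumes the round-trip file has already been produced by the decompress step above.

```shell
#!/bin/sh
# Succeed only if the two files are byte-for-byte identical.
# $1 = original file, $2 = decompressed (round-trip) file.
# `cmp -s` prints nothing and reports the result through its exit status.
# Note: verify_roundtrip is an illustrative helper, not a Bindu command.
verify_roundtrip() {
    if cmp -s "$1" "$2"; then
        echo "OK: $2 matches $1"
    else
        echo "MISMATCH: $2 differs from $1" >&2
        return 1
    fi
}
```

After the decompress step, `verify_roundtrip alice.txt alice-roundtrip.txt` returns a non-zero status on any difference, so it can gate a CI job directly.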
What you didn’t have to do
You didn’t supply a schema. You didn’t pick a sub-pipeline. You didn’t specify a level. The selector at the front of the pipeline routed the input through the right path automatically. For tightly tuned deployments, such as a satellite that only ever produces one shape of data, you can strip out the unused sub-pipelines to get a much smaller compressor; for general use, the defaults are the right starting point.
What’s next
The compression part is the conventional half of Bindu. The unconventional half is what you can do with the compressed file without ever decompressing it. That’s covered in Operating on compressed data.