peel
Streaming, resumable, space-efficient extractor for compressed archives over HTTP, and for local archive files on disk.
peel https://example.com/dataset.tar.zst
peel is a Rust CLI that downloads, decompresses, and extracts an archive in
a single pass. It resumes exactly where it left off after a dropped
connection, kill -9, OOM kill, or power loss. The compressed bytes never
fully land on disk: as the decoder consumes the prefix, the download buffer
underneath is hole-punched out. The archive and the extracted tree never
coexist at full size.
What it solves
- Disk pressure. Pulling a 40 GB
.tar.zstshould not require 80 GB free. Peak disk usage is roughlyextracted_size + a few hundred MB, notcompressed_size + extracted_size. - Flaky networks. A dropped connection mid-download is the default case,
not the edge case.
peelresumes at the byte that was in flight. kill -9and pod restarts. Frame-aligned checkpoints (atomicwrite+fsync+rename) plus per-chunk fingerprints ensure a hard kill mid-extraction resumes exactly where it left off, byte-identical to a clean run.- Streaming
.zip,.7z,.rarover HTTP.curl | unzipdoes not work: the ZIP central directory lives at the end of the file, the 7z trailer pointer sits at the end of the file, andunrarrequireslseekon its input.peelissues a ranged GET for the central directory or trailer first (zip, 7z), or walks the RAR header chain in stream order (rar), then streams entries to disk as soon as their bytes arrive.
Format coverage at a glance
| Family | Formats |
|---|---|
| Plain | .tar |
| Streaming codecs | .zst / .tar.zst · .xz / .tar.xz · .lz4 / .tar.lz4 · .gz / .tar.gz |
| Random-access archives | .zip · .7z · .rar (RAR5 + legacy RAR3/RAR4) |
Encrypted archives are supported for zip (WinZip-AES, ZipCrypto), 7z (AES-256-CBC), and rar5 (AES-256-CBC, both archive-header and per-file). See Encrypted archives.
The full per-format matrix (magic-byte detection, resume granularity, encryption) is on the Supported formats page.
Distinguishing features
-
Hole-punched compressed buffer. Parallel ranged HTTP downloads feed a sparse part-file. The decoder consumes the prefix while workers continue to fetch the suffix, and finished bytes are released back to the filesystem as the decoder advances. Peak compressed-side disk usage is the download window (approximately
--max-disk-buffer), not the archive size. -
Frame-aligned, byte-identical resume. A
kill -9anywhere in the pipeline leaves a.peel.ckptnext to the part file. Re-running the same command picks up exactly at the checkpointed frame. The final output is byte-identical to a clean run. The crash-test harness runs 100 random kill points per format and asserts that property every time. -
One command for HTTP and local. A URL argument triggers parallel ranged GETs and streaming extract. A local file argument runs the same hand-rolled decoders against the file on disk: non-destructive by default, with hole-punching enabled via
-d.
Where to next
- Getting started: Installation and Quick start.
- Full flag listing: CLI reference.
- Specific features: Multi-volume archives, Encrypted archives, Performance and tuning, Checkpoint and resume.
- Pipeline integration: the worked examples cover Kubernetes init containers, CI runners, and an Arbitrum snapshot bundle.