CI runner with flaky network
CI runners typically have small ephemeral disks, flaky outbound network (especially in self-hosted runners behind corporate proxies), and a strong preference for fail-fast. A job that silently produces a different artifact is worse than a job that errors clearly.
peel addresses all three:
- Bounded compressed-side disk usage via the sliding lookahead window.
- Resume on transient network failure without losing the partial download.
- --strict-format and --sha256 turn upstream drift into a clear exit code 1 rather than a degraded build.
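To make the last point concrete, the check that --sha256 implies can be approximated by hand. This is a minimal sketch, assuming GNU coreutils' sha256sum is available on the runner. verify_sha256 is a hypothetical helper name, and nothing here describes how peel computes or compares the digest internally.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of peel): verify_sha256 FILE EXPECTED_HEX
# Returns 0 when the file's SHA-256 digest equals EXPECTED_HEX, 1 otherwise.
verify_sha256() {
  local file=$1 expected actual
  # Normalise the expected digest to lowercase; sha256sum prints lowercase hex.
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  # sha256sum prints "<hex>  <filename>"; keep only the hex digest.
  actual=$(sha256sum "$file" 2>/dev/null | cut -d' ' -f1)
  if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
    return 0
  fi
  echo "sha256 mismatch for $file: expected $expected, got ${actual:-<none>}" >&2
  return 1
}
```

A mismatch produces a non-zero status and a message on stderr, which is the fail-fast behaviour CI wants: the job stops instead of continuing with an unexpected artifact.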
GitHub Actions example
name: ml-test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install peel
        run: cargo install peel-rs --locked
      - name: Hydrate model fixtures
        run: |
          peel \
            https://fixtures.example.com/models-v3.tar.zst \
            --sha256 ${{ vars.MODELS_SHA256 }} \
            --strict-format \
            --max-disk-buffer 512MiB \
            -o ./fixtures/
      - name: Run tests
        run: cargo test --release
Flag behavior:
- --sha256 ${{ vars.MODELS_SHA256 }}: the expected hash lives in the repo's Actions variables, so a wrong-fixture upload fails CI immediately. The hash is tracked alongside the repository (in its settings rather than in the tree) and ratchets forward as the team uploads new fixtures and updates the variable.
- --strict-format: if the upstream URL ever serves a different shape (e.g. a 404 HTML page with a 200 status code from a misbehaving proxy), the run fails clearly instead of producing a corrupt fixtures directory.
- --max-disk-buffer 512MiB: GitHub-hosted runners have ~14 GB free. Capping the lookahead avoids transient disk pressure during hydration.
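One failure mode worth guarding against: an Actions variable that was never defined expands to an empty string, which leaves --sha256 without a value. How peel reacts to that is not specified here, so a pre-flight check along the following lines may help. is_sha256_hex is a hypothetical helper, not part of peel.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: succeeds only for exactly 64 hex characters.
is_sha256_hex() {
  local LC_ALL=C   # make the bracket ranges below byte-oriented
  case "$1" in
    *[!0-9a-fA-F]*) return 1 ;;  # contains a non-hex character
  esac
  [ "${#1}" -eq 64 ]             # an empty or truncated value fails here
}

# Possible use in a workflow step, before invoking peel:
#   is_sha256_hex "$MODELS_SHA256" || { echo "MODELS_SHA256 is unset or malformed" >&2; exit 1; }
```

Failing before the download starts keeps the error message about the variable itself rather than about whatever argument happens to follow it on the command line.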
GitLab CI example
test:
  stage: test
  image: rust:1.93
  before_script:
    - cargo install peel-rs --locked
  script:
    - >
      peel
      "$FIXTURE_URL"
      --sha256 "$FIXTURE_SHA256"
      --strict-format
      --max-disk-buffer 256MiB
      -o ./fixtures/
    - cargo test --release
  variables:
    FIXTURE_URL: https://fixtures.example.com/models-v3.tar.zst
    FIXTURE_SHA256: ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
  cache:
    # GitLab saves the cache only on success by default; keep the sidecars
    # from failed jobs too, since those are the runs worth resuming.
    when: always
    paths:
      - fixtures.peel.part
      - fixtures.peel.ckpt
The cache directive retains fixtures.peel.part and
fixtures.peel.ckpt between runs. If a previous run was interrupted
partway through (timeout, runner restart), the next run resumes from
the checkpoint, saving network bandwidth and wall-clock time on every retry.
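When debugging whether the cache actually restored the sidecars, it can help to log their presence before invoking peel. The following is a small sketch under the assumption that the sidecars sit in the working directory under the names listed in the cache paths. resume_state is a hypothetical helper, and treating "both files present" as resumable is an assumption of this sketch rather than documented peel behaviour.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints "resume" when both sidecar files exist in DIR
# (default: current directory), otherwise "fresh".
resume_state() {
  local dir=${1:-.}
  if [ -f "$dir/fixtures.peel.part" ] && [ -f "$dir/fixtures.peel.ckpt" ]; then
    echo resume
  else
    echo fresh
  fi
}

# Possible use at the top of the job script:
#   echo "hydration mode: $(resume_state)"
```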
Self-hosted runner behind a corporate proxy
A frequent CI failure mode is a self-hosted runner behind an HTTPS
proxy that does TLS termination with its own CA. peel honours
SSL_CERT_FILE:
- name: Hydrate fixtures
  env:
    HTTPS_PROXY: https://proxy.corp.example.com:8443
    SSL_CERT_FILE: /etc/ssl/certs/corp-bundle.pem
  run: peel "$FIXTURE_URL" --sha256 "$FIXTURE_SHA256" -o ./fixtures/
If the proxy mangles HTTP/2 (the most common cause of intermittent hydration failures on locked-down corporate networks), force HTTP/1.1:
run: peel "$FIXTURE_URL" --http-version h1 -o ./fixtures/
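If forcing HTTP/1.1 everywhere is undesirable, one option is to try the default protocol first and fall back only on failure. This is a sketch, not a recommendation from the peel project: hydrate_with_h1_fallback is a hypothetical wrapper, and it uses only the -o and --http-version h1 flags shown above, passing any other arguments through unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: hydrate_with_h1_fallback URL OUT_DIR [extra peel args...]
# Tries peel with its default protocol, then retries once with HTTP/1.1 forced.
hydrate_with_h1_fallback() {
  local url=$1 out=$2
  shift 2
  if peel "$url" "$@" -o "$out"; then
    return 0
  fi
  echo "default attempt failed; retrying with --http-version h1" >&2
  # The function's exit status is the status of this second attempt.
  peel "$url" "$@" --http-version h1 -o "$out"
}

# Possible use:
#   hydrate_with_h1_fallback "$FIXTURE_URL" ./fixtures/ --sha256 "$FIXTURE_SHA256"
```

The fallback fires on any failure, including ones unrelated to the proxy, so the second attempt costs time in those cases but does not hide the error: its exit status is what the caller sees.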
Caching the extracted output
If the CI cache supports it, cache the extracted output directly rather than only the sidecars:
- uses: actions/cache@v4
  id: cache  # required so the next step can read steps.cache.outputs
  with:
    path: ./fixtures/
    key: fixtures-${{ vars.MODELS_SHA256 }}
- if: steps.cache.outputs.cache-hit != 'true'
  run: peel "$URL" --sha256 "$SHA256" -o ./fixtures/
Hydration runs only when the cache misses. The cache key includes the SHA-256, so an updated fixture set automatically invalidates the cache.
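On a CI system without keyed caches, a similar effect can be approximated with a stamp file inside a persistent fixtures directory. The sketch below assumes such a directory survives between runs. The helper names and the .fixtures-sha256 file name are hypothetical and unrelated to anything peel itself writes.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a hash-keyed "skip if already hydrated" check.

# needs_hydration DIR EXPECTED_SHA
# Returns 0 (true) unless DIR holds a stamp recording exactly EXPECTED_SHA.
needs_hydration() {
  local stamp="$1/.fixtures-sha256"
  if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$2" ]; then
    return 1
  fi
  return 0
}

# record_hydration DIR SHA
# Writes the stamp; call it only after peel has exited successfully.
record_hydration() {
  mkdir -p "$1" && printf '%s\n' "$2" > "$1/.fixtures-sha256"
}

# Possible use:
#   if needs_hydration ./fixtures "$SHA256"; then
#     peel "$URL" --sha256 "$SHA256" -o ./fixtures/ && record_hydration ./fixtures "$SHA256"
#   fi
```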
Failing the build on upstream drift
The combination of --sha256 + --strict-format is the strongest
guarantee:
| Failure | --sha256 catches? | --strict-format catches? |
|---|---|---|
| Upstream re-uploaded a corrupted file | ✓ | |
| Upstream serves a 404 HTML body with a 200 status | ✓ | ✓ |
| Upstream changed the format (.tar.zst → .tar.gz) | ✓ | ✓ |
| Upstream re-uploaded a legitimately-different file | ✓ | |
| Mirror is serving stale content | ✓ | |
Use both in CI. Omit them only when the resource is intentionally non-deterministic.
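To illustrate the kind of shape check that distinguishes an archive from an HTML error page, the sketch below inspects the first four bytes of a file for the Zstandard frame magic (28 b5 2f fd). It is an illustration only: looks_like_zstd is a hypothetical helper, it ignores skippable frames, and it does not describe how --strict-format is implemented.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when FILE begins with the Zstandard
# frame magic number 0xFD2FB528 (bytes 28 b5 2f fd on disk).
looks_like_zstd() {
  local magic
  # od prints the bytes as space-separated hex; strip spaces and newlines.
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "28b52ffd" ]
}

# Possible use after a plain download:
#   looks_like_zstd models-v3.tar.zst || { echo "not a zstd stream" >&2; exit 1; }
```

A hash check would also reject the HTML page, but a shape check yields a more specific diagnosis: the server returned the wrong kind of content, not merely different bytes.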
Comparison with actions/cache
If the CI has a well-managed artifact cache (sized, verified,
mirrored), and the archive is small enough that download time is not
a concern, actions/cache (or actions/cache/restore, or the CI's
equivalent) is simpler. peel is preferable when:
- The archive is large enough that hydration time matters.
- End-to-end verification of the source is required, not just the cache.
- The CI's cache TTL is shorter than the fixture's lifetime, so cache misses force a re-hydration where bounded disk and resume matter.
- Integration is from outside the CI (e.g. a pre-job step in a test orchestrator that lacks CI-native caching).
Exit code handling
CI scripts want to distinguish "fixture hydration failed transiently" from "fixture is wrong":
#!/usr/bin/env bash
set -u
peel "$URL" --sha256 "$SHA256" --strict-format -o ./fixtures/
rc=$?
case "$rc" in
  0) echo "fixtures ready"; exit 0 ;;
  1)
    # Generic failure: could be transient network, disk full, hash mismatch.
    # Check stderr to distinguish. For CI, retry once.
    echo "first attempt failed; sleeping 10s then retry"
    sleep 10
    peel "$URL" --sha256 "$SHA256" --strict-format -o ./fixtures/
    exit $?  # propagate the retry's status explicitly
    ;;
  *)
    echo "peel failed with $rc; not retrying"
    exit "$rc"
    ;;
esac
See Exit codes for the full list.
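The retry-once pattern generalises to a small helper. The following sketch assumes, as the script above does, that exit code 1 is the only status worth retrying. retry is a hypothetical function name, and the attempt count and delay are illustrative parameters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS CMD [ARGS...]
# Re-runs CMD while it exits with status 1, up to MAX_ATTEMPTS in total.
# Any other non-zero status is returned immediately without retrying.
retry() {
  local max=$1 delay=$2 attempt=1 rc
  shift 2
  while :; do
    if "$@"; then
      return 0
    else
      rc=$?
    fi
    if [ "$rc" -ne 1 ] || [ "$attempt" -ge "$max" ]; then
      return "$rc"
    fi
    echo "attempt $attempt failed with $rc; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Possible use:
#   retry 2 10 peel "$URL" --sha256 "$SHA256" --strict-format -o ./fixtures/
```

Because a hash mismatch also surfaces as exit code 1 in the script above, a retry cannot fix it. Keeping the attempt count low bounds the time spent before the job fails.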