feat: chunked image push via OCI layout#2760

Draft
markphelps wants to merge 13 commits into main from mphelps/chunked-image-push

markphelps commented Feb 23, 2026

Summary

Replace Docker's monolithic ImagePush with chunked layer uploads, gated behind COG_PUSH_OCI=1. The image is exported from Docker via ImageSave, converted to an OCI layout using go-containerregistry, then each layer is pushed through the existing RegistryClient.WriteLayer chunked upload path — the same infrastructure weight artifacts already use. This bypasses the ~500MB request body limit that blocks Docker's native push for large layers.

Off by default. Set COG_PUSH_OCI=1 to enable. The OCI path falls back to Docker push on any error other than context cancellation or timeout, so there's zero regression risk.

What changed

Chunked image push

  • ImagePusher tries OCI chunked push first when COG_PUSH_OCI=1, falls back to Docker push
  • Exports image via docker.ImageSave() → tar → OCI layout → concurrent layer upload
  • shouldFallbackToDocker() only blocks fallback on context cancellation/timeout — no string-based error matching
  • Added ImageSave(ctx, imageRef) to the Docker Command interface

Unified push infrastructure

  • Single PushProgress type replaces separate image/weight progress types
  • writeLayerWithProgress() helper deduplicates progress channel boilerplate
  • GetPushConcurrency() (default 4, COG_PUSH_CONCURRENCY) shared by image layers, weight pushes, and CLI progress
  • BundlePusher.pushWeights() now has a concurrency limit (was unlimited)

Progress rendering

  • Replaced ~200 lines of custom ANSI escape rendering in pkg/cli/weights.go with the mpb library (already a dependency, was unused)
  • Handles TTY detection, cursor management, multi-bar rendering, and size formatting

Registry config

  • Extracted getChunkSize() and getMultipartThreshold() env var helpers into pkg/registry/config.go

Cleanup

  • Deleted pkg/oci/ package — OCI layout logic inlined into ImagePusher
  • Deleted tools/uploader/ — unused S3 multipart uploader (363 lines, zero imports)
  • Removed Pusher interface, OCIImagePusher, pushImageWithFallback() — consolidated into single ImagePusher
  • Unexported DefaultFactory → defaultFactory, NewImagePusher → newImagePusher

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| COG_PUSH_OCI | unset (off) | Set to 1 to enable OCI chunked image push |
| COG_PUSH_CONCURRENCY | 4 | Max concurrent layer/weight uploads |
| COG_PUSH_CHUNK_SIZE | 268435456 (256 MB) | Chunk size in bytes for multipart uploads |
| COG_PUSH_MULTIPART_THRESHOLD | 52428800 (50 MB) | Blobs above this size (bytes) use chunked upload |

Replace Docker's monolithic ImagePush with a chunked push path for
container image layers. Images are exported from the Docker daemon to
OCI layout via ImageSave, then pushed through the registry client's
existing chunked upload infrastructure (WriteLayer with 256MB chunks).

This bypasses the ~500MB Cloudflare Workers request body limit that
blocks Docker's native push for large layers.

Key changes:
- Add OCIImagePusher to pkg/registry/ with concurrent layer uploads
- Export images from Docker daemon to OCI layout via ImageSave + tarball
- Integrate into Resolver.Push and BundlePusher with Docker push fallback
- Add ImageSave method to command.Command interface
- Delete unused tools/uploader/ S3 multipart code (363 lines)

…pkg/model

Move OCI layout utilities to pkg/oci/, extract registry transport config
(chunk size, multipart threshold env vars) to pkg/registry/config.go, and
relocate OCIImagePusher to pkg/model/ alongside ImagePusher and WeightPusher.

- pkg/oci/: pure OCI format utilities (Docker tar <-> OCI layout), no registry deps
- pkg/registry/config.go: configurable chunk size and multipart threshold
- pkg/model/oci_image_pusher.go: push orchestration with shared pushImageWithFallback()
- Deduplicate fallback logic between resolver.go and pusher.go
- Add error discrimination: no fallback on auth errors or context cancellation
- Create OCIImagePusher once in NewResolver, not per-push call

…weight pushers

- Unify ImagePushProgress and WeightPushProgress into shared PushProgress type
- Extract writeLayerWithProgress() helper to deduplicate progress channel
  boilerplate between OCIImagePusher and WeightPusher
- Unify push concurrency: both image layer pushes and weight pushes use
  GetPushConcurrency() (default 4, overridable via COG_PUSH_CONCURRENCY)
- Fix BundlePusher.pushWeights() which had no concurrency limit (launched
  all goroutines at once); now uses errgroup.SetLimit
- Implement auth error detection in shouldFallbackToDocker() to match its
  documented behavior (don't fall back on UNAUTHORIZED/DENIED errors)

String-based error detection is fragile. Fall back to Docker push on any
error except context cancellation/timeout.

Replace ~200 lines of custom ANSI escape progress rendering with the mpb
(multi-progress-bar) library, which was already a dependency but unused.

mpb handles TTY detection, cursor management, concurrent bar updates, and
size formatting natively. Retry status is shown via a dynamic decorator.

…etTotal completion

When bars are created with total > 0, mpb sets triggerComplete=true
internally. This causes SetTotal(n, true) to early-return without
triggering completion, so bars never finish and p.Wait() deadlocks.

Creating bars with total=0 leaves triggerComplete=false, allowing
explicit completion via SetTotal(current, true) after push finishes.
The real total is still set dynamically via ProgressFn callbacks.

markphelps force-pushed the mphelps/chunked-image-push branch from 946bbd5 to 8b01c53 on February 23, 2026 20:43

Merge OCIImagePusher (OCI chunked push) and the old ImagePusher (Docker
push) into a single ImagePusher type that tries OCI first and falls back
to Docker push on non-fatal errors.

- ImagePusher.Push() handles OCI→Docker fallback internally
- Delete OCIImagePusher type and oci_image_pusher.go
- BundlePusher takes *ImagePusher directly instead of separate oci/docker pushers
- Resolver stores single imagePusher field instead of ociPusher
- Remove dead Pusher interface
- Consolidate tests into image_pusher_test.go

markphelps force-pushed the mphelps/chunked-image-push branch from 8b01c53 to 88ff363 on February 23, 2026 20:45
markphelps and others added 4 commits on February 23, 2026 16:03

ImagePusher now calls p.docker.ImageSave() directly instead of going
through the oci.ImageSaveFunc indirection. The OCI layout export logic
is inlined into ImagePusher.ociPush(). The pkg/oci package is deleted
entirely since it had no other consumers.

OCI push is now opt-in rather than always-on when a registry client
is present. Requires COG_PUSH_OCI=1 to activate.

Signed-off-by: Mark Phelps <mphelps@cloudflare.com>