Cached Assets
Container-magic can download external resources (files, models, datasets) and cache them locally to avoid re-downloading on every build. Assets are defined at the root level of cm.yaml and copied into the image using copy: steps.
Use cases:
- Machine learning models from HuggingFace or other sources
- Large datasets
- Pre-compiled binaries or libraries
- Configuration files from remote sources
Configuration
Define assets at the root level of your cm.yaml:
names:
  image: my-project
  user: root
assets:
  - https://example.com/model.tar.gz
  - my-model.bin: https://huggingface.co/bert-base/resolve/main/model.safetensors
Each asset can be either:
- A bare URL - the filename is derived from the URL path
- A filename: url mapping - you choose the local filename
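The two entry forms above can be reduced to a single (filename, url) pair. The following is a minimal sketch, not container-magic's actual code: `normalize_asset` is a hypothetical helper, and it assumes that "derived from the URL path" means the last path segment, which matches the model.tar.gz example.

```python
from posixpath import basename
from urllib.parse import urlparse


def normalize_asset(entry):
    """Return (filename, url) for one entry of the assets: list."""
    if isinstance(entry, str):
        # Bare URL. Assumption: the filename is the last segment of the URL path.
        return basename(urlparse(entry).path), entry
    if isinstance(entry, dict) and len(entry) == 1:
        # filename: url mapping. The key is the chosen local filename.
        filename, url = next(iter(entry.items()))
        return filename, url
    raise ValueError(f"unsupported asset entry: {entry!r}")


# The assets: list from the example, as it would look after YAML parsing.
assets = [
    "https://example.com/model.tar.gz",
    {"my-model.bin": "https://huggingface.co/bert-base/resolve/main/model.safetensors"},
]
resolved = [normalize_asset(a) for a in assets]
```

The resulting filenames (model.tar.gz and my-model.bin) are the names that copy: steps refer to.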
Then use copy: steps to place them in the image:
stages:
  base:
    from: python:3-slim
    steps:
      - copy: model.tar.gz /models/model.tar.gz
      - copy: my-model.bin /models/bert.safetensors
How It Works
- Run cm build - assets are downloaded (if not cached)
- Files are cached in .cm-cache/assets/<hash>/ with metadata
- Use copy: steps to place cached files into the image
- Subsequent builds reuse cached files, skipping downloads
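The download-if-missing behaviour described above might look roughly like the sketch below. Several details are assumptions made for illustration and are not taken from container-magic: the `<hash>` is taken to be a SHA-256 of the URL, the metadata file is called meta.json, and `cached_asset` and `fake_download` are hypothetical names.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cached_asset(url, filename, cache_root, download):
    """Return the cached file path, calling download(url, dest) only on a miss."""
    # Assumption: the <hash> directory is keyed by a SHA-256 of the URL.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    asset_dir = Path(cache_root) / "assets" / digest
    target = asset_dir / filename
    meta = asset_dir / "meta.json"  # hypothetical metadata file name
    if target.exists() and meta.exists():
        return target  # cache hit: skip the download
    asset_dir.mkdir(parents=True, exist_ok=True)
    download(url, target)
    meta.write_text(json.dumps({"url": url, "filename": filename}))
    return target


# Demo with a fake downloader so the sketch runs offline.
calls = []


def fake_download(url, dest):
    calls.append(url)
    Path(dest).write_bytes(b"model bytes")


with tempfile.TemporaryDirectory() as tmp:
    url = "https://example.com/model.tar.gz"
    first = cached_asset(url, "model.tar.gz", tmp, fake_download)
    second = cached_asset(url, "model.tar.gz", tmp, fake_download)
    same_path = first == second
```

In the demo the second call finds the file and its metadata already in place, so the downloader runs only once. A real downloader such as `urllib.request.urlretrieve` could be passed in place of `fake_download`.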
Cache Management
cm cache list # List cached assets with size and URL
cm cache path # Show cache directory location
cm cache clear # Clear all cached assets
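To inspect the cache without the CLI, the directory can be walked directly. This is a rough sketch that assumes only the .cm-cache/assets/<hash>/ layout mentioned earlier; `cache_usage` is a hypothetical helper and does not reproduce the output of cm cache list.

```python
import tempfile
from pathlib import Path


def cache_usage(cache_root):
    """Map each <hash> directory name to the total size in bytes of its files."""
    assets_dir = Path(cache_root) / "assets"
    usage = {}
    if not assets_dir.is_dir():
        return usage  # nothing cached yet
    for entry in sorted(assets_dir.iterdir()):
        if entry.is_dir():
            usage[entry.name] = sum(
                f.stat().st_size for f in entry.rglob("*") if f.is_file()
            )
    return usage


# Demo on a throwaway directory that mimics the cache layout.
with tempfile.TemporaryDirectory() as tmp:
    asset_dir = Path(tmp) / "assets" / "abc123"
    asset_dir.mkdir(parents=True)
    (asset_dir / "model.bin").write_bytes(b"x" * 10)
    demo_usage = cache_usage(tmp)
    empty_usage = cache_usage(Path(tmp) / "missing")
```

Against a real project, the call would be `cache_usage(".cm-cache")`, or whatever location cm cache path reports.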
Example: ML Model in Production Image
names:
  image: ml-service
  user: nonroot
assets:
  - model.bin: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/pytorch_model.bin
stages:
  base:
    from: pytorch/pytorch:latest
    steps:
      - pip:
          install:
            - transformers
            - flask
      - copy: model.bin /models/model.bin
  development:
    from: base
  production:
    from: base
    steps:
      - copy: workspace
Multiple Assets
names:
  image: ml-pipeline
  user: nonroot
assets:
  - tokenizer.json: https://example.com/tokenizer.json
  - model.safetensors: https://example.com/model.safetensors
  - config.json: https://example.com/config.json
stages:
  base:
    from: pytorch/pytorch:latest
    steps:
      - copy:
          - tokenizer.json /models/tokenizer.json
          - model.safetensors /models/model.safetensors
          - config.json /models/config.json
  development:
    from: base
  production:
    from: base
The copy: step accepts a list to copy multiple files. Each item follows the same source dest format.
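Both shapes of the copy: value, a single string or a list of strings, can be treated uniformly. The sketch below uses a hypothetical `copy_pairs` helper and covers only the two-token source dest form shown here. It assumes paths contain no spaces and does not handle single-token values such as copy: workspace.

```python
def copy_pairs(step_value):
    """Normalize a copy: value (a string or a list of strings) to (source, dest) pairs."""
    items = [step_value] if isinstance(step_value, str) else list(step_value)
    pairs = []
    for item in items:
        parts = item.split()
        if len(parts) != 2:
            # Only the "source dest" form is covered by this sketch.
            raise ValueError(f"expected 'source dest', got: {item!r}")
        pairs.append((parts[0], parts[1]))
    return pairs


single = copy_pairs("model.bin /models/model.bin")
multiple = copy_pairs([
    "tokenizer.json /models/tokenizer.json",
    "model.safetensors /models/model.safetensors",
    "config.json /models/config.json",
])
```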