oras-py
Data loss: oras-py strips paths to basename → filename collisions on pull
Summary
oras-py sets layer titles via os.path.basename(...), unlike ORAS CLI which
preserves relative paths (org.opencontainers.image.title). Files sharing a
basename (e.g., src/model.py & lib/model.py) collide on pull → one
overwrites the other.
Code pointer
The issue in code is in provider.py:
blob_name = os.path.basename(blob) # drops directories
layer["annotations"] = {oras.defaults.annotation_title: blob_name.strip(os.sep)}
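The effect of those two lines can be seen in isolation with a minimal sketch. Here `title_for` is a hypothetical helper that mirrors the quoted logic; it is not part of the oras-py API.

```python
import os


def title_for(blob: str) -> str:
    """Hypothetical stand-in for the title logic quoted above."""
    # basename() discards every directory component of the path
    return os.path.basename(blob).strip(os.sep)


files = ["src/model.py", "lib/model.py"]
titles = [title_for(f) for f in files]

# Two distinct input paths end up with the same title
print(titles)
```

Because both layers carry the same title, a pull that writes each layer to its title has only one destination path for two files.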
Reproducible Example
This shows the behavior from the ORAS command line tool:
docker run -d -p 5555:5000 --name registry registry:2
mkdir -p src lib && echo src > src/model.py && echo lib > lib/model.py
oras push localhost:5555/cli-ok:v1 src/model.py lib/model.py # titles keep paths
But the Python client overwrites:
from oras.client import OrasClient
c = OrasClient(hostname="localhost:5555")
c.push(files=["src/model.py","lib/model.py"], target_ref="py-bug:v1") # titles = "model.py"
files = c.pull("localhost:5555/py-bug:v1") # only one model.py remains
Expected
- Titles: src/model.py, lib/model.py; pull restores both.
Actual
- Titles: model.py, model.py; pull keeps one (data loss).
Impact
High: silent overwrite when basenames repeat.
Workaround
Use annotation_file to set full paths in org.opencontainers.image.title.
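A minimal sketch of generating such a file, assuming the ORAS-style annotation file layout in which each top-level key is a file path mapped to a dictionary of annotations. The helper name `build_title_annotations` and the file name `annotations.json` are illustrative, and whether the keys must match the paths exactly as passed to push should be checked against your oras-py version.

```python
import json
import os
import tempfile

TITLE = "org.opencontainers.image.title"


def build_title_annotations(files):
    """Hypothetical helper: map each file to a title holding its full relative path."""
    return {path: {TITLE: path} for path in files}


files = ["src/model.py", "lib/model.py"]
annotations = build_title_annotations(files)

# Write the mapping to a JSON file that can be supplied as the annotation file
out_path = os.path.join(tempfile.mkdtemp(), "annotations.json")
with open(out_path, "w") as fh:
    json.dump(annotations, fh, indent=2)

print(out_path)
```

The resulting path would then be supplied through the annotation_file option when pushing.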
Proposed fix
- Preserve relative paths by default (CLI parity); respect annotation_file verbatim.
- Optionally a preserve_paths=False escape hatch.
- Add tests; warn/error on basename collisions without annotations.
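The collision check could look roughly like the sketch below. The function name `find_basename_collisions` is hypothetical and the code is not taken from oras-py; it only illustrates grouping the input files by basename before pushing.

```python
import os
from collections import defaultdict


def find_basename_collisions(files):
    """Return {basename: [paths]} for every basename shared by more than one file."""
    by_name = defaultdict(list)
    for path in files:
        by_name[os.path.basename(path)].append(path)
    # Keep only the basenames that more than one input path maps to
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


collisions = find_basename_collisions(["src/model.py", "lib/model.py", "README.md"])
for name, paths in collisions.items():
    print(f"basename {name!r} is shared by: {', '.join(paths)}")
```

A client could raise an error or emit a warning when this mapping is non-empty and no annotation file supplies distinct titles.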
Env
- oras-py: 0.2.37
- Python: 3.11.11
- OS: macOS 15.6
- Registry: registry:2
If you'd like to contribute the fix, we'd love to have it!