
Data loss: oras-py strips paths to basename → filename collisions on pull

Open: vsbuffalo opened this issue 3 months ago • 1 comment

Summary

oras-py sets each layer's title annotation (org.opencontainers.image.title) to os.path.basename(...), whereas the ORAS CLI preserves the relative path. Files that share a basename (e.g., src/model.py and lib/model.py) therefore collide on pull, and one silently overwrites the other.

Code pointer

The relevant code is in provider.py:

blob_name = os.path.basename(blob)  # drops directories
layer["annotations"] = {oras.defaults.annotation_title: blob_name.strip(os.sep)}
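To make the collision concrete, here is a minimal standalone sketch that applies the same os.path.basename transformation to the two paths from the example. It does not import oras-py, and `titles_by_basename` is a hypothetical helper written only for illustration.

```python
import os
from collections import defaultdict


def titles_by_basename(paths):
    """Group input paths by the title that os.path.basename would produce."""
    groups = defaultdict(list)
    for path in paths:
        # Same transformation as in the snippet above: directories are dropped.
        groups[os.path.basename(path)].append(path)
    return dict(groups)


groups = titles_by_basename(["src/model.py", "lib/model.py"])
print(groups)  # {'model.py': ['src/model.py', 'lib/model.py']}
```

Both inputs end up under the single title model.py, so only one file can exist under that name after a pull.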

Reproducible Example

This shows the behavior from the ORAS command line tool:

docker run -d -p 5555:5000 --name registry registry:2
mkdir -p src lib && echo src > src/model.py && echo lib > lib/model.py
oras push localhost:5555/cli-ok:v1 src/model.py lib/model.py   # titles keep paths

But the Python client overwrites:

from oras.client import OrasClient
c = OrasClient(hostname="localhost:5555")
c.push(files=["src/model.py","lib/model.py"], target_ref="py-bug:v1")  # titles = "model.py"
files = c.pull("localhost:5555/py-bug:v1")  # only one model.py remains

Expected

  • Titles: src/model.py, lib/model.py; pull restores both.

Actual

  • Titles: model.py, model.py; pull keeps one (data loss).

Impact

High: silent overwrite when basenames repeat.

Workaround

Use annotation_file to set full paths in org.opencontainers.image.title.
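A rough sketch of generating such an annotation file follows. It assumes the file is JSON keyed by each file path, with a dict of annotations per file, which is the layout the ORAS CLI uses for its annotation file. Whether oras-py matches entries by the exact path string passed in files is an assumption to verify against your installed version. `build_annotations` is a hypothetical helper.

```python
import json

TITLE_KEY = "org.opencontainers.image.title"


def build_annotations(paths):
    """Map each file path to annotations carrying its full relative path as the title."""
    return {path: {TITLE_KEY: path} for path in paths}


annotations = build_annotations(["src/model.py", "lib/model.py"])
annotation_json = json.dumps(annotations, indent=2)
print(annotation_json)
```

Writing annotation_json to a file and passing that file's path as annotation_file on push is the workaround described above.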

Proposed fix

  • Preserve relative paths by default (CLI parity); respect annotation_file verbatim.
  • Optionally, add a preserve_paths=False escape hatch.
  • Add tests; warn/error on basename collisions without annotations.
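The proposal above could look roughly like the sketch below. This is not oras-py code: `layer_title` and `check_title_collisions` are hypothetical names, and the sketch assumes that preserve_paths=False means falling back to basename titles and that inputs are relative paths.

```python
import os


def layer_title(path, preserve_paths=True):
    """Return a layer title: the relative path by default, the basename if opted out."""
    if not preserve_paths:
        return os.path.basename(path)
    # Normalize and use forward slashes so titles are portable across platforms.
    return os.path.normpath(path).replace(os.sep, "/")


def check_title_collisions(paths, preserve_paths=True):
    """Return a title -> path mapping, raising ValueError if two paths share a title."""
    seen = {}
    for path in paths:
        title = layer_title(path, preserve_paths)
        if title in seen:
            raise ValueError(
                f"title collision: {seen[title]!r} and {path!r} both map to {title!r}"
            )
        seen[title] = path
    return seen


# With paths preserved, both files keep distinct titles.
titles = check_title_collisions(["src/model.py", "lib/model.py"])
print(sorted(titles))
```

Calling check_title_collisions with preserve_paths=False on the same two paths raises ValueError, which is the "error on basename collisions" behavior suggested in the last bullet. A warning instead of an exception would be an equally plausible design.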

Env

  • oras-py: 0.2.37
  • Python: 3.11.11
  • OS: macOS 15.6
  • Registry: registry:2

vsbuffalo, Aug 25 '25 16:08

If you'd like to contribute the fix, we'd love to have it!

vsoch, Aug 25 '25 17:08