
Error: Error: invalid OCI Image Index: failed to decode index file: invalid character 'm' after object key

Open ArghyaChakraborty opened this issue 2 months ago • 7 comments

What happened in your environment?

We use a global oras cache. During oras manifest fetch or oras pull operations, we sometimes get this error:

Error: Error: invalid OCI Image Index: failed to decode index file: invalid character 'm' after object key

We found that the index.json file in the oras cache is not valid JSON. It contains an entry like the following:

....,{"med{"mediaType":"application/vnd.oci.image.manifest.v1+json","digest":"sha256:8acfe","size":1973},......

This entry is what causes the error. We want to understand why the oras cache becomes corrupted like this and what we can do to avoid it.

What did you expect to happen?

We want to use a global oras cache and have every manifest fetch and pull operation succeed.

How can we reproduce it?

It is hard to reproduce. We use AWS Lambda functions to perform oras operations and multiple lambdas can run in parallel. Lambdas have an EFS mount and the oras cache is created in EFS.

What is the version of your ORAS CLI?

Version: 1.3.0-beta.4
Go version: go1.24.5
OS/Arch: linux/amd64
Git commit: 582ae373493985fb8cda60a4306b52d4e9f70b8e
Git tree state: clean

What is your OS environment?

AWS Lambda Runtime Node.js 22.x (Architecture: x86_64)

Are you willing to submit PRs to fix it?

  • [ ] Yes, I am willing to fix it.

ArghyaChakraborty avatar Oct 30 '25 20:10 ArghyaChakraborty

Sounds like a file lock issue that something like gofrs/flock would help with.

TerryHowe avatar Oct 30 '25 21:10 TerryHowe

Thanks for the comment @TerryHowe, but how do we incorporate gofrs/flock into the oras pull/push/manifest fetch workflow, especially when it comes to the oras cache?

ArghyaChakraborty avatar Oct 31 '25 19:10 ArghyaChakraborty
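
As a rough illustration of the locking idea raised above: gofrs/flock is a Go library, and nothing in this thread says that oras takes such a lock itself, so one option is to serialize the oras invocations from the calling script instead. The sketch below uses Python's standard fcntl.flock rather than gofrs/flock. The helper names cache_lock and run_locked and the lock file path are hypothetical, and whether advisory locks are honored between separate Lambda instances on an EFS mount is an assumption that would need to be verified.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def cache_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    # The lock file is separate from index.json; it only exists to be locked.
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until no other holder remains
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def run_locked(cmd, lock_path):
    """Run cmd (a list of arguments) while holding the lock; return its exit code."""
    with cache_lock(lock_path):
        return subprocess.run(cmd, check=False).returncode
```

Under these assumptions, a call such as run_locked(["oras", "pull", reference], "/mnt/efs/oras-cache.lock") (with a hypothetical path, and reference standing for the artifact being pulled) would make concurrent invocations wait for one another, at the cost of running them one at a time.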

I'm looking into this. @ArghyaChakraborty Is the EFS you mentioned in the issue AWS Elastic File System, the AWS file storage service?

wangxiaoxuan273 avatar Nov 04 '25 08:11 wangxiaoxuan273

EFS is a volume that can be shared across compute instances. It is shared file storage accessed over NFS.

I wouldn't think that would be a factor in the problem, although the problem is more likely to occur if you are running on multiple instances.

TerryHowe avatar Nov 04 '25 12:11 TerryHowe

We need more information to further troubleshoot. As a temporary workaround, you can remove the bad index.json file when this error occurs (for example, within the script that runs oras) and oras will create a new index.json. Please feel free to share any additional details if you have them. @ArghyaChakraborty

wangxiaoxuan273 avatar Nov 05 '25 06:11 wangxiaoxuan273
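
The temporary workaround described above (remove the bad index.json so that oras creates a new one) could be scripted roughly as follows. This is only a sketch: the function name remove_if_corrupt is hypothetical, the location of index.json inside the cache is left to the caller, and the check only tests whether the file parses as JSON, not whether it is a well-formed OCI Image Index.

```python
import json
import os


def remove_if_corrupt(index_path):
    """Delete index_path if it exists but is not valid JSON; return True if removed."""
    try:
        with open(index_path, "r", encoding="utf-8") as f:
            json.load(f)
    except FileNotFoundError:
        return False  # nothing to clean up
    except (json.JSONDecodeError, UnicodeDecodeError):
        try:
            os.remove(index_path)
        except FileNotFoundError:
            pass  # another process removed it first
        return True
    return False  # file parsed cleanly, leave it alone
```

A wrapper script might call this when an oras command fails with the "invalid OCI Image Index" error and then retry the command once. Note that this treats the symptom only and does not prevent parallel writers from corrupting the file again.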

May be related to #1870.

wangxiaoxuan273 avatar Nov 05 '25 07:11 wangxiaoxuan273

I don't know all the details about flock, but I think it is file based and would work on a volume mounted on multiple machines. I think there would be a pretty big performance hit from using flock, though.

In general, it seems like putting a cache on EFS would not be a good idea, since EFS would be slower than local storage.

TerryHowe avatar Nov 05 '25 11:11 TerryHowe