Optimize Startup Performance for Shared Files Across Images

Open • sevetseh28 opened this issue 11 months ago • 1 comment

Hi,

I’m trying to further improve container startup performance. Using Stargz Snapshotter with eStargz builds has already provided significant improvements, but I would like to optimize it further, considering that all my images share some common files.

Context:

  • Each image represents a different environment, but they all run the exact same HTTP server.
  • Before the container starts running, I check that the server is ready by curling it repeatedly until I receive a 200 OK response.
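For reference, the readiness check described above could be sketched roughly as follows. This is a minimal illustration, not the poster's actual script: the function names, the default timeout, and the polling interval are all assumptions. It returns the elapsed time, which also makes it usable for comparing startup latency between image variants.

```python
import time
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status code for a GET, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, reset, or timed out


def wait_until_ready(url, timeout=30.0, interval=0.2, fetch_status=http_status):
    """Poll `url` until it answers 200 OK.

    Returns the elapsed seconds on success, or None if `timeout` expires first.
    `fetch_status` is injectable so the loop can be exercised without a network.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        if fetch_status(url) == 200:
            return time.monotonic() - start
        time.sleep(interval)
    return None
```

A call such as `wait_until_ready("http://127.0.0.1:8080/")` (a hypothetical address) would block until the server answers or the timeout expires.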

What I’ve Tried:

  1. ctr-remote optimize with cat prefetching
     • I used ctr-remote optimize to cat the HTTP server’s files. Startup is faster, but still not as fast as expected.
     • I assume this is because the files are still being lazily loaded.
  2. Mounting the shared files from a Docker volume
     • I tried mounting the HTTP server’s files via a Docker volume to ensure they are readily available. However, it still seems like the files are being lazily loaded.

Question:

How can I further optimize startup performance, considering that all my images have some directories in common that I already know of?

  • Is there a way to explicitly specify certain files to be fetched and kept ready on disk before the container starts?
  • Would using prefetch directives or other techniques help in ensuring these files are fully loaded before execution?

Any insights or best practices would be greatly appreciated!

Thanks!

sevetseh28, Mar 05 '25 11:03

Sorry for the slow reply.

> Is there a way to explicitly specify certain files to be fetched and kept ready on disk before the container starts?

You can dump the list of files accessed during optimization (the candidates for prioritization) with --record-out, then edit that list if needed: https://github.com/containerd/stargz-snapshotter/blob/1c4bf9447105daf8bf5f6624e4c67d1a046b2d87/docs/ctr-remote.md#dump-log-of-accessed-files-during-optimization---record-out
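As a rough aid for reviewing such a dump, the sketch below reports which of the known shared files did not show up in the recorded list. It assumes the record is newline-delimited JSON with one object per accessed file and a `path` field; that layout is an assumption here, so check the linked documentation for the actual format. The function name and the example paths are hypothetical.

```python
import json


def missing_from_record(record_lines, expected_paths):
    """Return the entries of `expected_paths` that never appear in the record.

    Assumes each non-empty line is a JSON object with a "path" key
    (an assumed format, not confirmed by the thread).
    """
    recorded = set()
    for line in record_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        if "path" in entry:
            recorded.add(entry["path"])
    # Preserve the caller's ordering so the report is easy to read.
    return [p for p in expected_paths if p not in recorded]
```

For example, passing the lines of the dumped file together with a list such as `["srv/app/server.py", "srv/app/config.yaml"]` would return whichever of those paths the optimization run never touched.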

> Would using prefetch directives or other techniques help in ensuring these files are fully loaded before execution?

If you can modify the image, you can just store these common files in the same base layer so that the snapshotter can share that layer on the node. If not, maybe we need to modify our caching feature to allow sharing file data among layers.
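One way to confirm that two images really do share a base layer is to compare the layer digests in their manifests, since an OCI image manifest lists its layers by content digest. The sketch below takes two manifests already parsed into dictionaries; the function name is hypothetical, and fetching the manifests is left out.

```python
def shared_layer_digests(manifest_a, manifest_b):
    """Return the layer digests of `manifest_a` that also appear in `manifest_b`.

    Both arguments are OCI image manifests parsed from JSON, each with a
    "layers" list whose items carry a "digest" field.
    """
    digests_b = {layer["digest"] for layer in manifest_b.get("layers", [])}
    return [
        layer["digest"]
        for layer in manifest_a.get("layers", [])
        if layer["digest"] in digests_b
    ]
```

An empty result means the images have no layer in common, even if they contain identical files, so there is no layer for the snapshotter to share between them.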

ktock, Apr 08 '25 13:04