Introduce a general caching abstraction to wasix
Motivation
There are many places across the Wasmer CLI and WASIX where we want to use caching to avoid unnecessary work. At the moment, each of these places is hand-rolling its own caching solution.
Off the top of my head, I can think of the following:
- wasmer_wasix::runtime::resolver::WapmSource - caches the response of a GetPackage() query against the registry
- wasmer_wasix::runtime::module_cache - contains various implementations of caches for WebAssembly modules (in-memory, thread-local, filesystem, etc.)
- wasmer_wasix::runtime::package_loader::BuiltinPackageLoader - caches *.webc files downloaded from the registry
- wasmer_wasix::runtime::resolver::WebSource - caches *.webc files downloaded from the internet via a bare URL
These caching solutions all tend to have the same properties or requirements:
- It's a key-value store where the keys are arbitrary strings
- We want the files to be cached on disk so they can be picked up from subsequent wasmer runs
- The values are often "big" and we want to mmap them (via shared_buffer::OwnedBuffer) rather than reading them into memory
- On-disk caches need to be first saved to a temporary file and moved into place to avoid seeing results before they've finished being written
- Cached values should be kept in memory once loaded from the filesystem
- We often want a way to invalidate a cache key, whether that is via a timeout, by checking if an ETag header or hash has changed, or whatever
- We want the option to use stale values if the "main" method of fetching the data fails (e.g. because of a network error)
- We need to work in both sync and async contexts
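To make the "temporary file, then move into place" requirement concrete, here is a minimal sketch using only the standard library. The helper name save_atomically and the temporary-file naming scheme are hypothetical and not taken from the existing code.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: write `bytes` to a temporary file next to `dest`,
/// then rename it into place so readers never observe a partial file.
pub fn save_atomically(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    let dir = dest.parent().unwrap_or_else(|| Path::new("."));
    fs::create_dir_all(dir)?;

    let file_name = dest.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "destination has no file name")
    })?;

    // Keep the temporary file in the same directory so the rename stays on
    // one filesystem. The process ID suffix is an assumption to reduce
    // collisions between concurrent wasmer processes.
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(format!(".{}.tmp", std::process::id()));
    let tmp = dir.join(tmp_name);

    let result = File::create(&tmp)
        .and_then(|mut file| {
            file.write_all(bytes)?;
            // Flush to disk before the file becomes visible under `dest`.
            file.sync_all()
        })
        .and_then(|()| fs::rename(&tmp, dest));

    if result.is_err() {
        // Best-effort cleanup; the original error is what gets reported.
        let _ = fs::remove_file(&tmp);
    }
    result
}
```

Two threads in the same process would share the temporary name in this sketch, so a real implementation would probably want a random or counter-based suffix instead.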
Proposed solution
I was thinking of creating a concrete type with an API similar to a HashMap<CollectionName, HashMap<Key, Value>>, except it'll pass out instances of shared_buffer::OwnedBuffer and automatically manage the synchronisation of on-disk and in-memory caches.
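As a rough illustration of the shape this could take, here is a std-only sketch. Everything in it is an assumption: the Cache type, its method names, and the on-disk layout are hypothetical, and Arc<[u8]> stands in for shared_buffer::OwnedBuffer so the example has no dependencies. It reads whole files rather than mmapping them and leaves out invalidation, stale fallbacks and async.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::{Arc, Mutex};

/// Hypothetical two-level cache: an in-memory map in front of files on disk.
/// `Arc<[u8]>` is a stand-in for `shared_buffer::OwnedBuffer`.
pub struct Cache {
    root: PathBuf,
    memory: Mutex<HashMap<(String, String), Arc<[u8]>>>,
}

impl Cache {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Cache { root: root.into(), memory: Mutex::new(HashMap::new()) }
    }

    /// Assumed layout: <root>/<collection>/key-<hex of key>. Hex-encoding
    /// lets arbitrary strings (URLs, package names) act as file names.
    /// Collection names are assumed to be plain identifiers.
    fn path_for(&self, collection: &str, key: &str) -> PathBuf {
        let hex: String = key.bytes().map(|b| format!("{b:02x}")).collect();
        self.root.join(collection).join(format!("key-{hex}"))
    }

    /// Look in memory first, then on disk; a disk hit is kept in memory.
    pub fn get(&self, collection: &str, key: &str) -> io::Result<Option<Arc<[u8]>>> {
        let id = (collection.to_string(), key.to_string());
        let cached = self.memory.lock().unwrap().get(&id).cloned();
        if let Some(hit) = cached {
            return Ok(Some(hit));
        }
        match fs::read(self.path_for(collection, key)) {
            Ok(bytes) => {
                let value: Arc<[u8]> = bytes.into();
                self.memory.lock().unwrap().insert(id, Arc::clone(&value));
                Ok(Some(value))
            }
            Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
            Err(e) => Err(e),
        }
    }

    /// Write to a temporary file, rename it into place, then update memory.
    pub fn insert(&self, collection: &str, key: &str, bytes: &[u8]) -> io::Result<()> {
        let path = self.path_for(collection, key);
        let dir = path.parent().expect("cache paths always have a parent");
        fs::create_dir_all(dir)?;
        let tmp = path.with_extension("tmp");
        fs::write(&tmp, bytes)?;
        fs::rename(&tmp, &path)?;
        let id = (collection.to_string(), key.to_string());
        self.memory.lock().unwrap().insert(id, Arc::from(bytes));
        Ok(())
    }
}
```

Creating a second Cache over the same root directory simulates a subsequent wasmer run: values inserted by the first instance are picked up from disk.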
This would probably hook into Wasmer Edge's caching facilities, too.
Additional context
This originally came up when I was working on #3983. Having one or two places where we do caching is fine, but I noticed I was doing the same in-memory/on-disk dance in several places and all of it is essentially untested.