workerd
Python snapshots: Only load dynamic libraries that are needed
Rather than actually preloading all libraries, we now just preallocate space for them. A function called getMemory determines where each dynamic library is placed in memory and does nothing else, so we can patch it to ensure that each library, if it is loaded at all, is allocated at its dedicated location. This guarantees that its metadata always lands in the right spot. We also make sure to load the libraries in the correct order so that they end up in the correct place in the function pointer table.
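As a rough illustration of the idea, here is a minimal Python model of fixed per-library slots. It is a sketch, not the actual patch: the library names, sizes, `REGION_BASE`, and the helper `get_memory_for` are all hypothetical, and the real getMemory may well have a different signature.

```python
# Hypothetical library names and reserved sizes; the real list and sizes
# are not given in this description.
RESERVED = [
    ("libfoo.so", 0x4000),
    ("libbar.so", 0x1000),
    ("libbaz.so", 0x2000),
]
REGION_BASE = 0x100000  # assumed start of the reserved region


def plan_slots(reserved, base):
    """Assign every known library a fixed address, whether or not it loads."""
    slots = {}
    addr = base
    for name, size in reserved:
        slots[name] = addr
        addr += size
    return slots


SLOTS = plan_slots(RESERVED, REGION_BASE)
SIZES = dict(RESERVED)
ORDER = {name: i for i, (name, _) in enumerate(RESERVED)}


def get_memory_for(name, size):
    """Stand-in for a patched getMemory: return the library's dedicated slot."""
    if size > SIZES[name]:
        raise MemoryError(f"{name} needs {size:#x} bytes, reserved {SIZES[name]:#x}")
    return SLOTS[name]


def load_order(needed):
    """Return only the needed libraries, always in the canonical order."""
    return sorted(set(needed), key=ORDER.__getitem__)
```

In this model a library's address depends only on the fixed plan, never on which other libraries happened to be loaded, so a snapshot taken with one subset of libraries still agrees with a later run about where each library lives.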
There is also the possibility that someone uses ctypes and messes everything up. ctypes did not work with our snapshots before this PR either. It could be fixed by patching libffi to record each trampoline address and function table slot in the DSO_METADATA and then recreating them the same way when restoring the snapshot. We'd also need to record the function table base for each loaded library. Once we do all this, we should be safe again...
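To make the record-and-replay idea concrete, here is a small Python model. Everything in it is hypothetical: the `DsoMetadata` fields, the helper names, and the `make_trampoline` callback are illustrative only, and nothing here reflects an existing libffi hook.

```python
from dataclasses import dataclass, field


@dataclass
class DsoMetadata:
    """Hypothetical per-library record; field names are illustrative."""
    table_base: int
    # (trampoline_address, function_table_slot) pairs, in creation order
    trampolines: list = field(default_factory=list)


def record_trampoline(meta, address, slot):
    """Called when a trampoline is created, before the snapshot is taken."""
    meta.trampolines.append((address, slot))


def restore_trampolines(meta, table, make_trampoline):
    """Recreate each trampoline at its recorded address and table slot."""
    for address, slot in meta.trampolines:
        table[slot] = make_trampoline(address)
```

The point of the sketch is only that restoring replays the recorded pairs verbatim, so any function pointer stored in the snapshotted memory still refers to the same table slot afterwards.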
Since this breaks the memory snapshot format, it's a good time to add a version header to the memory snapshots so we can more easily support multiple versions of them.
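One possible shape for such a header, sketched in Python for brevity: the magic number, the version value, and the two-field layout below are assumptions made for illustration, not the format this PR actually uses.

```python
import struct

MAGIC = 0x50414E53  # hypothetical magic number, chosen for illustration
CURRENT_VERSION = 1  # hypothetical version value
HEADER = struct.Struct("<II")  # little-endian: magic, version


def write_snapshot(payload: bytes, version: int = CURRENT_VERSION) -> bytes:
    """Prefix the raw snapshot bytes with a fixed-size header."""
    return HEADER.pack(MAGIC, version) + payload


def read_snapshot(blob: bytes):
    """Return (version, payload); raise on a bad header or unknown version."""
    if len(blob) < HEADER.size:
        raise ValueError("snapshot too short for header")
    magic, version = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("missing or unrecognized snapshot header")
    if version > CURRENT_VERSION:
        raise ValueError(f"unsupported snapshot version {version}")
    return version, blob[HEADER.size:]
```

A reader built this way can dispatch on the returned version and keep decoders for older layouts around, which is what makes future format changes cheaper.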
I also wonder if we could use capnp for the memory snapshot in JS.