
Feature request: memmap support when loading models

Open ogrisel opened this issue 2 years ago • 6 comments

Make it possible to unzip a .skops file and memory-map the numpy arrays it contains, so that a model's large parameter arrays can be loaded into memory shared by several Python processes running concurrently on a given host.

This would make it possible to:

  • reduce memory usage when handling concurrent requests in a server API running on several Python workers, for instance with gunicorn or uwsgi;
  • reduce memory usage when batch-processing a large number of prediction tasks on a cluster with several Python workers per host, for instance with dask, ray, pyspark and co.;
  • make it faster to load a model by benefiting from the warm-start effect of a previously memmapped model, for instance in a serverless environment, assuming several concurrent "function calls" can share a common working directory on a given host.

ogrisel avatar Nov 29 '22 18:11 ogrisel

Yep, that sounds reasonable. I haven't done this though. Could you please give me a pointer? Maybe the place where you do this in joblib?

adrinjalali avatar Nov 30 '22 08:11 adrinjalali

In joblib this is too complex because we have all the numpy buffers in the main pickle stream. In skops this is simpler; it would be a matter of:

  • unzipping the ".skops" file into a folder on the local filesystem;
  • calling numpy.load with mmap_mode="r" or mmap_mode="c" instead of the regular numpy.load used in https://github.com/skops-dev/skops/blob/main/skops/io/_numpy.py#LL74C23-L74C30 (see the sketch after this list).
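
To make that concrete, here is a minimal sketch of the idea, assuming the arrays are stored as .npy entries inside the archive; extract_skops and load_memmapped_arrays are hypothetical names, not the actual skops loading code (which walks a tree of persisted nodes):

```python
# Minimal sketch only: unzip the .skops archive once, then memory-map
# the .npy buffers it contains instead of reading them into RAM.
import zipfile
from pathlib import Path

import numpy as np


def extract_skops(skops_path, target_dir):
    """Unzip a .skops file into target_dir (hypothetical helper)."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(skops_path) as zf:
        zf.extractall(target_dir)
    return target_dir


def load_memmapped_arrays(target_dir, mmap_mode="r"):
    """Memory-map every extracted .npy file; mmap_mode="c" gives copy-on-write."""
    return {
        npy_path.name: np.load(npy_path, mmap_mode=mmap_mode)
        for npy_path in Path(target_dir).glob("*.npy")
    }
```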

ogrisel avatar Nov 30 '22 09:11 ogrisel

Very good suggestion, thanks.

I have no experience with memmapping, but if someone else takes a stab at this feature, feel free to ignore my questions :)

  1. How would we test that it actually works? I imagine that's not trivial (one possible check is sketched after this list).
  2. AFAICT, the option doesn't exist for scipy sparse matrices, or is there something I'm missing?
  3. The link points to something unrelated (not a perma link), I guess it was meant to highlight: https://github.com/skops-dev/skops/blob/7242265eb7744b8895f2f0c7cb62f59d0a656eb6/skops/io/_numpy.py#LL79C23-L79C30
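
On question 1, one simple check might already go a long way (a sketch, assuming some hypothetical way to ask the loader for memmapped arrays): np.load with mmap_mode returns np.memmap instances, so a type check distinguishes them from regular in-memory arrays.

```python
# Sketch of a possible test helper; how the memmapped arrays are
# obtained from the loader is left open here.
import numpy as np


def check_arrays_are_memmapped(arrays):
    for name, arr in arrays.items():
        # np.load(..., mmap_mode=...) returns np.memmap subclasses of
        # ndarray, so isinstance is enough to tell the two cases apart.
        assert isinstance(arr, np.memmap), f"{name} was loaded fully into memory"
```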

BenjaminBossan avatar Dec 01 '22 11:12 BenjaminBossan

@ogrisel I have been thinking about this, and there's an issue because of which I've been putting this aside.

When it comes to loading from multiple processes, each process would try to unzip the file. We'd need to check whether the extraction has already been done and reuse it; we can probably figure out a way to do that, but it's not straightforward I think.

When we unzip the files onto disk, they're gonna stay there and we'd never clean them up. In the long run this is kinda bad on non-Linux machines (at least on Windows I don't think temp files are ever cleaned up). This is the main issue I'm not sure how to solve.

adrinjalali avatar Jan 24 '23 14:01 adrinjalali

It would be possible to register those folders for automated garbage collection with the loky resource tracker.

The standard library also has a resource tracker (in the multiprocessing module), but if I remember correctly it does not come with a clean-up function for folders.

ogrisel avatar Jan 24 '23 20:01 ogrisel

The class I had in mind is https://github.com/joblib/loky/blob/047d80623b7cf2d43500ed56ee0320b3e41c3f81/loky/backend/resource_tracker.py. However, this is not considered public API, so it might be a bad idea for skops to rely on it.
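
For illustration, using it could look roughly like the sketch below; the register() call with a "folder" resource type is assumed from how joblib registers its temporary memmap folders, and again this relies on non-public API.

```python
# Sketch only: loky's resource tracker is not public API, and the
# "folder" resource type used here is assumed from joblib's usage.
from loky.backend import resource_tracker


def register_unzipped_folder(folder_path):
    # Once registered, the tracker process can delete the folder after
    # the Python processes that used it have exited, even on a crash.
    resource_tracker.register(str(folder_path), "folder")
```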

It also exists as the multiprocessing.resource_tracker module in the standard library, but we would need to register new cleanup functions for file and folder resource types, which are not part of the default _CLEANUP_FUNCS dict in multiprocessing. Unfortunately, that dict is private and, as far as I know, there is no public API to add new functions to it. So again, it might be risky to rely on this from a maintenance point of view.

Maybe the safest solution is not to try to be too clever and to let users handle the clean-up explicitly themselves, depending on their application lifecycle.
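
In that case the explicit option could be as small as a context manager around a temporary folder (hypothetical helper, not a skops API):

```python
# Sketch of leaving the clean-up to the caller: the unzipped folder
# lives exactly as long as the with-block.
import shutil
import tempfile
import zipfile
from contextlib import contextmanager


@contextmanager
def unzipped_model_folder(skops_path):
    target_dir = tempfile.mkdtemp(prefix="skops_memmap_")
    try:
        with zipfile.ZipFile(skops_path) as zf:
            zf.extractall(target_dir)
        yield target_dir
    finally:
        shutil.rmtree(target_dir, ignore_errors=True)
```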

For the original unzipping, maybe skops could write a special empty sentinel file named ready_to_load once the unzipping has completed. If a concurrent process is told to unzip and load a file into a folder that already exists but does not yet have the ready sentinel, then skops could either (see the sketch after this list):

  • wait and retry later, until the ready_to_load sentinel file appears,
  • or load the .skops file in memory without attempting to memmap, and issue a warning telling the user that memmapping failed because the target folder wasn't ready.
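
A rough sketch of that protocol (file name, timeout and polling interval are illustrative only, and the question of which process gets to do the unzipping is left out):

```python
# Sketch of the sentinel protocol: the process that unzips touches an
# empty ready_to_load file last; other processes wait for it and warn
# before falling back to a regular in-memory load.
import time
import warnings
import zipfile
from pathlib import Path

SENTINEL = "ready_to_load"


def unzip_and_mark_ready(skops_path, target_dir):
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(skops_path) as zf:
        zf.extractall(target_dir)
    # Written last, so its presence means the extraction is complete.
    (target_dir / SENTINEL).touch()


def wait_until_ready(target_dir, timeout=30.0, poll=0.1):
    """Return True once another process has finished unzipping, else False."""
    sentinel = Path(target_dir) / SENTINEL
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if sentinel.exists():
            return True
        time.sleep(poll)
    warnings.warn(
        "memmapping skipped: the target folder was not ready in time, "
        "falling back to loading the .skops file in memory"
    )
    return False
```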

ogrisel avatar Mar 14 '23 16:03 ogrisel