
Use mmap to share models across processes and speed up loading

Open alexgarel opened this issue 3 years ago • 2 comments

Wouldn't using mmap speed up data loading and reduce memory usage in a multiprocess environment?

  1. Our web server runs several processes. Each process loads three language models, which takes a good chunk of memory!

  2. While developing with Django, the server constantly restarts, so our models are reloaded each time we need them (and we need them a lot for some functionality). This takes a while. (With mmap, the main process could keep the file mapped in memory.)

It seems to me that:

  1. the model does not change in a typical environment, so read-only mmap access is OK
  2. mmap would speed up model loading in new processes (the data is already in memory)
  3. mmap would keep only one copy of the model in memory (shared between processes)

Maybe there are some technical difficulties (I don't know about the low-level representation of models in spaCy), but it seems worth it if it's feasible. (If it needs a specific uncompressed on-disk format to be able to mmap, that may be OK.)
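
To make the idea concrete, here is a minimal sketch using only the Python standard library. It is not spaCy code: the raw float32 file layout, the file name `weights.bin`, and the helpers `write_raw_floats` and `map_readonly` are all assumptions made up for illustration. It only shows that an uncompressed file can be mapped read-only and viewed as numbers without first copying it into process memory.

```python
import mmap
import os
import tempfile
from array import array


def write_raw_floats(path, values):
    """Write float32 values as raw, uncompressed bytes (hypothetical format)."""
    with open(path, "wb") as f:
        array("f", values).tofile(f)


def map_readonly(path):
    """Map the file read-only and view it as float32 without copying it."""
    with open(path, "rb") as f:
        # The mapping keeps its own handle, so the file object can be closed.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return mm, memoryview(mm).cast("f")


# Stand-in for model data; a real model would be far larger.
weights_path = os.path.join(tempfile.mkdtemp(), "weights.bin")
write_raw_floats(weights_path, [0.5, 1.5, 2.5, 3.5])

mapping, weights = map_readonly(weights_path)
```

Each process that calls `map_readonly` on the same file would get a view backed by the operating system's page cache, which is where the hoped-for sharing and fast start-up would come from. Whether spaCy's own data structures could be laid out this way is exactly the open question.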

This feature request was already submitted in https://github.com/explosion/spaCy/issues/100, but that was a long time ago; I imagine it's worth thinking about again?

Your Environment

  • Operating System: Debian 10
  • Python Version Used: Python 3.7
  • spaCy Version Used: 2.1.3
  • Environment Information:

alexgarel avatar Jan 21 '21 16:01 alexgarel

Here is a related discussion: https://github.com/explosion/spaCy/discussions/5051

adrianeboyd avatar Jan 21 '21 19:01 adrianeboyd

In principle I'm in favour of this, and we've looked into it at various points. However, in practice it's a relatively difficult optimisation to carry and maintain.

The general problem is that mmap is a low-level optimization that needs to happen when the data is loaded, so the change touches lots of different bits of code. There's only a benefit if we don't operate on the array (which requires reading it into memory), and it can complicate the CPU/GPU code too. We also have to watch out for different behaviours across platforms.
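
The point about only benefiting when the array is not operated on can be illustrated with a small standard-library sketch. This is an illustration under assumptions, not spaCy's loading code: the file `vectors.bin` and its raw float32 layout are hypothetical. Reading or slicing the mapped view stays zero-copy, but any computed result has to live in ordinary per-process memory.

```python
import mmap
import os
import tempfile
from array import array

# Hypothetical raw float32 file standing in for a table of vectors.
path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
with open(path, "wb") as f:
    array("f", [1.0, 2.0, 3.0, 4.0]).tofile(f)

with open(path, "rb") as f:
    mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
view = memoryview(mapping).cast("f")

# Slicing is still just a window onto the mapped, shareable pages.
row = view[1:3]

# Arithmetic needs somewhere to put its result, so it allocates a new
# private buffer in this process; that part is no longer shared.
scaled = array("f", (2.0 * x for x in view))
```

Here `row` is still backed by the mapping, while `scaled` is a fresh copy owned by the current process. If most of the loaded data ends up going through a step like the last one, the memory saving from mapping the file largely disappears.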

honnibal avatar Jan 21 '21 23:01 honnibal