Package the NER models

Open pafonta opened this issue 4 years ago • 3 comments

🚀 Feature

Package the NER models we trained.

Motivation

Make the NER models pip installable and easily distributable.

Pitch

As we track the models with DVC, we could retrieve them if needed.

However, we might want or need to distribute our models in a packaged form.

Besides, packaging a model would let us distribute registered functions and custom components (EntityRuler?) along with it.

This issue is a reminder to have this discussion.

Additional context

Reference: https://spacy.io/api/cli#package.

pafonta avatar Mar 23 '21 10:03 pafonta

However, we might want or need to distribute our models in a packaged form.

Currently a spaCy pipeline is loaded with a simple spacy.load() call, and this also includes the EntityRuler component.

Unless at some point we should have registered functions, is there really a strong benefit from having a model that is pip installable?

FrancescoCasalegno avatar Apr 07 '21 09:04 FrancescoCasalegno

this also includes the EntityRuler component

There are 2 pipelines for each modelX: one in data_and_models/models/ner/ and one in data_and_models/models/ner_er/. So the EntityRuler is loaded only if one passes the 2nd directory to spacy.load(). Just to clarify: whether the EntityRuler is loaded is a separate discussion from whether to package the model. Or did you have something else in mind?
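
For illustration, choosing between the two variants could look roughly like the sketch below. Only the two directory names come from this thread; the helper names, the model name `model1`, and the assumption that each model sits in a subdirectory named after it are hypothetical. `spacy` is imported lazily so the path logic can be read without it installed.

```python
from pathlib import Path

# Root directory taken from the paths mentioned in this thread.
MODELS_ROOT = Path("data_and_models/models")


def pipeline_dir(model_name: str, with_entity_ruler: bool = False) -> Path:
    """Return the directory of one pipeline variant.

    Assumption: each model lives in a subdirectory named after the model.
    """
    variant = "ner_er" if with_entity_ruler else "ner"
    return MODELS_ROOT / variant / model_name


def load_pipeline(model_name: str, with_entity_ruler: bool = False):
    """Load the chosen variant with spacy.load() (requires spaCy)."""
    import spacy  # imported here so the helpers above work without spaCy

    return spacy.load(pipeline_dir(model_name, with_entity_ruler))
```

With this layout, `load_pipeline("model1", with_entity_ruler=True)` would be the only call that brings in the EntityRuler.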

Unless at some point we should have registered functions

That's indeed a case where packaging models would be handy.

is there really a strong benefit from having a model that is pip installable?

I can think of 4 benefits:

  1. distribute custom architectures,
  2. distribute custom functions,
  3. distribute custom components,
  4. have to handle only 1 file (i.e. the packaged model) instead of all the directories and files for a model.
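
As a rough sketch of how points 2 to 4 might look in practice, the helper below assembles a `spacy package` command line that bundles extra Python files via `--code`. The helper name, the file `custom_functions.py`, the package name, and the paths in the usage note are hypothetical; the command is only built here, not executed.

```python
import sys
from typing import List, Optional, Sequence


def build_package_command(
    pipeline_dir: str,
    output_dir: str,
    code_files: Sequence[str] = (),
    name: Optional[str] = None,
    version: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `python -m spacy package`."""
    cmd = [sys.executable, "-m", "spacy", "package", pipeline_dir, output_dir]
    if code_files:
        # --code takes comma-separated paths to Python files that are
        # shipped with the package and imported in its __init__.py.
        cmd += ["--code", ",".join(code_files)]
    if name:
        cmd += ["--name", name]
    if version:
        cmd += ["--version", version]
    return cmd


# Hypothetical usage; run it with subprocess.run(cmd, check=True).
cmd = build_package_command(
    "data_and_models/models/ner_er/model1",
    "packages",
    code_files=["custom_functions.py"],
    name="model1_ner_er",
    version="0.1.0",
)
```

The resulting archive could then be installed with pip and loaded by its package name instead of a directory path.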

pafonta avatar Apr 07 '21 14:04 pafonta

Just a note: custom architectures can be distributed as Python packages that plug into spaCy via entry points.

Documentation: https://spacy.io/usage/saving-loading#entry-points-components

Example: spacy-transformers, see these lines
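
To check what installed packages expose this way, something like the following sketch could be used. It relies only on the standard library; the helper name is hypothetical, and the group names `spacy_architectures` and `spacy_factories` are the ones I believe spaCy reads, so treat them as an assumption to verify against the documentation above.

```python
from importlib.metadata import entry_points
from typing import List


def list_entry_points(group: str) -> List[str]:
    """Return the sorted names of installed entry points in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points object.
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: plain dict mapping group name to entry points.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


# Assumed group names; both lists are empty if nothing registers them.
architectures = list_entry_points("spacy_architectures")
factories = list_entry_points("spacy_factories")
```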

Stannislav avatar Jun 07 '21 08:06 Stannislav