Add commands for automatically modifying configs

Open polm opened this issue 2 years ago • 3 comments

Description

This continues work started in https://github.com/explosion/projects/pull/147, which provides features for automatically manipulating pipelines and configs. The functions included are:

  • merge: combine components from two pipelines and handle listeners
  • use_transformer: use transformer as feature source
  • use_tok2vec: use CNN tok2vec as feature source
  • resume: make a version of a config for resuming training
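As a rough illustration of what a resume-style transformation could look like, here is a minimal sketch on a plain dictionary. It is not the code from this PR: the helper name `make_resume_config` and the path `training/model-last` are hypothetical, and it assumes only the documented spaCy config convention that a component block can point at a trained pipeline through a `source` key.

```python
from copy import deepcopy
from typing import Any, Dict


def make_resume_config(config: Dict[str, Any], trained_path: str) -> Dict[str, Any]:
    """Return a copy of `config` whose components are sourced from a trained pipeline.

    Hypothetical helper for illustration only; the PR's actual `resume`
    command may work differently.
    """
    resumed = deepcopy(config)  # leave the caller's config untouched
    for name in resumed["nlp"]["pipeline"]:
        # Swap the factory definition for a reference to the trained component.
        resumed["components"][name] = {"source": trained_path}
    return resumed


# A tiny stand-in for a parsed training config.
config = {
    "nlp": {"lang": "en", "pipeline": ["tok2vec", "ner"]},
    "components": {
        "tok2vec": {"factory": "tok2vec"},
        "ner": {"factory": "ner"},
    },
}

resumed = make_resume_config(config, "training/model-last")
print(resumed["components"]["ner"])
```

The sketch copies the config instead of editing it in place, so the original stays available for comparison. A real implementation would also need to deal with settings this example ignores, such as listeners and initialization.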

Currently these are all grouped under a new spacy configure command. That may not be the best place for them; in particular, merge may belong elsewhere, since it outputs a pipeline rather than a config.

The current state of the PR is that the commands run, but there's only one small test, and docs haven't been written yet. Docs can be started but will depend somewhat on how the naming issues work out.

Types of change

enhancement

Checklist

  • [x] I confirm that I have the right to submit this contribution under the project's MIT license.
  • [x] I ran the tests, and all new and existing tests passed.
  • [x] My changes don't require a change to the documentation, or if they do, I've added all required information.

polm, Dec 23 '22 10:12

This might require more adjustment - for example, maybe merge should be split out into a separate command - but it is substantially complete and ready for review. Any feedback on how to make the design clearer would be welcome.

polm, Jan 13 '23 05:01

> Currently these are all grouped under a new spacy configure command. That may not be the best place for them; in particular, merge may belong elsewhere, since it outputs a pipeline rather than a config.

I agree - having merge under spacy configure would be confusing. I'd prefer it as a separate top-level command.

rmitsch, Feb 08 '23 15:02

Thanks for the feedback. I moved merge to a separate top-level command.

polm, Feb 09 '23 06:02