spaCy
Add commands for automatically modifying configs
Description
This continues work started in https://github.com/explosion/projects/pull/147, which provides features for automatically manipulating pipelines and configs. The functions included are:
- merge: combine components from two pipelines and handle listeners
- use_transformer: use transformer as feature source
- use_tok2vec: use CNN tok2vec as feature source
- resume: make a version of a config for resuming training
Currently these are all grouped under a new spacy configure command. That may not be the best place for them; in particular, merge may belong elsewhere, since it outputs a pipeline rather than a config.
The current state of the PR is that the commands run, but there's only one small test, and docs haven't been written yet. Docs can be started but will depend somewhat on how the naming issues work out.
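To make the intent of resume more concrete, here is a minimal sketch that works on a plain dictionary rather than a real spaCy Config object. It assumes that "resuming" means replacing each component block with a source reference to the already-trained pipeline, which is the mechanism spaCy configs use to reuse trained components. The helper name make_resume_config and the path training/model-best are hypothetical and are not taken from this PR.

```python
from copy import deepcopy
from typing import Any, Dict


def make_resume_config(config: Dict[str, Any], pipeline_path: str) -> Dict[str, Any]:
    """Return a copy of `config` whose components are all sourced from
    the trained pipeline at `pipeline_path`.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    resumed = deepcopy(config)
    components = resumed.setdefault("components", {})
    for name in resumed.get("nlp", {}).get("pipeline", []):
        # Swap the factory definition for a `source` reference so training
        # continues from the trained component instead of a fresh one.
        components[name] = {"source": pipeline_path}
    return resumed


# A toy config shaped like the [nlp] and [components] sections (assumed layout).
config = {
    "nlp": {"lang": "en", "pipeline": ["tok2vec", "ner"]},
    "components": {
        "tok2vec": {"factory": "tok2vec"},
        "ner": {"factory": "ner"},
    },
}

resumed = make_resume_config(config, "training/model-best")
print(resumed["components"])
```

The sketch copies the input instead of mutating it, so the original config stays available for comparison. The actual command may handle additional settings that this example ignores.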
Types of change
enhancement
Checklist
- [x] I confirm that I have the right to submit this contribution under the project's MIT license.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
This might require more adjustment - for example, maybe merge should be split out into a separate command - but it is substantially complete and ready for review. Any feedback on how to make the design clearer would be welcome.
> Currently these are all grouped under a new spacy configure command. That may not be the best place for them; in particular, merge may belong elsewhere, since it outputs a pipeline rather than a config.
I agree - having merge under spacy configure would be confusing. I'd prefer it as a separate top-level command.
Thanks for the feedback. I've moved merge to a separate top-level command.