annotate cli first try

Open kadarakos opened this issue 1 year ago • 3 comments

New spacy cli command that makes it convenient to annotate new documents with a trained pipeline by running a single command.

Description

Adds a new spacy cli command called apply that takes a model and one of:

  1. A single plain text file with one document per line.
  2. A .spacy file.
  3. A directory, in which case a file suffix must also be provided.

to produce a .spacy file containing the Doc objects produced by the model.
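
As a rough illustration of the flow described above (not the code in this PR), the sketch below resolves the input to a list of files, runs a trained pipeline over them, and collects the resulting Doc objects in a single DocBin. The names `find_input_files` and `annotate` are hypothetical, and telling the inputs apart by file ending is an assumption of the sketch. Passing existing Doc objects to `nlp.pipe` assumes spaCy v3.2 or later.

```python
from pathlib import Path
from typing import List, Optional


def find_input_files(data_path: Path, suffix: Optional[str] = None) -> List[Path]:
    """Hypothetical helper: resolve the input argument to a sorted list of files."""
    if data_path.is_dir():
        if suffix is None:
            raise ValueError("A suffix is required when the input is a directory.")
        return sorted(
            p for p in data_path.iterdir() if p.is_file() and p.name.endswith(suffix)
        )
    return [data_path]


def annotate(
    model: str, data_path: Path, output_path: Path, suffix: Optional[str] = None
) -> None:
    """Hypothetical driver: run a trained pipeline over the inputs, save a DocBin."""
    # Imported here so find_input_files works without spaCy installed.
    import spacy
    from spacy.tokens import DocBin

    nlp = spacy.load(model)
    out = DocBin(store_user_data=True)
    for file_path in find_input_files(data_path, suffix):
        if file_path.suffix == ".spacy":
            # Existing Doc objects; passing Docs to nlp.pipe assumes spaCy v3.2+.
            inputs = DocBin().from_disk(file_path).get_docs(nlp.vocab)
        else:
            # Plain text with one document per line; blank lines are skipped.
            with file_path.open(encoding="utf-8") as f:
                inputs = [line.strip() for line in f if line.strip()]
        for doc in nlp.pipe(inputs):
            out.add(doc)
    out.to_disk(output_path)
```

With an installed pipeline package such as `en_core_web_sm`, a call could look like `annotate("en_core_web_sm", Path("texts"), Path("out.spacy"), suffix=".txt")`.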

Types of change

New feature

Checklist

  • [x] I confirm that I have the right to submit this contribution under the project's MIT license.
  • [ ] I ran the tests, and all new and existing tests passed.
  • [ ] My changes don't require a change to the documentation, or if they do, I've added all required information.

kadarakos avatar Aug 24 '22 15:08 kadarakos

The data loading is done with a bunch of nested try/except blocks, which is something we would generally like to avoid, but it felt valid here.

Also, this cli command is really short and simple compared to the rest, so I feel like I must have missed a lot of things.

kadarakos avatar Aug 31 '22 13:08 kadarakos

I think it would make more sense to go by file ending here (maybe .spacy vs. not-.spacy) and take it from there?

adrianeboyd avatar Sep 05 '22 15:09 adrianeboyd

> I think it would make more sense to go by file ending here (maybe .spacy vs. not-.spacy) and take it from there?

OK, and then only the decode error remains in the try/except.

kadarakos avatar Sep 05 '22 15:09 kadarakos
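
One possible reading of the suggestion above, sketched with the standard library only: the input type is decided from the file ending, so the only remaining try/except guards the text decoding. The helper names `input_kind` and `read_plain_text`, the UTF-8 encoding, and the choice to re-raise as `ValueError` are assumptions for illustration, not what the PR ended up doing.

```python
from pathlib import Path
from typing import List


def input_kind(path: Path) -> str:
    """Classify an input by file ending: '.spacy' vs. everything else."""
    return "docbin" if path.suffix == ".spacy" else "text"


def read_plain_text(path: Path) -> List[str]:
    """Return one text per non-empty line; only the decoding step is guarded."""
    try:
        content = path.read_text(encoding="utf-8")
    except UnicodeDecodeError as err:
        # Assumption: report undecodable input as a ValueError with the file name.
        raise ValueError(f"{path} could not be decoded as UTF-8 text") from err
    return [line for line in content.splitlines() if line.strip()]
```

Branching on the suffix keeps the control flow flat: a file that is neither a DocBin nor decodable text fails in one place with a message that names the file.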