Spacy NER text size limit: Segmented NER of longer text

Open opensemanticsearch opened this issue 5 years ago • 1 comments

spaCy's NER rejects texts longer than its default limit of one million characters (`nlp.max_length`).

If the extracted plain text is longer than that, it should be split into segments, with a separate spaCy NER call for each segment.
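A minimal sketch of what such segmentation could look like, not the code from the project. The helper names `segment_text` and `ner_segmented` are hypothetical, and `ner` stands for any callable that returns `(text, label, start_char, end_char)` tuples for a segment. Cutting at whitespace is an assumption made here so that words are not split; an entity that spans a cut would still be missed.

```python
from typing import Callable, Iterator, List, Tuple

# spaCy's default value of nlp.max_length (characters)
SPACY_DEFAULT_MAX_LENGTH = 1_000_000

Entity = Tuple[str, str, int, int]  # (text, label, start_char, end_char)


def segment_text(text: str, max_length: int = SPACY_DEFAULT_MAX_LENGTH) -> Iterator[Tuple[int, str]]:
    """Yield (offset, segment) pairs; each segment has at most max_length chars.

    When a segment has to be cut, the cut is placed right after the last
    space or newline in the window, if there is one, so words stay intact.
    """
    if max_length < 1:
        raise ValueError("max_length must be positive")
    start = 0
    while start < len(text):
        end = min(start + max_length, len(text))
        if end < len(text):
            # Prefer cutting after the last whitespace inside the window.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1
        yield start, text[start:end]
        start = end


def ner_segmented(text: str,
                  ner: Callable[[str], List[Entity]],
                  max_length: int = SPACY_DEFAULT_MAX_LENGTH) -> List[Entity]:
    """Run `ner` once per segment and shift offsets back to the full text."""
    entities: List[Entity] = []
    for offset, segment in segment_text(text, max_length):
        for ent_text, label, seg_start, seg_end in ner(segment):
            entities.append((ent_text, label, seg_start + offset, seg_end + offset))
    return entities
```

With a loaded spaCy pipeline `nlp`, the callable could be something like `lambda s: [(e.text, e.label_, e.start_char, e.end_char) for e in nlp(s).ents]`.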

opensemanticsearch, Mar 24 '19 15:03

Solved by https://github.com/opensemanticsearch/spacy-services/pull/3. Thanks!

Todo: Document the new environment variable SPACY_MAX_LENGTH in the config documentation.

Mandalka, Oct 09 '22 09:10