open-semantic-etl
Spacy NER text size limit: Segmented NER of longer text
The spaCy NER text size limit is one million characters.
If the extracted plain text is longer than that, it should be split into segments, with a separate spaCy NER call for each segment.
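A minimal sketch of this segmentation, not the code from the pull request. The helper names `segment_text` and `segmented_ner` are hypothetical, and it assumes `SPACY_MAX_LENGTH` holds the maximum number of characters per NER call. The `ner` argument is any callable that returns `(start, end, label)` tuples with character offsets relative to the segment it was given.

```python
import os
from typing import Callable, Iterator, List, Tuple

Entity = Tuple[int, int, str]  # (start_char, end_char, label)

# Assumption: SPACY_MAX_LENGTH is the per-call character limit;
# one million is spaCy's default for nlp.max_length.
MAX_LENGTH = int(os.environ.get("SPACY_MAX_LENGTH", "1000000"))


def segment_text(text: str, max_length: int = MAX_LENGTH) -> Iterator[Tuple[int, str]]:
    """Yield (offset, segment) pairs with each segment at most max_length chars."""
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    start = 0
    while start < len(text):
        end = min(start + max_length, len(text))
        if end < len(text):
            # Prefer cutting after the last whitespace in the window so that
            # words (and most entities) are not split across two segments.
            cut = max(text.rfind(ws, start, end) for ws in (" ", "\n", "\t"))
            if cut > start:
                end = cut + 1
        yield start, text[start:end]
        start = end


def segmented_ner(
    text: str,
    ner: Callable[[str], List[Entity]],
    max_length: int = MAX_LENGTH,
) -> List[Entity]:
    """Run ner on each segment and shift offsets back to the full text."""
    entities: List[Entity] = []
    for offset, segment in segment_text(text, max_length):
        for start, end, label in ner(segment):
            entities.append((start + offset, end + offset, label))
    return entities
```

With spaCy, `ner` could be something like `lambda s: [(e.start_char, e.end_char, e.label_) for e in nlp(s).ents]`, where `nlp` is a loaded pipeline. Cutting at whitespace is a simplification: an entity that spans a segment boundary can still be missed, so cutting at sentence or paragraph boundaries may work better for some texts.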
Solved by https://github.com/opensemanticsearch/spacy-services/pull/3. Thanks!
Todo: Document the new environment variable SPACY_MAX_LENGTH in the config documentation.