pySBD
Fix spacy component example (issue #96)
Update the example to work with the latest spacy as installed by `pip install spacy`
(version 3.0.6), and fix the failure to segment sentences caused by `doc.char_span`
returning `None`. Fixes #96.
Add pipe to spacy model
Use the way spacy currently recommends to add the pipe to a model.
Fix sentences not split due to extra chars
The `doc.char_span` uses `alignment_mode="strict"` by default, which returns `None` when `sent_char_spans` contains trailing spaces, for example. Change the `alignment_mode` to `"contract"` so that it returns correct spans.
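The difference between the two alignment modes can be reproduced without pySBD. This is a small sketch that assumes only `spacy` 3.x and uses a made-up sentence with hand-computed offsets in place of real segmenter output.

```python
import spacy

nlp = spacy.blank("en")
doc = nlp("Hello world. Next one.")

# Character range covering "Hello world. " INCLUDING the trailing space,
# which mimics a sentence span whose end falls between two tokens.
end = len("Hello world. ")

# Default alignment_mode="strict": both offsets must sit exactly on token
# boundaries, otherwise char_span returns None.
strict_span = doc.char_span(0, end)

# "contract" shrinks the range to the tokens that lie completely inside it.
contract_span = doc.char_span(0, end, alignment_mode="contract")
```

Here `strict_span` is `None` because the end offset lands on the space after the period, while `contract_span` covers the tokens of the first sentence.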
Why is this not merged yet?
@nipunsadvilkar