pySBD
🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection library that works out of the box.
**Describe the bug** Instances of `--` often break the segmentation. When these are replaced, segmenting the same sentence works as expected. **To Reproduce** Breaking Examples: 1. ``` "Volumes within all...
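The report above says that replacing `--` makes segmentation work again. A minimal sketch of that workaround as a preprocessing step follows. The helper name `replace_double_hyphens` and the choice of a spaced single hyphen as the replacement are assumptions for illustration, not something the report specifies.

```python
import re

# Matches exactly two hyphens (not part of a longer run such as "---"),
# together with any whitespace around them.
_DOUBLE_HYPHEN = re.compile(r"\s*(?<!-)--(?!-)\s*")


def replace_double_hyphens(text: str, replacement: str = " - ") -> str:
    """Hypothetical helper: swap each "--" for `replacement` before segmenting.

    The default replacement is an assumption; the issue only states that
    replacing "--" avoids the broken segmentation.
    """
    return _DOUBLE_HYPHEN.sub(replacement, text)


if __name__ == "__main__":
    print(replace_double_hyphens("The results--as expected--held."))
```

The cleaned string can then be passed to the segmenter as usual. Note that this changes the text, so character offsets in the output will no longer line up with the original input.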
🌐 Added Support for the Bangla Language This pull request introduces support for the Bangla language within the project. Given the linguistic similarity between Bangla and Hindi, where both languages...
**Describe the bug** A text containing a particular combination of single quotes doesn't get segmented. **To Reproduce** Steps to reproduce the behavior: Input text - Come work for us in...
When dealing with a long statement of facts quoted from legal text, the text is not split up within left double quotation marks and right double quotation marks. This is different than...
**Describe the bug** Control characters like `\x1f` break German sentence segmentation at `format_numbered_list_with_periods` step. **To Reproduce** Steps to reproduce the behavior: Input text - `'1.\x1f\x1fApfel\x1d2.\x1f\x1fBanana'` Code: ``` import pysbd example_text...
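Until the underlying rule is fixed, one possible workaround for the control-character report above is to neutralise those characters before segmentation. The sketch below is an assumption about how a caller might do that; `strip_control_chars` is a hypothetical helper and not part of pySBD.

```python
import unicodedata

# Ordinary whitespace controls are kept; everything else in Unicode
# category "Cc" (e.g. \x1d, \x1f) is replaced.
_KEEP = {"\n", "\r", "\t"}


def strip_control_chars(text: str, replacement: str = " ") -> str:
    """Hypothetical helper: replace control characters with `replacement`.

    With the default single-space replacement the string length is
    unchanged, so character offsets still match the original input.
    """
    return "".join(
        replacement if unicodedata.category(ch) == "Cc" and ch not in _KEEP else ch
        for ch in text
    )


if __name__ == "__main__":
    raw = "1.\x1f\x1fApfel\x1d2.\x1f\x1fBanana"
    print(repr(strip_control_chars(raw)))
```

The cleaned text could then be handed to `pysbd.Segmenter(language="de", clean=False).segment(...)`. Whether this avoids the failure in every case has not been verified here.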
**Describe the bug** When an open parenthesis appears in certain situations in German text, it can cause a crash when running sentence splitting. **To Reproduce** from pysbd import Segmenter text...
**Describe the bug** A clear and concise description of what the bug is. **To Reproduce** input_str = """This is part 3 of MAMI-san's hair timelineThe previous hair timelines can be...
- Based on the Slovak language support, and because Slovak is very close to Czech, I have created initial support for Czech sentence splitting.
Bumps [nltk](https://github.com/nltk/nltk) from 3.5 to 3.9. Changelog Sourced from nltk's changelog. Version 3.9.1 2024-08-19 Fixed bug that prevented wordnet from loading Version 3.9 2024-08-18 Avoid need for pickled models, resolves...
German texts often use a pair of `„` and `“` to delineate quoted text. These cause issues, for example in the text below: `Nach einem kurzen Zögern näherte sie...