abbreviation-extraction
Ability to customize short-form/long-form detection of the S-H algorithm
That is the result of the Schwartz-Hearst algorithm itself. It detects the two d's, at the end and at the start of "Derived", and extracts that span as the long form.
If you expect more such cases, update the logic in select_definition so that a short-form character is matched against a long-form character only when that character is at the start of a word (see line 247: https://github.com/philgooch/abbreviation-extraction/blob/2e334bbe474a4030c07860839c023775bb97c4ae/abbreviations/schwartz_hearst.py#L247).
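To illustrate the word-start restriction, here is a minimal standalone sketch. It is not the code in select_definition, and the function name `matches_at_word_starts` is hypothetical. It assumes words are separated by whitespace or hyphens, and it accepts a candidate only if every short-form character matches, in order, the first letter of some word in the long form.

```python
import re


def matches_at_word_starts(short_form: str, long_form: str) -> bool:
    """Hypothetical helper: True if each alphanumeric character of
    short_form matches, in order, the first letter of a word in long_form."""
    # First character of every word (assumption: words split on whitespace or hyphens).
    initials = [w[0].lower() for w in re.split(r"[\s\-]+", long_form) if w]
    chars = [c.lower() for c in short_form if c.isalnum()]
    if not chars:
        return False
    # Ordered subsequence check: `c in remaining` consumes the iterator
    # up to and including the first match, so matches cannot be reused.
    remaining = iter(initials)
    return all(c in remaining for c in chars)


print(matches_at_word_starts("HMM", "hidden Markov model"))
print(matches_at_word_starts("DD", "Derived"))
```

A check like this could be applied as a filter on candidates before or after the existing matching. Note that it is stricter than the original algorithm, which deliberately allows matches inside words, so it would reject some valid abbreviations that take letters from the middle of a word.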
Hi @gopalkalpande @sid-sundrani, thanks for this. This is an open-source project, so if you can submit a pull request with a fix and an accompanying unit test, that would be very welcome.