Ability to customize short-form/long-form detection of the S-H algorithm

Open gopalkalpande opened this issue 4 years ago • 2 comments

image

image

gopalkalpande avatar Jun 09 '20 05:06 gopalkalpande

That is the behavior of the Schwartz-Hearst algorithm itself. It matches the two d's at the end and the start of "Derived" and extracts that span.

If you expect more such cases, update the logic in select_definition so that a short-form character only matches a long-form character at the start of a word (see L247: https://github.com/philgooch/abbreviation-extraction/blob/2e334bbe474a4030c07860839c023775bb97c4ae/abbreviations/schwartz_hearst.py#L247).
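
To make the suggestion concrete, here is a rough standalone sketch of the idea. It is not the code in select_definition; the names match_long_form, is_word_start and word_start_only are made up for illustration, and the "data derived (DD)" input is a hypothetical case that mirrors the two d's in "Derived". It walks both strings right to left, as Schwartz-Hearst does, and optionally requires every short-form character to land on the start of a word.

```python
from typing import Optional


def is_word_start(text: str, index: int) -> bool:
    """True if text[index] is at position 0 or follows a non-alphanumeric char."""
    return index == 0 or not text[index - 1].isalnum()


def match_long_form(short_form: str, candidate: str,
                    word_start_only: bool = False) -> Optional[str]:
    """Hypothetical helper: right-to-left character matching.

    With word_start_only=False only the first short-form character must
    sit at the start of a word. With word_start_only=True every
    short-form character must.
    """
    s_index = len(short_form) - 1
    l_index = len(candidate) - 1
    while s_index >= 0:
        ch = short_form[s_index].lower()
        if not ch.isalnum():
            # Skip punctuation inside the short form.
            s_index -= 1
            continue
        must_start_word = word_start_only or s_index == 0
        # Move left until the character matches (and, if required,
        # is at a word start).
        while l_index >= 0 and (
            candidate[l_index].lower() != ch
            or (must_start_word and not is_word_start(candidate, l_index))
        ):
            l_index -= 1
        if l_index < 0:
            return None  # ran out of candidate text
        l_index -= 1
        s_index -= 1
    return candidate[l_index + 1:]


print(match_long_form("DD", "data derived"))
print(match_long_form("DD", "data derived", word_start_only=True))
```

With the default setting the last "D" matches the final d of "derived" and the first "D" matches its initial d, so only "derived" is returned. With word_start_only=True the result is "data derived". The stricter rule has a cost: abbreviations that legitimately use word-internal letters, such as "DNA" for "deoxyribonucleic acid", no longer match, so it is probably best kept as an opt-in setting.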

sidhantls avatar Apr 13 '21 07:04 sidhantls

Hi @gopalkalpande @sid-sundrani, thanks for this. This is an open-source project, so if you can submit a pull request with a fix and an accompanying unit test, that would be very welcome.

philgooch avatar Apr 13 '21 10:04 philgooch