DMetaphone has issues with long words

Open jaraco opened this issue 12 years ago • 2 comments

Originally reported by: Brian (Bitbucket: eode, GitHub: eode)


#!python

>>> import fuzzy
>>> fdm = fuzzy.DMetaphone()
>>> fdm10 = fuzzy.DMetaphone(10)

# Note: the 's' phoneme of 'decent' is also lost here ('TKNT', not 'TSNT').
>>> fdm('decent')
['TKNT', None]

>>> fdm('decentralization')
['TKNT', None]

# Same result with a length of 10:
>>> fdm10('decentralization')
['TKNT', None]


# For comparison:
>>> import metaphone
>>> mdm = metaphone.dm

>>> mdm('decent')
('TSNT', '')

>>> mdm('decentralization')
('TSNTRLSXN', '')

Expected behavior:

  • Produce phonemes for the whole word, or for the word up to the specified length.

  • Bitbucket: https://bitbucket.org/yougov/fuzzy/issue/5
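The expected behavior above could be turned into a small check. What follows is only a sketch: `looks_truncated` and the two lookup tables are hypothetical names introduced here, and the tables simply replay the primary codes reported in this issue so the snippet runs without either library installed. The check is a heuristic, since a longer word can legitimately share a code with its prefix when the extra letters add no phonemes.

```python
from typing import Callable, Optional

# Primary codes exactly as reported in this issue. These tables stand in
# for the real encoders (assumption: only the primary code is compared).
REPORTED_FUZZY = {"decent": "TKNT", "decentralization": "TKNT"}
REPORTED_METAPHONE = {"decent": "TSNT", "decentralization": "TSNTRLSXN"}


def looks_truncated(
    encode: Callable[[str], str],
    prefix: str,
    word: str,
    max_len: Optional[int] = None,
) -> bool:
    """Heuristic: True when `word` gets the same code as its shorter
    `prefix` although the length limit leaves room for more phonemes.

    `max_len=None` means the whole word is expected to be encoded.
    """
    if not word.startswith(prefix) or len(word) <= len(prefix):
        raise ValueError("prefix must be a proper prefix of word")
    short_code = encode(prefix)
    long_code = encode(word)
    if long_code != short_code:
        return False
    # Identical codes are only unsurprising once the limit is reached.
    return max_len is None or len(long_code) < max_len


if __name__ == "__main__":
    pair = ("decent", "decentralization")
    print(looks_truncated(REPORTED_FUZZY.get, *pair, max_len=10))
    print(looks_truncated(REPORTED_METAPHONE.get, *pair))
```

To run the same check against the real libraries, one would pass a callable that returns the primary code, for example `lambda w: fdm10(w)[0]` with `fdm10` as defined in the report; that wiring is untested here.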

jaraco • Dec 28 '12 04:12

Original comment by Sam Ockman (Bitbucket: NewStart, GitHub: NewStart):


Yes, this has bitten me too...

Here's an example I ran across:

For 'carbohydrate', fuzzy gives KRPH, as opposed to KRPHTRT from the original library.

It would be great to get this fixed.

Thanks!

jaraco • May 13 '13 21:05

Original comment by Brian (Bitbucket: eode, GitHub: eode):


..edited for formatting, and added expected behavior

jaraco • Dec 28 '12 05:12