simplemma

simplemma.lang_detector import no longer working

Open osma opened this issue 2 years ago • 3 comments

I noticed that the language detection example in the README no longer works with the current main version.

Using simplemma==0.9.1 it works as advertised (although the returned ratios are a bit different from those in the README):

>>> from simplemma import in_target_language, lang_detector
>>> lang_detector('"Exoplaneta, též extrasolární planeta, je planeta obíhající kolem jiné hvězdy než kolem Slunce."', lang=("cs", "sk"))
[('cs', 0.8), ('unk', 0.19999999999999996), ('sk', 0.1)]

But with the current main version, importing lang_detector from simplemma fails:

>>> from simplemma import in_target_language, lang_detector
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'lang_detector' from 'simplemma'

It works if I change lang_detector to langdetect:

>>> from simplemma import in_target_language, langdetect
>>> langdetect('"Exoplaneta, též extrasolární planeta, je planeta obíhající kolem jiné hvězdy než kolem Slunce."', lang=("cs", "sk"))
[('cs', 0.75), ('sk', 0.125), ('unk', 0.25)]
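
For code that has to run against both 0.9.1 and main in the meantime, a small fallback lookup is one option. This is only a sketch: import_first is a hypothetical helper name, not part of simplemma, and it assumes that one of the two names is exposed at the package top level.

```python
import importlib


def import_first(module_name, candidates):
    """Return the first attribute in `candidates` that `module_name` exposes."""
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {candidates!r} found in {module_name!r}")


# Usage (requires simplemma to be installed):
# detect = import_first("simplemma", ["lang_detector", "langdetect"])
```

The older name is tried first here, but the order is arbitrary since only one of the two exists in any given version.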

Should the documentation be fixed to match the current naming in the code, or should the function langdetect be renamed back to lang_detector so that the API remains stable?

osma avatar Aug 11 '23 07:08 osma

I guess we can use an alias and import it during init. This was a question we discussed with @juanjoDiaz, but something must have got lost along the way.

adbar avatar Aug 11 '23 11:08 adbar

This has already been mentioned in #64.

  • The 0.9.1 readme says: from simplemma.langdetect import in_target_language, lang_detector
  • The current readme says: from simplemma import in_target_language, lang_detector

So we could go for an alias lang_detector = langdetect in the init file or we could simply rename the function.
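
A rough sketch of what the alias option could look like if it also warned callers about the old name. The langdetect body below is a stand-in that returns a dummy value, not the real implementation, and the deprecation warning is an assumption on top of the plain alias discussed here:

```python
import warnings


def langdetect(text, lang=None):
    """Stand-in for the real detector; returns a fixed dummy result."""
    return [("unk", 1.0)]


def lang_detector(*args, **kwargs):
    """Backward-compatible alias that forwards to langdetect()."""
    warnings.warn(
        "lang_detector() is deprecated, use langdetect() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return langdetect(*args, **kwargs)
```

The one-line form lang_detector = langdetect is simpler and behaves the same apart from the warning; the wrapper only makes sense if the old name is meant to disappear eventually.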

In any case, the release notes have to mention that the import strategy is slightly different.

adbar avatar Aug 11 '23 11:08 adbar

See also https://github.com/adbar/simplemma/commit/58b3ee7430568738f306a5386085eda6628c47d4: from simplemma.langdetect → from simplemma.language_detector

adbar avatar Aug 11 '23 11:08 adbar