py-googletrans
Translation Inaccuracy on Some Text vs Google Translate Results
Given the following code
>>> import googletrans
>>> translator = googletrans.Translator()
>>>
>>> body = """Hallo,
... leider habe ich gestern vergessen das Pordukt urückzugeben.
... Ich würde mich freuen wenn sie mir noch einen Tag Kulanz einräumen und ich das Gerät auf eigene Kosten zurückschicken kann da ich es nicht gebrauchen kann.
... Es liegt ungeöffnet hier bei mir.
... beste Grüße"""
>>>
>>> print(translator.translate(body, dest="en").text)
Output
Hi there, Unfortunately, I forgot urückzugeben the Pordukt yesterday. I would be happy if they still give me a day goodwill and I can return the equipment at his own expense since I can not use it. It is unopened here with me. best regards
Expected (copied from translate.google.com)
Hi there, Unfortunately yesterday I forgot to return the product. I would be happy if you give me another day of goodwill and I can send the device back at my own expense because I can not use it. It's unopened here with me. best regards
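To make the comparison between the two outputs easier to read, a word-level diff can help. This is only a minimal sketch using the standard library's difflib; the helper name `word_diff` is made up for illustration and is not part of googletrans.

```python
import difflib


def word_diff(got, expected):
    """Return (only_in_got, only_in_expected) as lists of word runs, in order."""
    a, b = got.split(), expected.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" contributes to both sides; "delete"/"insert" to one side only.
        if tag in ("replace", "delete"):
            removed.append(" ".join(a[i1:i2]))
        if tag in ("replace", "insert"):
            added.append(" ".join(b[j1:j2]))
    return removed, added


if __name__ == "__main__":
    # Sentences taken from the two outputs above.
    got = "Unfortunately, I forgot urückzugeben the Pordukt yesterday."
    expected = "Unfortunately yesterday I forgot to return the product."
    only_got, only_expected = word_diff(got, expected)
    print("only in googletrans output:", only_got)
    print("only in website output:   ", only_expected)
```

Splitting on whitespace keeps punctuation attached to words, so "Unfortunately," and "Unfortunately" count as different tokens; that is good enough for eyeballing differences in a bug report.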
Interestingly, if I translate this line-by-line, the output is better, but does not completely mirror translate.google.com results:
>>> translation = translator.translate(body.split("\n"), dest="en")
>>> print("\n".join(t.text for t in translation))
Hi there, Unfortunately, I forgot to return the product yesterday. I would be happy if they still give me a day goodwill and I can return the equipment at his own expense since I can not use it. It is unopened here with me. best regards
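The line-by-line workaround can be wrapped in a small helper. The sketch below assumes a translator object that behaves like the one used above: a synchronous `translate(list_of_strings, dest=...)` call returning objects with a `.text` attribute. The function name `translate_by_line` and the blank-line handling are my own additions, not library behavior.

```python
def translate_by_line(translator, text, dest="en"):
    """Translate `text` one line at a time and rejoin the results.

    `translator` is assumed to expose translate(list_of_str, dest=...) and to
    return one result object with a `.text` attribute per input string.
    """
    lines = text.split("\n")
    # Send only non-blank lines; blank lines are kept as-is to preserve layout.
    payload = [line for line in lines if line.strip()]
    results = iter(translator.translate(payload, dest=dest)) if payload else iter(())
    return "\n".join(
        next(results).text if line.strip() else line for line in lines
    )


# Possible usage, assuming googletrans is installed and network access works:
#   import googletrans
#   print(translate_by_line(googletrans.Translator(), body, dest="en"))
```

Sending all lines in one list keeps this to a single `translate` call, as in the snippet above. Whether per-line translation actually gives better results depends on the service and may change over time.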
I faced the same issue. Why is the translation so bad?
I also found that txt.lower().split("\n") works better.
Same for:
from googletrans import Translator
translator = Translator()
text = translator.translate('կոշկեղեն', dest='ru')
print(text.text)
translator = Translator()
text = translator.translate('կոշկեղենի վաճառք', dest='ru')
print(text.text)
Result:
обувь
обувь
Google Translate in the browser:
обувь
распродажа обуви
I faced the same issue when translating English to Chinese. It always returns a bad translation. If I enter one sentence at https://translate.google.cn/, it returns a good translation, but I can see the bad translation after I click the translated sentence. This means the bad translation is also provided by Google. By the way, Chrome's full-page translation returns the same good result as the Google Translate website.
I've personally moved to paid translation. While this module was nice for testing and setup, using it violates Google's TOS, so it can't reasonably be used for any professional application. That, plus the inaccuracies described here (and other things), means this isn't usable for me.
There is indeed an error in the translation. For example, the Dutch sentence 'wonen in een huis waar je energie van krijgt' is translated incorrectly. I found that this happens because the library passes a parameter that is probably the model version: https://github.com/ssut/py-googletrans/pull/167
NL: 'wonen in een huis waar je energie van krijgt'
EN (official api): live in a house that gives you energy
EN (googletrans): live in a house where you energy
Changing a single parameter in the translation request, 'otf', to version 3 fixes the translation inaccuracies.
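To illustrate what changing a single request parameter looks like, here is a generic sketch that overrides one query parameter in a URL using only the standard library. The helper name `with_param`, the example URL, and the starting value of 'otf' are placeholders I made up; this is not the library's actual request, and only the parameter name 'otf' and the target value 3 come from the comment above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_param(url, name, value):
    """Return `url` with query parameter `name` set to `value`.

    An existing occurrence is replaced in place (duplicates are dropped);
    if the parameter is absent, it is appended.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    out, replaced = [], False
    for key, val in pairs:
        if key == name:
            if not replaced:
                out.append((key, str(value)))
                replaced = True
        else:
            out.append((key, val))
    if not replaced:
        out.append((name, str(value)))
    return urlunsplit(parts._replace(query=urlencode(out)))


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    print(with_param("https://example.invalid/translate?otf=1&q=hi", "otf", 3))
```

In the library itself the change would be made wherever the request parameters are built; see the pull request linked above for the actual patch.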
Similarly for Persian:
translator = Translator()
translation = translator.translate('does ethanol take more energy make that produces?', dest='fa', src='en')
print(translation.text)
which produces:
کند اتانول را به تولید انرژی بیشتر است که تولید؟
While Google translate produces a better translation:
Has this been fixed on Google's end? It seems like googletrans is producing the desired results without modification.