google-translate-api

Nonsense generated for haw (wrong translation)

Open avisitor opened this issue 4 years ago • 23 comments

const translate = require('@vitalets/google-translate-api');

var text = 'Ia hala ana mai o ka waa o Pele, ia wa i hoouna mai ai o Kahinalii ka makuahine, i ke kai hoee nui a ka launa ole, a lewa ana ka waa o Honuaiakea iluna o ka halehale hanupanupa kuhoho a kawehaweha o ke kai. Ua huahuai ae la na mapuna o ke kai ma lalo ae o ka papaku o ka moana, hakikili ka ua mai ka lani mai. Olaolapa ka uwela i ka lewa uli, nakolokolo ikuwa ka leo papaaina o ka hekili, huikau ka lewa nuu, ka lewa lalo. Auwe! He ino!!';

translate(text, {from: 'haw', to: 'en'}).then(res => {
    console.log(res.text);
    //console.log(res.from.language.iso);
}).catch(err => {
    console.error(err);
});


$ node translate.js
The youngest birdmen came to whom there, the whole drivenaka had ate fish, and in the land of the sea Shell.All the board of the sea of the sea became the bottom of the sea floor, the feature of the sky.The air, Do otherwise social, Judge Mount Hords, communions to the air Nuunu.Wow!Is a bad !!


When pasted into translate.google.com:

When Pele's canoe passed, the mother sent Kahinalii to paddle a great and incompatible paddle, and the canoe from Honuaiakea flew over the mysterious building and the depths of the sea. The waves of the sea poured forth beneath the sea floor, and the rain from heaven thundered. The lightning flashed in the dark sky, the table sound of thunder roared, the sky above and the sky below were confused. Alas! It's bad !!

avisitor avatar Mar 06 '21 20:03 avisitor

I have been encountering the same issue. I think the v5 version has a worse quality in comparison to v4. Perhaps it's due to the different APIs used.

see also #71

edit: I tried downgrading the SDK to v4, but all responses now come back as BAD_REQUEST (#64), so I cannot use it any more. The end of an era 😢

songkeys avatar Mar 11 '21 08:03 songkeys

The random nonsense may be because the server doesn't receive a required header called X-Goog-BatchExecute-Bgr, so our request is treated as coming from a robot or as otherwise illegitimate. I haven't been able to work out the exact algorithm yet; it looks very complicated from the code. It would be great if anyone could help make the encryption logic of this header clear.

plainheart avatar Mar 13 '21 07:03 plainheart

@plainheart Nice investigation! I can confirm this. With X-Goog-BatchExecute-Bgr, the result will be the same as the Web browser's one.

songkeys avatar Mar 14 '21 08:03 songkeys

I forked this repo and added two new endpoints that can work better than the current website endpoint. But I still hope the algorithm of the header could be figured out.

plainheart avatar Mar 15 '21 01:03 plainheart

I generated an X-Goog-BatchExecute-Bgr yesterday. After 24+ hours, today I found that I can still use it to get the accurate result. I'm not sure how long this ("token"?) will remain valid but I'll keep checking this.

It seems that we can at least use puppeteer + cache method for it if it's hard to extract the algorithm.

songkeys avatar Mar 15 '21 09:03 songkeys

I once dug into the source code. The header value may be derived from the query string, so in theory it should become invalid as soon as the query string changes. Does it still work for you with text different from the previous one?

plainheart avatar Mar 15 '21 10:03 plainheart

Yes.. You were right. It's generated from the query string. It won't work if I changed my text. Yes... So we have to extract the algorithm then.

songkeys avatar Mar 15 '21 10:03 songkeys
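
Since the header appears to be tied to the query string, a "puppeteer + cache" approach would have to cache one header value per text rather than one value globally. Below is a minimal sketch of that idea. `createHeaderCache` and `generateHeader` are hypothetical names: `generateHeader` stands in for whatever produces the header (for example, a headless-browser session) and is not part of any library. No expiry is implemented, because the thread only establishes that a value stayed valid for more than 24 hours.

```javascript
// Sketch only: `generateHeader` is a hypothetical async function supplied by
// the caller (e.g. one that drives a headless browser). It is an assumption,
// not an API of google-translate-api or Puppeteer.
function createHeaderCache(generateHeader) {
  // Maps query text -> Promise resolving to the header value for that text.
  const cache = new Map();

  return function getHeader(queryText) {
    if (!cache.has(queryText)) {
      // Cache the promise itself so concurrent calls for the same text
      // trigger only one generation.
      const pending = Promise.resolve()
        .then(() => generateHeader(queryText))
        .catch((err) => {
          cache.delete(queryText); // do not keep failed attempts
          throw err;
        });
      cache.set(queryText, pending);
    }
    return cache.get(queryText);
  };
}
```

Repeated translations of the same text would then reuse the stored value, while any new text still pays the full generation cost.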

I generated an X-Goog-BatchExecute-Bgr yesterday. After 24+ hours, today I found that I can still use it to get the accurate result. I'm not sure how long this ("token"?) will remain valid but I'll keep checking this.

It seems that we can at least use puppeteer + cache method for it if it's hard to extract the algorithm.

How did you generate the X-Goog-BatchExecute-Bgr? Thank you!

xsxiong avatar Mar 24 '21 12:03 xsxiong

Thanks for the research! Does the problem occur only on haw? I've tested ru -> en: translation by lib differs from google translate website, but still very close by sense.

vitalets avatar Apr 14 '21 11:04 vitalets

@vitalets Hi, thanks for your reply.

Does the problem occur only on haw?

No. I'm not sure whether other languages have the same issue, but I can confirm it exists in Chinese (zh).

translation by lib differs from google translate website, but still very close by sense.

Please refer to #71. Though the translation result is generally correct, the sentences contain many unexpected mixed upper-case and lower-case letters, which makes them harder to read. Besides, although the translation looks close in meaning, it has many small grammar issues compared to the website.

Everything would be okay if a valid value for the header X-Goog-BatchExecute-Bgr could be provided. However, it seems to be hard to figure out how to calculate.

plainheart avatar Apr 15 '21 00:04 plainheart

Everything would be okay if a valid value for the header X-Goog-BatchExecute-Bgr could be provided. However, it seems to be hard to figure out how to calculate.

Yeah :( Google protects the batch API from such access.

vitalets avatar Apr 23 '21 09:04 vitalets

@vitalets I currently use your google-translate-api Node.js library, but many users have been complaining that the translation is not accurate and does not match the results from the Google Translate website. So I have been looking into different ways of getting better translations. I found that if you use the URL translate.google.com/translate_a/t or translate.google.com/translate_a/single along with a client like dict-chrome-ex, the results match the ones from the Google Translate website. Since we have not been able to figure out how to generate the X-Goog-BatchExecute-Bgr header, I would suggest looking into the translate_a URL.

Here is an example of the translate.googleapis.com/translate_a/single/ URL.

I also found that the google-translate-open-api library uses the translate.google.com/translate_a/t URL, so I was able to use it this way to replace google-translate-api on my server for more accurate translations:

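const translateOpenApi = require('google-translate-open-api');

translateOpenApi.default(`some text`, {
    client: "dict-chrome-ex",
    to: 'en'
  }).then(result => {
    // Join the translated sentence fragments into a single string.
    const translatedText = result.data.sentences.map(s => s.trans).join('');
    // Mimic the response shape of google-translate-api.
    const formattedResult = {
      text: translatedText,
      from: {
        language: { iso: result.data.src }
      }
    };
    console.log(formattedResult);
  }).catch(err => {
    // Report network errors or unexpected response shapes.
    console.error(err);
  });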

Hope this helps and gets integrated into google-translate-api.

f0enix avatar Apr 26 '21 12:04 f0enix
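
For readers who want to see what a request to the translate_a/single route might look like, here is a rough sketch that only builds the URL. The host and path come from the comment above, as does the dict-chrome-ex client. The query parameter names `sl`, `tl`, `dt` and `q` are assumptions based on how this undocumented endpoint is commonly called, and `buildTranslateUrl` is a hypothetical helper, not part of any library. Google may change or disable this route at any time.

```javascript
// Sketch only: parameter names (sl, tl, dt, q) are assumptions about an
// undocumented endpoint; buildTranslateUrl is a hypothetical helper.
function buildTranslateUrl(text, { from = 'auto', to = 'en', client = 'dict-chrome-ex' } = {}) {
  const url = new URL('https://translate.googleapis.com/translate_a/single');
  // URLSearchParams takes care of encoding the text safely.
  url.search = new URLSearchParams({
    client,   // client identifier mentioned in the thread
    sl: from, // source language (assumed parameter name)
    tl: to,   // target language (assumed parameter name)
    dt: 't',  // request translated text (assumed parameter name)
    q: text,  // text to translate (assumed parameter name)
  }).toString();
  return url.toString();
}

console.log(buildTranslateUrl('Auwe! He ino!!', { from: 'haw', to: 'en' }));
```

The resulting string can be passed to any HTTP client; the response format is not covered here because it is not described in the thread beyond the `sentences` and `src` fields used above.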

That is true for the moment, however I suspect Google will soon turn that off in favor of their new RPC method

ArtanisTheOne avatar May 03 '21 19:05 ArtanisTheOne

For anyone who wants an accurate translation: I have an alternative solution that uses Puppeteer to scrape the result directly: https://github.com/Songkeys/Translateer. It needs more resources (due to Puppeteer, you know) and is perhaps slower (~~around 1~5s for each response~~ edit: after an upgrade, should be within 500ms), but it's accurate.

songkeys avatar Nov 21 '21 12:11 songkeys

For anyone who wants an accurate translation: I have an alternative solution using the Puppeteer to scrape result directly: https://github.com/Songkeys/Translateer. It needs more resources (due to the Puppeteer, you know) and perhaps slower (around 1~5s for each response) but it's accurate.

Good approach. Will add to readme.

vitalets avatar Feb 26 '22 10:02 vitalets

Anyone looking for an accurate translation can use that code. It uses another Google Translate route and produces the same translation result as the website.

allohamora avatar Mar 15 '22 15:03 allohamora

The google-translate-open-api snippet posted by @f0enix above (client "dict-chrome-ex", reading result.data.sentences) now fails with "TypeError: Cannot read properties of undefined (reading 'map')". It doesn't work anymore.

poowu avatar Mar 24 '22 03:03 poowu
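
The TypeError above is what happens when the response no longer contains a `sentences` array and the code calls `.map` on `undefined`. A small guard, sketched below, does not restore the route but turns the crash into an explicit error. `toLegacyResult` is a hypothetical helper name; the `sentences`, `trans` and `src` field names are taken from the snippet earlier in this thread.

```javascript
// Sketch only: toLegacyResult is a hypothetical helper. It converts a
// response body with `sentences` and `src` fields into the older
// google-translate-api result shape, and fails loudly otherwise.
function toLegacyResult(data) {
  if (!data || !Array.isArray(data.sentences)) {
    // The endpoint returned something else (blocked, changed, or empty).
    throw new Error('Unexpected response shape: missing "sentences" array');
  }
  return {
    // Entries without a `trans` field contribute nothing to the text.
    text: data.sentences.map((s) => s.trans || '').join(''),
    from: { language: { iso: data.src } },
  };
}
```

It would be called as `toLegacyResult(result.data)` inside the `.then` handler, with the thrown error ending up in `.catch`.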

@vitalets

Is there any active solution on the incorrect translations? Or any other way of integrating translations with Google?

I am very curious whether someone has a solution to the problem.

kevinvugts avatar May 18 '22 15:05 kevinvugts

any update? thanks

saviourdog avatar Sep 19 '22 14:09 saviourdog

Yes.. You were right. It's generated from the query string. It won't work if I changed my text. Yes... So we have to extract the algorithm then.

I've found that it's related to your client IP, however. From my real, unproxied IP, autocorrect fails in some cases (without X-Goog-BatchExecute-Bgr sent), but when testing through a VPN (or in GitHub Actions) the header is not required. It could be a check that is only applied when an IP makes a high number of requests (or it could be related to the provider/classification of the IP).

AidanWelch avatar Oct 14 '22 22:10 AidanWelch

Certain networks require the X-Goog-BatchExecute-Bgr header to be sent on requests, or autocorrect will not be applied to some translations (seemingly typos where a letter is dropped, such as "I spea Dutch!" instead of "I speak Dutch!").

I believe the code for generating this header is found in this static script, in xH.prototype.s().

This would likely take a while to fix.

From google-translate-api-x#18

AidanWelch avatar Oct 14 '22 22:10 AidanWelch

Hey everyone! Following the advice of @allohamora, I've fully rewritten the library to use another Google Translate route (see https://github.com/vitalets/google-translate-api/issues/70#issuecomment-1068138219). Now the translation exactly matches the result from the Google Translate website.

The new version is available on npm as the next release:

npm install @vitalets/google-translate-api@next

Original text of this issue is translated correctly:

import { translate } from '@vitalets/google-translate-api';

const { text } = await translate('Ia hala ana mai o ka waa o Pele, ia wa i hoouna mai ai o Kahinalii ka makuahine, i ke kai hoee nui a ka launa ole, a lewa ana ka waa o Honuaiakea iluna o ka halehale hanupanupa kuhoho a kawehaweha o ke kai. Ua huahuai ae la na mapuna o ke kai ma lalo ae o ka papaku o ka moana, hakikili ka ua mai ka lani mai. Olaolapa ka uwela i ka lewa uli, nakolokolo ikuwa ka leo papaaina o ka hekili, huikau ka lewa nuu, ka lewa lalo. Auwe! He ino!!');
console.log(text);

Output:

When Pele's boat passed, Kahinalii, the mother, sent a great storm of waves, and Honuaiakea's boat hovered over the hanupanupa house, which was deep and divided by the sea. The springs of the sea broke out below the sea floor, rain fell from the sky. The lightning is bright in the green sky, the sound of the thunder is echoing, the green sky is confused, the lower air is confused. Alas! It's bad!!

Output from website: image

Also, I've removed all outdated and vulnerable dependencies, added support for react-native, and rewritten the library in TypeScript.

I would appreciate it if you could install and check this beta version in your scenarios and share your feedback here. If everything is OK, I'm ready to release it as the main version. Please note that the shape of the new version is incompatible with the previous one. Thanks in advance!

@avisitor @Songkeys @plainheart @poowu @kevinvugts @saviourdog @AidanWelch

vitalets avatar Oct 18 '22 19:10 vitalets