
Make generation of images possible with prompts in languages other than English

Open ritog opened this issue 3 years ago • 4 comments

Currently, users can only generate images from prompts in English: our BART autoregressive model is trained on English, and the whole model intrinsically depends on English text as input to generate images.

This cannot be readily changed, and changing it may not even be desirable: it would require a completely different approach, and the resulting product would be very different from what it is now.

However, from the end user's point of view this can be fixed quickly, without making any changes to the model, by using a free translation API like googletrans.

This would let the user enter the prompt in any language: we call the API to translate the text into English, then pass the English text into the model for image generation.
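
A minimal sketch of that flow, assuming the translator and the image generator are both passed in as plain callables. The names `generate_from_any_language`, `fake_translate`, and `fake_generate` are hypothetical and not part of dalle-mini; the stand-ins only exist so the example runs offline. In practice `translate_to_english` would wrap whichever translation service is chosen.

```python
from typing import Callable, List


def generate_from_any_language(
    prompt: str,
    translate_to_english: Callable[[str], str],
    generate_images: Callable[[str], List[str]],
) -> List[str]:
    """Translate the prompt to English, then hand it to the unchanged model."""
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    english_prompt = translate_to_english(prompt)
    return generate_images(english_prompt)


# Hypothetical stand-in for a translation service (assumption, offline only).
def fake_translate(text: str) -> str:
    lookup = {"un gato con sombrero": "a cat wearing a hat"}
    # Text not in the lookup is returned unchanged.
    return lookup.get(text.lower(), text)


# Hypothetical stand-in for the image model (assumption, offline only).
def fake_generate(english_prompt: str) -> List[str]:
    return [f"<image for: {english_prompt}>"]


images = generate_from_any_language(
    "Un gato con sombrero", fake_translate, fake_generate
)
```

Keeping the translator behind a callable means the translation backend can be swapped later without touching the generation code.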

This makes the model more accessible to users all over the world, without making any intrinsic change to the model itself.

ritog avatar Jul 23 '21 05:07 ritog

The maintainer of googletrans mentions here that the API might stop working:

You may wonder why this library works properly, whereas other approaches such like goslate won’t work since Google has updated its translation service recently with a ticket mechanism to prevent a lot of crawler programs.

I eventually figure out a way to generate a ticket by reverse engineering on the obfuscated and minified code used by Google to generate such token, and implemented on the top of Python. However, this could be blocked at any time.

So feel free to suggest an alternative, and let's keep searching.

ritog avatar Jul 23 '21 05:07 ritog

We could use either Hugging Face models or the official Google Translate API for translating the prompt.

borisdayma avatar Jul 23 '21 18:07 borisdayma

Here are prices for Google Cloud Translate API: https://cloud.google.com/translate/pricing

It says that we have to pay after 500,000 free characters a month. Would this be enough, or should we use googletrans as mentioned in the first comment?
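
A rough way to answer that is to divide the free allowance by a typical prompt length. The sketch below does only that arithmetic; the 50-character average and the 30-day month are assumptions for illustration, not measured values, and the helper names are hypothetical.

```python
FREE_CHARS_PER_MONTH = 500_000  # free tier figure quoted in this comment


def prompts_covered(avg_prompt_chars: int,
                    free_chars: int = FREE_CHARS_PER_MONTH) -> int:
    """Number of whole prompts that fit in the monthly free allowance."""
    if avg_prompt_chars <= 0:
        raise ValueError("avg_prompt_chars must be positive")
    return free_chars // avg_prompt_chars


def prompts_per_day(avg_prompt_chars: int, days_in_month: int = 30) -> int:
    """Monthly allowance spread evenly over an assumed 30-day month."""
    return prompts_covered(avg_prompt_chars) // days_in_month


monthly = prompts_covered(50)  # assumed average of 50 characters per prompt
daily = prompts_per_day(50)
```

With a 50-character average this comes to 10,000 prompts a month, or about 333 a day, so whether the free tier is enough depends on how much traffic the demo gets.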

ritog avatar Jul 31 '21 04:07 ritog

See also this effort: https://huggingface.slack.com/archives/C025LJDP962/p1627724095070700

pcuenca avatar Jul 31 '21 11:07 pcuenca