dalle-mini
Make image generation possible with prompts in languages other than English
Currently, users can only generate images with prompts in English, because our autoregressive BART model is trained on English text and the whole model intrinsically depends on English input to generate images.
This cannot be readily changed, and changing it may not even be desirable: it would require a completely different approach, and the resulting product would be very different from what it is now.
However, from the end user's point of view this can be fixed quickly, without any changes to the model, by using a free translation API like googletrans.
This would let the user enter the prompt in any language: we would call the API to translate the text into English and pass the English text to the model for image generation.
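A minimal sketch of that translate-then-generate step, in case it helps the discussion. The names `to_english` and `googletrans_backend` are hypothetical and not part of dalle-mini. The backend assumes googletrans 3.x, where `Translator().translate(text, dest="en")` is a synchronous call returning an object with a `.text` attribute; newer releases may behave differently. If the translation call fails, the prompt is passed through unchanged, which is one possible policy rather than a settled decision.

```python
from typing import Callable


def to_english(prompt: str, translate: Callable[[str], str]) -> str:
    """Return an English version of `prompt`.

    `translate` is any callable mapping text to English text, so the
    backend (googletrans, a Hugging Face model, a paid API) can be swapped.
    Falls back to the original prompt if the backend fails or returns nothing.
    """
    text = prompt.strip()
    if not text:
        return text
    try:
        translated = translate(text)
    except Exception:
        # The backend may be blocked or unreachable; keep the original prompt.
        return text
    return (translated or "").strip() or text


def googletrans_backend(text: str) -> str:
    """Hypothetical backend, assuming the synchronous googletrans 3.x API."""
    from googletrans import Translator  # imported lazily so it stays optional

    return Translator().translate(text, dest="en").text
```

Usage would look like `english_prompt = to_english(user_prompt, googletrans_backend)`, with `english_prompt` then passed to the existing generation code unchanged.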
This makes the model more accessible to users all over the world, without making any intrinsic change to the model itself.
The maintainer of googletrans mentions here that the API might stop working:
You may wonder why this library works properly, whereas other approaches such like goslate won’t work since Google has updated its translation service recently with a ticket mechanism to prevent a lot of crawler programs.
I eventually figure out a way to generate a ticket by reverse engineering on the obfuscated and minified code used by Google to generate such token, and implemented on the top of Python. However, this could be blocked at any time.
So feel free to suggest a new one, and keep the search on.
We could use either Hugging Face models or the official Google Translate API to translate the prompt.
Here are the prices for the Google Cloud Translate API: https://cloud.google.com/translate/pricing
It says that we have to pay after 500,000 free characters a month. Would this be enough, or should we use googletrans as mentioned in the first comment?
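To get a rough feel for whether 500,000 characters is enough, here is a back-of-the-envelope sketch. The average prompt length of 50 characters is a made-up figure for illustration, not a measurement, and it assumes billed characters roughly match `len()` of the prompt; the pricing page is the authority on how characters are actually counted.

```python
FREE_CHARS_PER_MONTH = 500_000  # free tier figure quoted above


def free_prompts_per_month(avg_prompt_chars: int) -> int:
    """Number of prompts that fit in the free tier at a given average length."""
    if avg_prompt_chars <= 0:
        raise ValueError("avg_prompt_chars must be positive")
    return FREE_CHARS_PER_MONTH // avg_prompt_chars


# Hypothetical average of 50 characters per prompt.
print(free_prompts_per_month(50))  # prints 10000
```

Under that assumption the free tier covers about 10,000 prompts a month, so the answer depends on how much traffic the demo actually gets.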
See also this effort: https://huggingface.slack.com/archives/C025LJDP962/p1627724095070700