stable-diffusion-webui
[feature] Add AltDiffusion
Add support for the multi-language diffusion model AltDiffusion-m9.
This seems to completely break support for regular Stable Diffusion models: generating images produces garbage, and loading a Stable Diffusion model with a config throws an exception. Loading safetensors files also throws an exception. In addition, it removes the use of commit hashes from launch.py (not a good idea) and disables MPS support in devices.py.
Thank you for your reply; we will fix these issues in the future.
Hi~ We have fixed the problems you mentioned. Could you take a look at the new version? Thank you~
Can't seem to find the checkpoint to use with this.
AltDiffusion model: https://model.baai.ac.cn/model-detail/100076
AltDiffusion-M9: https://model.baai.ac.cn/model-detail/100078
This is our technical report: https://arxiv.org/abs/2211.06679
Thank you for your reply!
I'm getting an error that translates as "file does not exist!" when I try to download the model from those links.
From what I've seen in the code, you rely on cmd_opts.config being set, but that's not right, since users can just put the config file next to the model and select it from the dropdown in settings.
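For illustration, a minimal sketch of that behaviour: prefer a config .yaml sitting next to the checkpoint and only fall back to the global --config value. The helper name and fallback here are assumptions, not the webui's actual code.

```python
import os

# Hypothetical helper: look for a per-model config next to the checkpoint
# and only fall back to the value passed via --config (cmd_opts.config).
def resolve_model_config(checkpoint_path, cmd_opts_config):
    per_model_config = os.path.splitext(checkpoint_path)[0] + ".yaml"
    if os.path.isfile(per_model_config):
        return per_model_config
    return cmd_opts_config
```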
I also don't understand why there is a need to add a text_model_name field, when you can just check the type of m.cond_stage_model, as is already done for the two existing options.
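Roughly, that dispatch could look like the sketch below. The two CLIP embedder classes exist in the ldm code the webui uses; the AltDiffusion branch and the function name are assumptions for illustration.

```python
import ldm.modules.encoders.modules as encoders

# Decide how to handle the model by inspecting the conditioning model's
# type instead of reading a text_model_name field from the config.
def conditioning_kind(m):
    cond = m.cond_stage_model
    if isinstance(cond, encoders.FrozenCLIPEmbedder):
        return "clip"        # regular Stable Diffusion 1.x
    if isinstance(cond, encoders.FrozenOpenCLIPEmbedder):
        return "open_clip"   # Stable Diffusion 2.x
    return "xlm_roberta"     # assumed: AltDiffusion's multilingual encoder
```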
@AUTOMATIC1111 try this: AltDiffusion-M9 files - https://drive.filen.io/d/be043c9e-a171-4356-a749-e7840dfcb67e#Z21U5JO7HE5g1fSa7kY5nSZI2dInaUHw
AltDiffusion files - https://drive.filen.io/d/4bb5c13f-70de-4cd5-aeca-317b760b10b4#VDRRlvKjrEzDz13VmY9Ti7D7FGiix5Fu
Well, it's in. See https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#alt-diffusion for how to use this.
In case anyone wants it later for reference/comparison, here are images I generated with it.
Same result with this PR and with my rework of it. Generation info in metadata.
thanks
The AltDiffusion-m9 model works terribly. I tried it with Chinese-language prompts for dozens of images and got nothing good, not a single one. For anyone who comes to this post from Google, here is my warning: do not use it, do not waste your time.
A popular model combined with an AI-based translation tool such as deepl.com can do far better than this one (see the sketch after this comment). There is no reason to train a new model based on an Asian language.
Following is this warning in Chinese (translated): AltDiffusion is a model trained in China that can take Chinese keywords as input. However, after actually trying it, the results are very poor and cannot compare with most of the currently popular models. After generating dozens of images, not a single one was decent. Suggestion: a mainstream model plus any translation service will produce far better images than this model. There is no need to look for or train a Chinese-language model.
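For what it's worth, a minimal sketch of the translate-then-generate workflow suggested above, assuming the webui is running locally with --api enabled and that you have a DeepL API key (the key and prompt below are placeholders):

```python
import requests

DEEPL_KEY = "your-deepl-api-key"        # placeholder
WEBUI_URL = "http://127.0.0.1:7860"     # default local webui address

def translate_to_english(text):
    # DeepL's REST endpoint; target_lang=EN requests an English translation.
    r = requests.post(
        "https://api-free.deepl.com/v2/translate",
        data={"auth_key": DEEPL_KEY, "text": text, "target_lang": "EN"},
    )
    r.raise_for_status()
    return r.json()["translations"][0]["text"]

def txt2img(prompt, steps=20):
    # The webui's built-in API endpoint; returns base64-encoded images.
    r = requests.post(
        f"{WEBUI_URL}/sdapi/v1/txt2img",
        json={"prompt": prompt, "steps": steps},
    )
    r.raise_for_status()
    return r.json()["images"]

images = txt2img(translate_to_english("年轻英俊的黑暗之神的肖像, 金线, 错综复杂"))
```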
Please show your prompts! (Please share your text descriptions; maybe we can improve.) I have tried some prompts from lexica. The generated images look fine to me.
EN:portrait of a young handsome dark god, gold wires, intricate, headshot, highly detailed, digital painting, artstation, concept art, sharp focus, cinematic lighting, illustration, art by artgerm and greg rutkowski, alphonse mucha, cgsociety
CN: 年轻英俊的黑暗之神的肖像,金线,错综复杂,爆头,非常详细,数字绘画,artstation,概念艺术,清晰的焦点,电影照明,插图,artgerm和greg rutkowski的艺术,alphonse mucha,cgsociety
Those example images are terrible, and you think they are fine... Almost all of the faces are distorted. They are poor even as art images. If you try those example prompts in a photoreal style, it will be worse. And I think you know that; otherwise you would have shown those as examples.
Just check the front page of civitai.com; if you cannot see the difference between the examples above and those popular images, then I cannot help either.
My suggestion: focus on training a translation service targeted at prompt translation. Even if you can train a fine model for an Asian language, you still cannot beat all those free popular models trained by the whole world.
The following is what I generated with another model and exactly the same prompt as above:
portrait of a young handsome dark god, gold wires, intricate, headshot, highly detailed, digital painting, artstation, concept art, sharp focus, cinematic lighting, illustration, art by artgerm and greg rutkowski, alphonse mucha, cgsociety
And the following is from a model for generating Asian characters. Since you are in China, take your examples and the following images to your clients and ask them which one is better.
Check the faces: there is no distortion, and everything is photoreal. I haven't even added any keywords for photorealism, and the models I used are not even meant for photorealism, just some random mixed models.
Machine translation may be working very well for you. However, machine translation may not work well in all languages. As for cultural differences, not all artworks have English descriptions or can be described precisely in English. So maybe you can give people outside the English-speaking community a chance to conveniently use text-to-image generation tools. This version of AltDiffusion has English performance very close to the original Stable Diffusion and supports eight different languages.
AltDiffusion aims to expand the language ability of the official Stable Diffusion. I tried Stable Diffusion v1.5 with the same prompts and got images similar to AltDiffusion's.