ru-dalle-paddle
Other playable models - Text2Image
Playable models
- dalle-mini & craiyon https://github.com/borisdayma/dalle-mini
- CogView2 https://github.com/THUDM/CogView2
- (more to be added)
No pretrained models
- imagen https://github.com/lucidrains/imagen-pytorch
- Wenxin ERNIE-ViLG https://wenxin.baidu.com/wenxin/modelbasedetail/ernie_vilg/
- (more to be added)
If we have enough time, we will try to migrate it. However, I hope that Baidu will officially release an open-source text-to-image model on PaddlePaddle. I also know of a popular model trained by Tsinghua University, although it too is a PyTorch version: CogView2 (https://github.com/THUDM/CogView2).
I just looked into it: Wenxin ERNIE-ViLG's text-to-image ability was validated on the open-domain public dataset MS-COCO. The evaluation metric is FID (lower values mean better results). In both zero-shot and finetune settings, Wenxin ERNIE-ViLG achieved the best results, far surpassing models such as OpenAI's DALL-E. They provide an entry point for trying the ERNIE-ViLG API; perhaps you could contact the author team and ask them for the pretrained model?
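For reference, the FID metric mentioned above compares the distribution of Inception features extracted from generated images with that of real images, modeling each as a Gaussian. A minimal numpy/scipy sketch of the formula (the function name and the toy inputs are my own, for illustration only; real evaluation pipelines extract the feature statistics from an Inception network first):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_inception_distance(mu1, sigma1, mu2, sigma2):
    # FID between two Gaussians fitted to Inception features:
    # ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrt(sigma1 @ sigma2))
    diff = mu1 - mu2
    covmean = sqrtm(sigma1 @ sigma2)
    # sqrtm can return tiny imaginary parts due to numerical error
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Identical feature distributions give FID = 0 (the best possible score)
mu = np.zeros(2)
sigma = np.eye(2)
print(frechet_inception_distance(mu, sigma, mu, sigma))  # 0.0
```

This makes the "lower is better" point concrete: FID is a distance, and it is zero only when the generated and real feature distributions match exactly.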
Wenxin ERNIE-ViLG https://wenxin.baidu.com/wenxin/modelbasedetail/ernie_vilg/ paper: https://arxiv.org/pdf/2112.15283.pdf
Another project with code and models
- ERNIE-SAT, from the Wenxin cross-modal large-model family; applications include speech editing, speech synthesis, voice cloning, and speech-to-speech translation with voice cloning.
ERNIE-SAT is pretrained on Chinese and English datasets with joint speech-text training, so the model learns the alignment between speech and text, generates spectrograms with higher precision, and synthesizes higher-quality audio.
https://wenxin.baidu.com/wenxin/modelbasedetail/ernie_sat/