Awesome_Diffusions
This list only includes resources (e.g., papers and blogs) that I have read and found interesting, so how often it is updated depends on how fast I read.
Catalogue:
- 1. Vision
  - 1.1. Text-to-Image Generation
  - 1.2. Object Detection
  - 1.3. Image Generation
- 2. Language
  - 2.1. Text Generation
- 3. Vision and Language
  - 3.1. Image Captioning
- 4. Other Topics
- 5. Blogs and Other Resources
1. Vision: [Back to Top]
1.1. Text-to-Image Generation:
- "AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities" Zhongzhi Chen, Guang Liu, Bo-Wen Zhang, Fulong Ye, Qinghong Yang, Ledell Wu; [arxiv][code]
- "Hierarchical Text-Conditional Image Generation with CLIP Latents" Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, Mark Chen; [arxiv]
- "High-Resolution Image Synthesis with Latent Diffusion Models" Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer; [arxiv]
1.2. Object Detection:
- "DiffusionDet: Diffusion Model for Object Detection" Shoufa Chen, Peize Sun, Yibing Song, Ping Luo; [arxiv][code]
1.3. Image Generation:
- "Denoising Diffusion Implicit Models" Jiaming Song, Chenlin Meng, Stefano Ermon; [arxiv][code]
- "Adding Conditional Control to Text-to-Image Diffusion Models" Lvmin Zhang, Maneesh Agrawala; [arxiv][code]
2. Language: [Back to Top]
2.1. Text Generation:
- "Diffusion-LM Improves Controllable Text Generation" Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto; [arxiv][code]
- "DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models" Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu; [arxiv][code]
- "GENIE: Large Scale Pre-training for Generation with Diffusion Model" Zhenghao Lin, Yeyun Gong, Yelong Shen, Tong Wu, Zhihao Fan, Chen Lin, Weizhu Chen, Nan Duan; [arxiv]
- "Difformer: Empowering Diffusion Model on Embedding Space for Text Generation" Zhujin Gao, Junliang Guo, Xu Tan, Yongxin Zhu, Fang Zhang, Jiang Bian, Linli Xu; [arxiv]
- "DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models" Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong; [arxiv][code]
- "Latent Diffusion for Language Generation" Justin Lovelace, Varsha Kishore, Chao Wan, Eliot Shekhtman, Kilian Weinberger; [arxiv]
3. Vision and Language: [Back to Top]
3.1. Image Captioning:
- "Exploring Discrete Diffusion Models for Image Captioning" Zixin Zhu, Yixuan Wei, Jianfeng Wang, Zhe Gan, Zheng Zhang, Le Wang, Gang Hua, Lijuan Wang, Zicheng Liu, Han Hu; [arxiv][code]
4. Other Topics: [Back to Top]
- "CARD: Classification and Regression Diffusion Models" Xizewen Han, Huangjie Zheng, Mingyuan Zhou; [arxiv][code]
5. Blogs and Other Resources: [Back to Top]
- "The Illustrated Stable Diffusion" Jay Alammar; [link]
- "How diffusion models work: the math from scratch" Sergios Karagiannakos, Nikolas Adaloglou; [link]
- GitHub repo for minimal-text-diffusion