ImageBind: Question about Audio-to-Image Generation

Amazing work!

It's well known that https://github.com/lucidrains/DALLE2-pytorch and https://github.com/LAION-AI/dalle2-laion use OpenCLIP as their pretrained text and image encoders. However, I noticed that you used a private DALLE-2 implementation to generate images conditioned on audio.

Is it possible to use an open-source DALLE-2 instead of the private reimplementation? Are there any problems with the open-source DALLE-2? I would appreciate it if you could share your experience.

In my view, if an open-source DALLE-2 could be adapted to ImageBind, it would directly enable some very interesting applications and increase the impact of this work!

May 12 '23 05:05 liu-zhy

Can someone help me? Thanks!

May 15 '23 05:05 liu-zhy

We tried audio-to-image generation using Stable Diffusion. The project is open-sourced: https://github.com/sail-sg/BindDiffusion

May 16 '23 05:05 xuxy09

Wow, great work, I have starred this repo!

May 16 '23 06:05 liu-zhy