dalle-playground
Running on Apple Silicon
JAX is not yet fully supported on the Apple Silicon GPU. See here: https://github.com/google/jax/issues/8074#issuecomment-1148982985
You may get it to work -- very slowly -- using the CPU version of JAX.
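If you want to try that, one way is to pin JAX to its CPU backend before it gets imported (the environment variable is a standard JAX control; where exactly you set it in the dalle-playground backend is up to you):

```python
import os

# Pin JAX to its CPU backend; this must be set before JAX is first imported.
# (Newer JAX versions also accept a JAX_PLATFORMS variable.)
os.environ["JAX_PLATFORM_NAME"] = "cpu"

# Any subsequent `import jax` in the backend process will now run on CPU
# instead of trying (and failing) to use the Apple Silicon GPU.
```

Equivalently, export `JAX_PLATFORM_NAME=cpu` in the shell before launching the backend.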
Running dalle-playground on-CPU on an M1 Max is ~128x slower than the Hugging Face demo: it generates 1 image in about 27 minutes at 117% CPU usage. You can get more throughput by submitting concurrent tasks. The best throughput I achieved was with 5 concurrent tasks: five 1-image tasks in 52 minutes at 550% CPU usage, i.e. about 10.4 minutes per image.
There is a path to getting it running on-GPU via IREE/Vulkan-over-MoltenVK. First, the IREE backend needs to lower mhlo.scatter operations correctly:
https://github.com/google/iree/issues/9361
Work in progress:
https://github.com/google/iree/pull/9378
Are you running mega on an M1 Max? Also, how did you set it up for concurrent tasks?
Yes, I'm using an M1 Max. It's only concurrent in the sense that I open multiple browser tabs of the dalle-playground frontend and request 1 image via each tab simultaneously. It's still an order of magnitude slower, with less throughput, than the online dalle-mini demo. I've asked, and apparently that online demo uses the dalle-mega model anyway.
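The multi-tab trick above can also be scripted. This is a rough sketch: the URL, route, and JSON payload below are assumptions about the dalle-playground backend API, so check them against your running backend before use.

```python
import concurrent.futures
import json
import urllib.request

# Hypothetical backend address and route -- verify against your own
# dalle-playground backend's logs.
BACKEND_URL = "http://localhost:8080/dalle"

def generate(prompt: str) -> bytes:
    """Send one 1-image request to the dalle-playground backend."""
    payload = json.dumps({"text": prompt, "num_images": 1}).encode()
    req = urllib.request.Request(
        BACKEND_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def generate_concurrently(prompts, workers=5, task=generate):
    """Run several single-image requests in parallel threads."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, prompts))

# Example (with a running backend):
#   images = generate_concurrently(["a red fox in the snow"] * 5)
```

Five workers matches the best-throughput setup mentioned above; the threads just block on HTTP while the backend does the work.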
Ah, I was hoping to use a bigger model to get improved resolution on the generated images; I didn't know the online version was also using mega.
All DALL-E mini models (the original OpenAI DALL-E, mini, mega, mega-fp16) output images at the same resolution; they're all based on the same model architecture. OpenAI's DALL-E 2 model architecture outputs 1024x1024. You can either upscale the output from DALL-E mini or use a different model architecture. CompVis' Latent Diffusion model allows setting the image size: https://github.com/CompVis/latent-diffusion
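As a trivial baseline for the "upscale" option, Pillow can resize the output by an integer factor. Note this is a plain resampling sketch, not a learned super-resolution model (something like Real-ESRGAN would give far better results):

```python
from PIL import Image

def upscale(path_in: str, path_out: str, factor: int = 4) -> None:
    """Naively upscale an image by an integer factor with Lanczos resampling."""
    img = Image.open(path_in)
    big = img.resize(
        (img.width * factor, img.height * factor),
        Image.LANCZOS,
    )
    big.save(path_out)

# Example: turn a 256x256 DALL-E mini output into a 1024x1024 image.
#   upscale("mini_output.png", "upscaled.png", factor=4)
```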