ml-mdm
Train high-quality text-to-image diffusion models in a data & compute efficient manner
implementation and tests running
## Key Architectural Changes
- **Unsqueeze Process** Adjusted tensor dimension expansion logic to ensure proper shape compatibility between temporal embeddings and spatial feature maps.
- **Temporal Embedding Broadcast** Modified broadcasting...
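
To illustrate the kind of shape handling described above, here is a minimal sketch, not the PR's actual code. It assumes a feature map laid out as `(B, C, H, W)` and a temporal embedding of shape `(B, C)`; the function name `add_temporal_embedding` is hypothetical. NumPy is used so the sketch runs without a GPU stack; the same `[:, :, None, None]` indexing also works on `torch` tensors.

```python
import numpy as np


def add_temporal_embedding(features: np.ndarray, temb: np.ndarray) -> np.ndarray:
    """Add a per-channel temporal embedding to a (B, C, H, W) feature map.

    Hypothetical helper for illustration; the layout is an assumption.
    """
    if features.ndim != 4:
        raise ValueError(f"expected (B, C, H, W) features, got shape {features.shape}")
    if temb.shape != features.shape[:2]:
        raise ValueError(
            f"embedding shape {temb.shape} does not match (B, C) = {features.shape[:2]}"
        )
    # Insert two trailing singleton axes: (B, C) -> (B, C, 1, 1).
    # Broadcasting then repeats the embedding over every spatial position.
    return features + temb[:, :, None, None]


features = np.zeros((2, 8, 16, 16), dtype=np.float32)
temb = np.arange(16, dtype=np.float32).reshape(2, 8)
out = add_temporal_embedding(features, temb)
print(out.shape)
```

Checking the shapes explicitly before the addition turns a silent mis-broadcast (for example, an embedding of shape `(C,)` aligning with the width axis) into an immediate error.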
Debugged @bdeanhardt's tests; they seem to be working now.
I'm trying to use the model from this link (https://docs-assets.developer.apple.com/ml-research/models/mdm/flickr1024/vis_model.pth) with precision torch.float16. However, I encountered NaN results. By debugging the model inference process, I found that some activation values...
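
One general way to narrow down where NaNs come from in half precision is to check which activations fall outside the float16 range (largest finite value 65504). The sketch below shows that check on made-up arrays. It is an assumption about a plausible cause, not a diagnosis of this report, and the names `report_fp16_overflow`, `block1`, and `block2` are hypothetical.

```python
import numpy as np

# Largest finite float16 value (65504.0).
FP16_MAX = float(np.finfo(np.float16).max)


def report_fp16_overflow(activations):
    """Return names of activations that are non-finite or exceed float16 range."""
    flagged = []
    for name, values in activations.items():
        arr = np.asarray(values, dtype=np.float32)
        if arr.size == 0:
            continue
        if not np.all(np.isfinite(arr)) or np.max(np.abs(arr)) > FP16_MAX:
            flagged.append(name)
    return flagged


# Made-up activations for illustration only.
acts = {
    "block1": np.array([0.5, -3.0, 120.0], dtype=np.float32),
    "block2": np.array([1.0, 7.0e4], dtype=np.float32),
}
flagged = report_fp16_overflow(acts)
print(flagged)

# Casting an out-of-range value to float16 gives inf, and inf - inf gives NaN.
with np.errstate(over="ignore", invalid="ignore"):
    half = acts["block2"].astype(np.float16)
    diff = half[1] - half[1]
print(half[1], diff)
```

In a real model the dictionary would be filled from captured layer outputs while running in float32, so the first flagged layer points at where half precision breaks down.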
Need to add Bella's mlx_unet when ready. Also need to test the file before merging.
- If CUDA is available: loads the language model in 8-bit quantized format using bitsandbytes
- Else: loads the LM in torch.float16

One could also look into using [CTranslate2](https://github.com/OpenNMT/CTranslate2) for quantization, which...
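
The branching described above could look roughly like the following sketch. It assumes the language model is loaded through Hugging Face `transformers`, which the description does not confirm; `select_precision` and `load_language_model` are hypothetical names, and `AutoModel` stands in for whichever model class the project actually uses. The heavy imports are deferred so the selection logic can be exercised without `torch` installed.

```python
from typing import Optional


def select_precision(cuda_available: bool) -> str:
    """Pick the loading mode: 8-bit on CUDA, float16 otherwise."""
    return "int8" if cuda_available else "float16"


def load_language_model(name: str, cuda_available: Optional[bool] = None):
    """Load a language model using the mode chosen by select_precision.

    Assumes Hugging Face transformers (and bitsandbytes on CUDA) are installed.
    """
    import torch
    from transformers import AutoModel, BitsAndBytesConfig

    if cuda_available is None:
        cuda_available = torch.cuda.is_available()

    if select_precision(cuda_available) == "int8":
        # 8-bit quantized weights via bitsandbytes; requires a CUDA device.
        quant_config = BitsAndBytesConfig(load_in_8bit=True)
        return AutoModel.from_pretrained(
            name, quantization_config=quant_config, device_map="auto"
        )
    # Fallback: half-precision weights without quantization.
    return AutoModel.from_pretrained(name, torch_dtype=torch.float16)


print(select_precision(True), select_precision(False))
```

Keeping the decision in a small pure function makes the CUDA and non-CUDA paths easy to unit test without downloading a model.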