Sana
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
When I run inference with the downloaded model weights: `python inference.py --config=/code/configs/sana_config/1024ms/Sana_1600M_img1024.yaml --model_path=/models/Sana_1.6_1024/Sana_1600M_1024px_BF16/checkpoints/Sana_1600M_1024px_BF16.pth --txt_file=/code/asset/samples/samples_mini.txt` An error occurs: RuntimeError: mat1 and mat2 shapes cannot be multiplied (600x2048 and...
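For shape errors like the one above, comparing the inner dimensions of the two operands can help narrow down which input and which layer disagree. A minimal sketch in plain Python, assuming 2-D shapes; `matmul_mismatch` is a hypothetical helper, and the second shapes used below are arbitrary placeholders because the second shape in the report is cut off:

```python
def matmul_mismatch(mat1_shape, mat2_shape):
    """Return None if mat1 @ mat2 is valid for 2-D shapes, else a short message."""
    _, cols1 = mat1_shape
    rows2, _ = mat2_shape
    if cols1 == rows2:
        return None
    return (
        f"inner dimensions differ: mat1 has {cols1} columns, "
        f"mat2 has {rows2} rows"
    )


# The first operand in the report is 600x2048, so the second operand
# needs 2048 rows. The second shapes here are made up for illustration.
ok = matmul_mismatch((600, 2048), (2048, 8))
bad = matmul_mismatch((600, 2048), (4, 8))
print(ok)
print(bad)
```

In a real run, the shapes would come from the tensor fed into the failing layer and from that layer's weight, which is one way to see whether the config and the checkpoint agree.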
```
[rank5]:   File "hgcfvb", line 148, in sdfass
[rank5]:     dcp.load(
[rank5]:   File "/opt/conda/lib/python3.11/site-packages/torch/distributed/checkpoint/logger.py", line 83, in wrapper
[rank5]:     result = func(*args, **kwargs)
[rank5]:              ^^^^^^^^^^^^^^^^^^^^^
[rank5]:   File "/opt/conda/lib/python3.11/site-packages/torch/distributed/checkpoint/utils.py", line 429, in...
```
I want to LoRA fine-tune Efficient-Large-Model/Sana_Sprint_1.6B_1024px_diffusers, but as far as I can tell, LoRA fine-tuning currently works only with Efficient-Large-Model/Sana_1600M_1024px_BF16. I found the script here: https://github.com/NVlabs/Sana/blob/main/asset/docs/sana_lora_dreambooth.md But when I replace...
Great work! I ran into one problem when generating images: SANA-Sprint uses DPMSolverMultistepScheduler, which raises TypeError: set_timesteps() got an unexpected keyword argument 'max_timesteps'
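An "unexpected keyword argument" error like this means the caller passes a keyword that the scheduler's method does not declare. A minimal sketch of how to confirm that with the standard `inspect` module, assuming nothing about the real diffusers classes: `ToyScheduler` is a hypothetical stand-in, and `accepts_kwarg` and `filter_kwargs` are illustrative helpers, not library functions.

```python
import inspect


def accepts_kwarg(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def filter_kwargs(func, kwargs):
    """Keep only the keyword arguments that `func` accepts."""
    return {k: v for k, v in kwargs.items() if accepts_kwarg(func, k)}


class ToyScheduler:
    """Hypothetical stand-in whose set_timesteps lacks `max_timesteps`."""

    def set_timesteps(self, num_inference_steps=None, device=None):
        self.num_inference_steps = num_inference_steps


scheduler = ToyScheduler()
requested = {"num_inference_steps": 2, "max_timesteps": 1.5}
supported = filter_kwargs(scheduler.set_timesteps, requested)
rejected = sorted(set(requested) - set(supported))
scheduler.set_timesteps(**supported)
print("rejected keywords:", rejected)
```

This only diagnoses the mismatch. Dropping a keyword silently changes the sampling behavior, so the likelier fix is to pair the pipeline with a scheduler whose `set_timesteps` actually accepts the arguments the pipeline sends.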
Hi! Your Sana ControlNet is impressive, but it takes too long. Could you also release an img2img version? In theory, that should also take 0.1s or less...
Hi, I checked the ControlNet and Sana documentation. Can I use a depth or normal-map image as the guide instead of an RGB image? Thank you!
I am trying the Sana workflows in ComfyUI as described on this page: https://github.com/NVlabs/Sana/blob/main/asset/docs/ComfyUI/comfyui.md I did a clean install of the latest ComfyUI, then followed those instructions. ComfyUI starts and I open the Sana_FlowEuler.json workflow. When...
You mentioned in the report that Triton was used for kernel fusion, but the corresponding functions for attn and ffn in the config were not called. Could you tell me...
I am using the image dataset from the DreamBooth paper (https://github.com/google/dreambooth/tree/main/dataset/can) with the model **Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers**:
```
accelerate launch /mnt/sdc/zhouyayue/projects/Sana/train_scripts/train_dreambooth_lora_sana.py \
  --pretrained_model_name_or_path="Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers" \
  --instance_data_dir="/mnt/sdc/zhouyayue/projects/Sana/dreambooth_data" \
  --output_dir="/mnt/sdc/zhouyayue/projects/Sana/output_dreambooth_lora" \
  --mixed_precision="bf16" \
  --instance_prompt="a photo...
```