
ComfyUI workflow not working

Open SoftologyPro opened this issue 7 months ago • 16 comments

Trying the Sana workflows in ComfyUI as on this page https://github.com/NVlabs/Sana/blob/main/asset/docs/ComfyUI/comfyui.md

I did a clean install of the latest ComfyUI, then followed those instructions. ComfyUI starts and I open the Sana_FlowEuler.json workflow.

When I click Run it fails with this error

Prompt outputs failed validation
ExtraVAELoader:
    - Value not in list: vae_type: 'dcae-f32c32-sana-1.1-diffusers' not in ['kl-f4', 'kl-f8', 'kl-f8-d16', 'kl-f16', 'kl-f32', 'vq-f4', 'vq-f8', 'vq-f16', 'Consistency-Decoder', 'SDV-VideoDecoder', 'MoVQ3', 'dcae-f32c32-sana-1.0-diffusers']
    - Value not in list: vae_name: 'mit-han-lab/dc-ae-f32c32-sana-1.1-diffusers' not in ['mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers']

So it looks like "All the checkpoints will be downloaded automatically." does not happen.

Also, if the results of this are not as good as they can be https://github.com/NVlabs/Sana/issues/176 should we even bother using them for now? If the latest status (from 6 weeks ago) is "Implementing Flow-DPM-Solver in ComfyUI is not the highest priority stuff in our progress actually." maybe you should update the page with "don't bother for now until we implement it fully as we intend to do sometime".

SoftologyPro avatar Mar 28 '25 21:03 SoftologyPro

Prompt outputs failed validation
ExtraVAELoader:
    - Value not in list: vae_type: 'dcae-f32c32-sana-1.1-diffusers' not in ['kl-f4', 'kl-f8', 'kl-f8-d16', 'kl-f16', 'kl-f32', 'vq-f4', 'vq-f8', 'vq-f16', 'Consistency-Decoder', 'SDV-VideoDecoder', 'MoVQ3', 'dcae-f32c32-sana-1.0-diffusers']
    - Value not in list: vae_name: 'mit-han-lab/dc-ae-f32c32-sana-1.1-diffusers' not in ['mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers']

Seems it's due to the recent update. I have fixed the problem.

Also, if the results of this are not as good as they can be https://github.com/NVlabs/Sana/issues/176 should we even bother using them for now?

Actually, we can use the Flow-Euler sampler (the so-called K-sampler in ComfyUI) as well. When the number of steps is >28, the quality is similar. No need to bother about the Flow-DPM-Solver at all.

lawrence-cj avatar Mar 30 '25 08:03 lawrence-cj

Thank you. The auto downloads now work for the Sana_FlowEuler.json and Sana_FlowEuler_4K.json workflows.

The Sana_CogVideoX.json workflow gives errors about missing models

Image

Would you know what models need to be downloaded from where and to what directory for the Sana_CogVideoX.json workflow to work without errors? Or can you modify the workflow to also download the required models?

SoftologyPro avatar Mar 30 '25 09:03 SoftologyPro

Could you try our newest Sana_CogVideoX.json file? We don't use google/gemma-2-2b-it anymore.

lawrence-cj avatar Mar 30 '25 10:03 lawrence-cj

OK, that fixes the model error. But when the workflow is run it only gets as far as the image being generated. No movie is created?

Image

SoftologyPro avatar Mar 30 '25 10:03 SoftologyPro

Seems like you are getting an error when creating videos. Not the problem of the image part.

lawrence-cj avatar Mar 30 '25 11:03 lawrence-cj

These are the stats once ComfyUI finishes loading and I click Run.

[ComfyUI-Manager] All startup tasks have been completed.
got prompt
Failed to validate prompt for output 25:
* CLIPLoader 29:
  - Value not in list: clip_name: 'text_encoders/t5xxl_fp16.safetensors' not in ['models-efficient-large-model--gemma-2-2b-it\\gemma-2-2b-it.safetensors', 'models-efficient-large-model--gemma-2-2b-it\\model-00001-of-00002.safetensors', 'models-efficient-large-model--gemma-2-2b-it\\model-00002-of-00002.safetensors']
Output will be ignored
Fetching 12 files:   0% 0/12 [00:00<?, ?it/s]
Fetching 12 files: 100% 12/12 [00:00<00:00, 6000.43it/s]
Loading checkpoint shards:   0% 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 100% 2/2 [00:00<00:00, 32.79it/s]
model_type FLOW
D:\ComfyUI\ComfyUI\.venv\lib\site-packages\timm\models\layers\__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
Missing UNET keys ['pos_embed']
Requested to load EXM_Sana_Model
loaded completely 16487.57625656128 3059.7485961914062 True
  0% 0/28 [00:00<?, ?it/s]
  4% 1/28 [00:00<00:05,  4.85it/s]
 11% 3/28 [00:00<00:02,  9.24it/s]
 18% 5/28 [00:00<00:02, 11.16it/s]
 25% 7/28 [00:00<00:01, 12.17it/s]
 32% 9/28 [00:00<00:01, 12.80it/s]
 39% 11/28 [00:00<00:01, 13.19it/s]
 46% 13/28 [00:01<00:01, 13.46it/s]
 54% 15/28 [00:01<00:00, 13.64it/s]
 61% 17/28 [00:01<00:00, 13.69it/s]
 68% 19/28 [00:01<00:00, 13.78it/s]
 75% 21/28 [00:01<00:00, 13.81it/s]
 82% 23/28 [00:01<00:00, 13.85it/s]
 89% 25/28 [00:01<00:00, 13.89it/s]
 96% 27/28 [00:02<00:00, 13.92it/s]
100% 28/28 [00:02<00:00, 13.04it/s]
Prompt executed in 22.19 seconds

The Load CLIP node has a red outline and there is the error message

* CLIPLoader 29:
  - Value not in list: clip_name: 'text_encoders/t5xxl_fp16.safetensors' not in ['models-efficient-large-model--gemma-2-2b-it\\gemma-2-2b-it.safetensors', 'models-efficient-large-model--gemma-2-2b-it\\model-00001-of-00002.safetensors', 'models-efficient-large-model--gemma-2-2b-it\\model-00002-of-00002.safetensors']

No further activity. Just the image part is finished.

Image

I downloaded t5xxl_fp16.safetensors to ComfyUI\models\text_encoders\ and it still complains it is not there. Any idea what model needs to be downloaded to where to get around that Load CLIP issue?

SoftologyPro avatar Mar 30 '25 11:03 SoftologyPro

The workflow works normally on my side. I downloaded the CLIP model and put it under ComfyUI/models/clip/text_encoders

lawrence-cj avatar Mar 31 '25 11:03 lawrence-cj

Which specific model file(s) did you download into ComfyUI/models/clip/text_encoders? From where?

SoftologyPro avatar Mar 31 '25 12:03 SoftologyPro

I don't remember. To the best of my knowledge, the T5xxl that most people use is the same one. Give this a try

lawrence-cj avatar Mar 31 '25 12:03 lawrence-cj

https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors

I was downloading into the ComfyUI/models/text_encoders folder and not the ComfyUI/models/clip/text_encoders folder.

But even in both ComfyUI/models/text_encoders and ComfyUI/models/clip/text_encoders it still complains it is not found. Are you sure that is the required local folder for the file?
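
For reference, a minimal Python sketch of the download step discussed above. It fetches the file from the URL posted in this thread and places it in the folder the maintainer named (ComfyUI/models/clip/text_encoders). The helper names `t5_target` and `download_t5` are hypothetical, and the ComfyUI root passed in is whatever the local install path happens to be.

```python
from pathlib import Path
import urllib.request

# URL posted earlier in this thread.
T5_URL = (
    "https://huggingface.co/comfyanonymous/flux_text_encoders"
    "/resolve/main/t5xxl_fp16.safetensors"
)


def t5_target(comfy_root: str) -> Path:
    """Destination path, using the folder layout the maintainer described."""
    return (
        Path(comfy_root) / "models" / "clip" / "text_encoders"
        / "t5xxl_fp16.safetensors"
    )


def download_t5(comfy_root: str) -> Path:
    """Download the text encoder unless it is already present (hypothetical helper)."""
    target = t5_target(comfy_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        # This is a large file, so the call can take a long time.
        urllib.request.urlretrieve(T5_URL, str(target))
    return target
```

Calling `download_t5("path/to/ComfyUI")` would leave the file where the workflow's `text_encoders/t5xxl_fp16.safetensors` entry appears to expect it, relative to the clip folder.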

SoftologyPro avatar Mar 31 '25 12:03 SoftologyPro

It is really odd. When I load the workflow the Load CLIP model shows

Image

and when I run the node goes red and fails.

If I click the model and select text_encoders\t5xxl_fp16.safetensors then it works.

So a problem with back slash vs forward slash. Can that be fixed in the workflow?
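
A small Python sketch of why the two names do not match. It assumes the validator compares the stored name as a plain string against names listed with the OS separator, which is what the "Value not in list" message suggests but is not confirmed here.

```python
from pathlib import PureWindowsPath

# Name stored in the workflow file (saved on Linux, forward slash).
stored = "text_encoders/t5xxl_fp16.safetensors"

# The same relative name built with the Windows separator.
listed = str(PureWindowsPath("text_encoders") / "t5xxl_fp16.safetensors")

print(listed)            # text_encoders\t5xxl_fp16.safetensors
print(stored == listed)  # False: the strings differ by one character

# Compared as Windows paths instead of strings, they are the same file.
print(PureWindowsPath(stored) == PureWindowsPath(listed))  # True
```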

SoftologyPro avatar Mar 31 '25 12:03 SoftologyPro

But with that fixed, now it complains about missing

Error no file named diffusion_pytorch_model.bin found in directory ComfyUI\ComfyUI\models\CogVideo\CogVideoX-Fun-V1.1-5b-InP.

SoftologyPro avatar Mar 31 '25 12:03 SoftologyPro

I don't think so. The workflow we tested is on Linux. I don't have a Windows machine available to test if the workflow is still available if I change anything.

lawrence-cj avatar Mar 31 '25 12:03 lawrence-cj

Then download this one:

Image

lawrence-cj avatar Mar 31 '25 12:03 lawrence-cj

I don't think so. The workflow we tested is on Linux. I don't have a Windows machine available to test if the workflow is still available if I change anything.

OK, the only reason I ask is that I had the request to support this in Visions of Chaos. https://softology.pro/voc.htm I download the workflows as is. That way if something is updated then when the user downloads it in VoC they get the latest version.

If the back vs forward slash is an issue I can run a search and replace on the workflow once I download it to fix them.
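
A rough Python sketch of that search and replace. Rather than rely on a particular workflow schema, it walks the whole JSON and only touches strings that end in a model file extension, so Hugging Face repo IDs such as mit-han-lab/dc-ae-f32c32-sana-1.1-diffusers and URLs are left alone. The function names and the extension list are my own assumptions, not anything from the Sana repo.

```python
import json

# Assumed list of extensions that mark a local model file name.
MODEL_SUFFIXES = (".safetensors", ".ckpt", ".pt", ".bin")


def fix_separators(value, sep="\\"):
    """Recursively replace '/' with sep in strings that look like model file names."""
    if isinstance(value, dict):
        return {k: fix_separators(v, sep) for k, v in value.items()}
    if isinstance(value, list):
        return [fix_separators(v, sep) for v in value]
    if (
        isinstance(value, str)
        and value.endswith(MODEL_SUFFIXES)
        and "://" not in value  # leave URLs untouched
    ):
        return value.replace("/", sep)
    return value


def fix_workflow_file(src, dst, sep="\\"):
    """Read a workflow JSON file, fix the separators, and write the result to dst."""
    with open(src, encoding="utf-8") as f:
        workflow = json.load(f)
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(fix_separators(workflow, sep), f, indent=2)
```

Run once after each download, for example `fix_workflow_file("Sana_CogVideoX.json", "Sana_CogVideoX_win.json")`, where the output file name is only a placeholder.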

With the slashes fixed and models downloaded it now finally all works :)

Image

SoftologyPro avatar Mar 31 '25 12:03 SoftologyPro

Hello, no matter what I try I only get bad generations: a black image or a colored image, even when isolating only the T2I part.

Image

luisclement avatar Apr 29 '25 14:04 luisclement