Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype. SD3.5 FP8 Mac M2
Expected Behavior
The workflow should render an image.
Actual Behavior
No image is produced; execution fails with the Float8_e4m3fn/MPS error above.
Steps to Reproduce
Run the SD3.5 FP8 example workflow from the ComfyUI wiki page.
Debug Logs
I cannot include the full debug log because it is too long; here is an excerpt:
2024-11-07 23:08:51,027 - root - INFO - Total VRAM 16384 MB, total RAM 16384 MB
2024-11-07 23:08:51,027 - root - INFO - pytorch version: 2.4.1
2024-11-07 23:08:51,027 - root - INFO - Set vram state to: SHARED
2024-11-07 23:08:51,027 - root - INFO - Device: mps
2024-11-07 23:08:51,707 - root - INFO - Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
2024-11-07 23:08:52,888 - root - INFO - [Prompt Server] web root: /Volumes/mac_disk/AI/ComfyUI/web
2024-11-07 23:08:53,559 - root - WARNING - Traceback (most recent call last):
File "/Volumes/mac_disk/AI/ComfyUI/nodes.py", line 2012, in load_custom_node
module_spec.loader.exec_module(module)
File "<frozen importlib._bootstrap_external>", line 990, in exec_module
File "<frozen importlib._bootstrap_external>", line 1127, in get_code
File "<frozen importlib._bootstrap_external>", line 1185, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '/Volumes/mac_disk/AI/ComfyUI/custom_nodes/clipseg/__init__.py'
2024-11-07 23:08:53,560 - root - WARNING - Cannot import /Volumes/mac_disk/AI/ComfyUI/custom_nodes/clipseg module for custom nodes: [Errno 2] No such file or directory: '/Volumes/mac_disk/AI/ComfyUI/custom_nodes/clipseg/__init__.py'
2024-11-07 23:08:55,756 - root - INFO - Total VRAM 16384 MB, total RAM 16384 MB
2024-11-07 23:08:55,756 - root - INFO - pytorch version: 2.4.1
2024-11-07 23:08:55,756 - root - INFO - Set vram state to: SHARED
2024-11-07 23:08:55,756 - root - INFO - Device: mps
2024-11-07 23:09:05,211 - root - INFO -
Import times for custom nodes:
2024-11-07 23:09:05,218 - root - INFO - Starting server
2024-11-07 23:09:05,218 - root - INFO - To see the GUI go to: http://127.0.0.1:8188
2024-11-07 23:09:26,847 - root - INFO - got prompt
2024-11-07 23:09:27,368 - root - INFO - Using split attention in VAE
2024-11-07 23:09:27,369 - root - INFO - Using split attention in VAE
2024-11-07 23:09:33,750 - root - INFO - model weight dtype torch.bfloat16, manual cast: None
2024-11-07 23:09:33,754 - root - INFO - model_type FLOW
2024-11-07 23:09:40,701 - root - INFO - Requested to load FluxClipModel_
2024-11-07 23:09:40,702 - root - INFO - Loading 1 new model
2024-11-07 23:09:40,705 - root - INFO - loaded completely 0.0 323.94775390625 True
2024-11-07 23:09:42,747 - root - INFO - Requested to load FluxClipModel_
2024-11-07 23:09:42,747 - root - INFO - Loading 1 new model
2024-11-07 23:11:19,126 - root - INFO - Requested to load Flux
2024-11-07 23:11:19,127 - root - INFO - Loading 1 new model
2024-11-07 23:12:45,079 - root - INFO - loaded completely 0.0 7880.297119140625 True
2024-11-07 23:17:57,484 - root - INFO - got prompt
2024-11-07 23:20:42,637 - root - INFO - Processing interrupted
2024-11-07 23:20:42,647 - root - INFO - Prompt executed in 675.79 seconds
2024-11-07 23:22:15,177 - root - INFO - Unloading models for lowram load.
2024-11-07 23:22:43,277 - root - INFO - 1 models unloaded.
2024-11-07 23:22:43,401 - root - INFO - Loading 1 new model
2024-11-07 23:22:59,452 - root - INFO - loaded completely 0.0 7880.297119140625 True
2024-11-07 23:35:20,213 - root - INFO - Requested to load AutoencodingEngine
2024-11-07 23:35:20,225 - root - INFO - Loading 1 new model
2024-11-07 23:35:58,888 - root - INFO - loaded completely 0.0 319.7467155456543 True
2024-11-07 23:36:10,025 - root - INFO - Prompt executed in 924.68 seconds
2024-11-07 23:44:30,055 - root - INFO - got prompt
2024-11-07 23:45:55,950 - root - INFO - Unloading models for lowram load.
2024-11-07 23:45:56,125 - root - INFO - 1 models unloaded.
2024-11-07 23:45:56,125 - root - INFO - Loading 1 new model
2024-11-07 23:46:13,772 - root - INFO - loaded completely 0.0 7880.297119140625 True
2024-11-08 00:14:36,066 - root - INFO - Requested to load AutoencodingEngine
2024-11-08 00:14:36,070 - root - INFO - Loading 1 new model
2024-11-08 00:15:31,253 - root - INFO - loaded completely 0.0 319.7467155456543 True
2024-11-08 00:16:16,545 - root - INFO - Prompt executed in 1906.03 seconds
2024-11-08 00:20:47,717 - root - INFO - got prompt
2024-11-08 00:22:16,020 - root - INFO - Unloading models for lowram load.
2024-11-08 00:22:16,106 - root - INFO - 1 models unloaded.
2024-11-08 00:22:16,106 - root - INFO - Loading 1 new model
2024-11-08 00:22:38,298 - root - INFO - loaded completely 0.0 7880.297119140625 True
2024-11-08 00:22:53,124 - root - INFO - got prompt
2024-11-08 00:22:53,547 - root - ERROR - Failed to validate prompt for output 9:
2024-11-08 00:22:53,547 - root - ERROR - * CheckpointLoaderSimple 4:
2024-11-08 00:22:53,547 - root - ERROR - - Value not in list: ckpt_name: 'sd3.5_large_fp8_scaled.safetensors' not in (list of length 82)
2024-11-08 00:22:53,547 - root - ERROR - Output will be ignored
2024-11-08 00:22:53,547 - root - WARNING - invalid prompt: {'type': 'prompt_outputs_failed_validation', 'message': 'Prompt outputs failed validation', 'details': '', 'extra_info': {}}
2024-11-08 00:23:41,232 - root - INFO - got prompt
2024-11-08 00:45:14,511 - root - INFO - Requested to load AutoencodingEngine
2024-11-08 00:45:14,519 - root - INFO - Loading 1 new model
2024-11-08 00:46:03,350 - root - INFO - loaded completely 0.0 319.7467155456543 True
2024-11-08 00:46:40,530 - root - INFO - Prompt executed in 1552.44 seconds
2024-11-08 00:46:53,921 - root - INFO - model weight dtype torch.float16, manual cast: None
2024-11-08 00:46:53,927 - root - INFO - model_type FLOW
2024-11-08 00:49:06,020 - root - INFO - Using split attention in VAE
2024-11-08 00:49:06,026 - root - INFO - Using split attention in VAE
2024-11-08 00:49:10,767 - root - INFO - Requested to load SD3ClipModel_
2024-11-08 00:49:10,767 - root - INFO - Loading 1 new model
2024-11-08 00:49:10,775 - root - INFO - loaded completely 0.0 6228.190093994141 True
2024-11-08 00:51:12,123 - root - INFO - Requested to load SD3ClipModel_
2024-11-08 00:51:12,123 - root - INFO - Loading 1 new model
2024-11-08 00:51:12,131 - root - INFO - loaded completely 0.0 6102.49609375 True
2024-11-08 00:51:20,350 - root - WARNING - clip missing: ['text_projection.weight']
2024-11-08 00:54:36,959 - root - INFO - Requested to load SD3
2024-11-08 00:54:37,017 - root - INFO - Loading 1 new model
2024-11-08 00:55:28,542 - root - INFO - loaded completely 0.0 7683.561706542969 True
2024-11-08 00:55:30,326 - root - ERROR - !!! Exception during processing !!! Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype.
2024-11-08 00:55:30,608 - root - ERROR - Traceback (most recent call last):
File "/Volumes/mac_disk/AI/ComfyUI/execution.py", line 323, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
Other
No response
Why do I have a feeling that nobody wants to solve this problem? There are a million messages on the internet about this error, and nobody solves anything.
I may have an idea: for some reason it seems that on a Mac you can only use the GGUF models, not the native models, for SD3.5 and Flux.
Same issue. Native Flux model. Will try with GGUF
Good news: I found a way to run SD3.5 on a Mac without this problem. Start the server with `PYTORCH_ENABLE_MPS_FALLBACK=1 python3 main.py`.
It is slow, very slow, but it works.
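For what it's worth, the same fallback can also be enabled from inside Python; a minimal sketch, assuming the variable is set before torch is first imported (PYTORCH_ENABLE_MPS_FALLBACK is read at import time):

```python
# Minimal sketch: enable the CPU fallback for ops the MPS backend lacks.
# The environment variable must be set before torch is first imported.
import os
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch  # unsupported MPS ops now fall back to the CPU
print(torch.backends.mps.is_available())
```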
Same error when using flux1-dev-fp8 with fp8_e4m3fn, but if I change the weight dtype to "default" (fp16) on the model-loading node, it works.
Same error with the Flux Schnell Sample Workflow in the Mac Desktop App version 0.3.28 arm64.
Same error. And the above solutions are not working for me
Not using the SD3 model at the moment, but I had the same issue initially when pulling from an existing workflow for the Pixelwave Flux model; I changed the dtype to default, with the dual CLIP mixed.
> Same error when using flux1-dev-fp8 with fp8_e4m3fn, but if I change the weight dtype to "default" (fp16) it works.

This worked for me. Thank you!
> Same error when using flux1-dev-fp8 with fp8_e4m3fn, but if I change the weight dtype to "default" (fp16) it works.

That is indeed the case, but on an Apple M1 Max with 64 GB, running a workflow that is not even complex gives a black screen and a reboot! And it is very, very slow! If there is no other effective method, I may give up on Flux on the Mac entirely.
I reviewed the complaints about the float8 errors on MPS and found it is a relatively easy problem to fix. What I did was convert all the problematic data types found in a large safetensors file to data types that are more palatable to MPS, i.e. the float16 or bfloat16 it likes to use.
I am thinking: now that I have my convert2MPS working, why can't I put it into a ComfyUI custom node and have the conversion done when loading the model? I have other issues with Comfy, but if I can get past them, then maybe I'll give it a try.
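For anyone curious, here is a minimal sketch of that conversion idea, not gcr-cormac's actual convert2MPS script. It assumes the `safetensors` package and a plain upcast of float8 tensors to bfloat16; note that for "scaled" fp8 checkpoints this ignores the scale factors, so treat it as illustrative only (file names are placeholders):

```python
# Illustrative sketch: upcast float8 tensors in a safetensors checkpoint
# to bfloat16, which the MPS backend does support.
import torch
from safetensors.torch import load_file, save_file

def convert_for_mps(src: str, dst: str) -> None:
    tensors = load_file(src)
    out = {}
    for name, t in tensors.items():
        if t.dtype in (torch.float8_e4m3fn, torch.float8_e5m2):
            out[name] = t.to(torch.bfloat16)  # MPS cannot hold float8
        else:
            out[name] = t
    save_file(out, dst)

# Placeholder file names, for illustration only
convert_for_mps("sd3.5_large_fp8.safetensors", "sd3.5_large_bf16.safetensors")
```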
Hi gcr-cormac,
I saw your solution to the “Trying to convert Float8_e4m3fn to the MPS backend” error in ComfyUI. I’m using an M1 MacBook Pro, but I’m not very familiar with programming or coding.
Could you please provide step-by-step instructions on how to fix this issue on macOS? I would really appreciate a simple explanation, including any commands I need to run in the Terminal.
Thank you so much for your help! 😊
@gcr-cormac have you tried it, and was it the best option?
cc: @kidbbc
> Same error when using flux1-dev-fp8 with fp8_e4m3fn, but if I change the weight dtype to "default" (fp16) it works.

This worked for me. Thank you!
How do I change it?
> Same error when using flux1-dev-fp8 with fp8_e4m3fn, but if I change the weight dtype to "default" (fp16) it works.

Thank you. It works for me.
@yujianzhong3129 I ran this on both a MacBook Pro M1 Max 32GB and a Mac Studio M2 Ultra 192GB, and found that the pure black images are caused by running out of VRAM. Try a GGUF-quantized version of the FLUX model.
To summarize: if you hit the "Float8_e4m3fn to the MPS backend but it does not have support for that dtype" error, it is because the native FLUX and SD models support Apple's M-series chips poorly. Changing the FLUX1 model's weight dtype to fp16 solves the problem (currently only Nvidia's newest GPUs support FP8 precision), but it increases VRAM usage, which can lead to pure black images. If you run out of VRAM, switch to a smaller quantized model.
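To put rough numbers on that VRAM tradeoff, a back-of-the-envelope sketch (the ~8 billion parameter count for SD3.5 Large is an approximation, and activations plus the text encoders add more on top):

```python
# Approximate weight memory at different precisions; 8e9 parameters is a
# rough figure for SD3.5 Large, used only for illustration.
params = 8e9
for name, bytes_per_param in [("fp8", 1), ("fp16", 2)]:
    print(f"{name}: {params * bytes_per_param / 2**30:.1f} GiB")
# fp8 ~7.5 GiB vs fp16 ~14.9 GiB: roughly double the memory pressure
```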
I got this issue on an M2 MacBook Air with 24 GB of RAM. Here is how I solved it:
Get a GGUF model, like:
https://civitai.com/models/647237?modelVersionId=725532
I am using the Q4_1 quant; ComfyUI uses about 6 GB of RAM, down from 40+, and each image takes about 40 minutes to generate.
Put the model file in the unet folder.
I used ComfyUI Manager to add several modules that used the gguf keyword; one of them must have worked.
Use the GGUF loader to load the model: nodes > gguf > gguf loader.
Also use the GGUF dual CLIP loader.
I read somewhere that the F8 models need an M3 or later. T5/t5xxl_fp16.safetensors worked for me.
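If you want to sanity-check a downloaded file before wiring it into a workflow, a small sketch (assumes `pip install gguf`; the file path is an example placeholder):

```python
# Verify that a .gguf download parses; requires the gguf package.
from gguf import GGUFReader

reader = GGUFReader("ComfyUI/models/unet/flux1-dev-Q4_1.gguf")
print(f"{len(reader.tensors)} tensors found")
```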
Also: to get Comfy to work, after running the installer, cd to the directory then run:
python -m pip install -r requirements.txt
Per the instructions I had already run:
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
sh Miniconda3-latest-MacOSX-arm64.sh -u
conda install pytorch torchvision torchaudio -c pytorch-nightly
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
Should we run the whole block?
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
sh Miniconda3-latest-MacOSX-arm64.sh -u
conda install pytorch torchvision torchaudio -c pytorch-nightly
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
> Same error when using flux1-dev-fp8 with fp8_e4m3fn, but if I change the weight dtype to "default" (fp16) it works.

Thanks, it works well for me :)
> I got this issue on an M2 MacBook Air with 24 GB of RAM. Here is how I solved it: Get a GGUF model [...]
I have an M4 and I am running into the same issue.
As far as I know, the root cause is that MPS is incompatible with Float8_e4m3fn.
On my M1, I run Comfy with the command `python main.py --force-fp16 --use-split-cross-attention --cpu`, and it works for me (but be warned, it runs very, very slowly).
Here is a reference blog: Solution
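A minimal sketch that reproduces that root cause (assuming PyTorch 2.1+ on Apple Silicon): float8 tensors are fine on the CPU, but moving one to the MPS device raises the exact error from this thread, while upcasting first succeeds:

```python
# Reproduce the incompatibility: the MPS backend cannot hold float8.
import torch

t = torch.zeros(4, dtype=torch.float8_e4m3fn)  # fine on the CPU
try:
    t.to("mps")
except (TypeError, RuntimeError) as e:
    print(e)  # "Trying to convert Float8_e4m3fn to the MPS backend ..."
t.to(torch.bfloat16).to("mps")  # upcast first, then move: works
```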
My workaround on Mac M2:
- download a "GGUF" conversion of the model
- put the model's .gguf file in the directory ComfyUI/models/unet
- replace the "Load Diffusion Model" node with "Unet Loader (GGUF)"
Then the workflow runs smoothly on the GPU.
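One note on that last step: the "Unet Loader (GGUF)" node is not built into ComfyUI; it comes from a custom-node pack (the widely used one is city96's ComfyUI-GGUF), installed via ComfyUI Manager or by cloning it into custom_nodes. A hypothetical check that it is in place, assuming the usual directory layout:

```python
# Hypothetical check that the GGUF node pack is installed, assuming the
# standard ComfyUI layout and the city96/ComfyUI-GGUF pack name.
from pathlib import Path

pack = Path("ComfyUI/custom_nodes/ComfyUI-GGUF")
if pack.is_dir():
    print("GGUF loader nodes available")
else:
    print("clone https://github.com/city96/ComfyUI-GGUF into custom_nodes")
```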
Hi @h12w, I've got an M3 Mac. I'm new to ComfyUI, trying to get V2V running with the Hunyuan video wrapper, and facing the fp8 issue.
I'm trying your method using GGUF files but need a little (OK, lots of) help with hooking things up. If you have a JSON file to share, that would be great.
Any links shared are much appreciated. :)
https://studywarehouse.com/solution-trying-to-convert-float8_e4m3fn-to-the-mps-backend-but-it-does-not-have-support-for-that-dtype/
This solution worked for me: https://github.com/comfyanonymous/ComfyUI/issues/6995#issuecomment-3024875418