FLUX Issue | MPS framework doesn't support float64
Expected Behavior
Inference runs and produces an image.
Actual Behavior
After 273.31 seconds, it throws an exception.
Steps to Reproduce
Load the example workflow for the Dev version from https://comfyanonymous.github.io/ComfyUI_examples/flux/ and run it.
Debug Logs
```
!!! Exception during processing!!! Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
Traceback (most recent call last):
File "/Users/alexgenovese/Desktop/2_comfy/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy_extras/nodes_custom_sampler.py", line 612, in sample
samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/samplers.py", line 716, in sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/samplers.py", line 695, in inner_sample
samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/samplers.py", line 600, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/k_diffusion/sampling.py", line 143, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/samplers.py", line 299, in __call__
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/samplers.py", line 682, in __call__
return self.predict_noise(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/samplers.py", line 685, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/samplers.py", line 279, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/custom_nodes/ComfyUI-TiledDiffusion/.patches.py", line 4, in calc_cond_batch
return calc_cond_batch_original_tiled_diffusion_91e66834(model, conds, x_in, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/samplers.py", line 228, in calc_cond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/custom_nodes/ComfyUI-Advanced-ControlNet/adv_control/utils.py", line 64, in apply_model_uncond_cleanup_wrapper
return orig_apply_model(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/model_base.py", line 121, in apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/ldm/flux/model.py", line 135, in forward
out = self.forward_orig(img, img_ids, context, txt_ids, timestep, y, guidance)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/ldm/flux/model.py", line 112, in forward_orig
pe = self.pe_embedder(ids)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/ldm/flux/layers.py", line 21, in forward
[rope(ids[..., i], self.axes_dim[i], self.theta) for i in range(n_axes)],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/alexgenovese/Desktop/2_comfy/comfy/ldm/flux/math.py", line 16, in rope
scale = torch.arange(0, dim, 2, dtype=torch.float64, device=pos.device) / dim
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
```
Other
No response
https://github.com/comfyanonymous/ComfyUI/commit/48eb1399c02bdae7e14b2208c448b69b382d0090
Can you check if this fixes it?
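For context, the failing line builds the RoPE frequency scale in float64, which MPS cannot represent. Below is a minimal sketch of the kind of dtype guard such a fix needs (not necessarily what that commit actually does; `rope_freqs` is an illustrative name, not a real function in the codebase):

```python
import torch

def rope_freqs(pos: torch.Tensor, dim: int, theta: int) -> torch.Tensor:
    # Mirrors the failing line in comfy/ldm/flux/math.py: MPS has no
    # float64 support, so fall back to float32 on that backend.
    dtype = torch.float32 if pos.device.type == "mps" else torch.float64
    scale = torch.arange(0, dim, 2, dtype=dtype, device=pos.device) / dim
    return 1.0 / (theta ** scale)
```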
On Mac it seems to run with default settings, but just gets a black image output. If I change it to fp8 as mentioned above, then Mac says MPS doesn't support that.
> On Mac it seems to run with default settings, but just gets a black image output. If I change it to fp8 as mentioned above, then Mac says MPS doesn't support that.
How much RAM do you have? For some reason both the original and fp8 models are taking around 40+ GB. Is it the same for you?
@tombearx I have a 64GB M1 Mac and a 16GB 3080 on my Windows machine. I use the Mac more at work, so I was trying there first.
It probably won't help fix it, but when I enable preview, I can see that as the image is generating, it is adding new stripes to the top of the image, and the actual image may be shifting down by a corresponding amount.
I'm also running on an M3 Max with 128GB RAM. Flux won't run at 8-bit at all; Comfy gives an error. The T5 model runs at 8 or 16, but that doesn't help with this issue. I updated PyTorch to the current daily build of 2.5.0, which also did not help.
If you're trying to run this model on an Apple Silicon Mac and having issues with broken image outputs, try downgrading torch with `pip install torch==2.3.1 torchaudio==2.3.1 torchvision==0.18.1`, as it seems that the latest stable version of torch has some bugs that break image generation. This is what I get with the unmodified example workflow on a 64GB M1 Max with torch 2.3.1, using the latest ComfyUI commit as of this post and the Flux Dev model (with the fp16 T5 text encoder, `t5xxl_fp16.safetensors`).
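A quick way to confirm the downgrade took effect inside the ComfyUI venv (standard PyTorch API, nothing ComfyUI-specific):

```python
import torch

print(torch.__version__)                  # should report 2.3.1 after the downgrade
print(torch.backends.mps.is_available())  # True on Apple Silicon builds
print(torch.backends.mps.is_built())      # True if this wheel was compiled with MPS support
```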
This workflow is working on my M3/128GB: https://civitai.com/models/617060/comfyui-workflow-for-flux-simple
OK guys, I pruned the weights; they're now 11GB with no quality loss. They load up faster and take way less space in VRAM... Not sure why they were not released pruned this way. They are still loaded in 8-bit though; I believe they should be in 16. Can fp16 be enabled in the loader as well? Because when I tried to add fp16 on my own, I think it loaded as default and generation was very slow compared to 8.
```python
class UNETLoader:
    @classmethod
    def INPUT_TYPES(s):
        return {"required": {"unet_name": (folder_paths.get_filename_list("unet"),),
                             "weight_dtype": (["default", "fp16", "fp8_e4m3fn", "fp8_e5m2"],)}}

    RETURN_TYPES = ("MODEL",)
    FUNCTION = "load_unet"
    CATEGORY = "advanced/loaders"

    def load_unet(self, unet_name, weight_dtype):
        weight_dtype = {"default": None,
                        "fp16": torch.float16,
                        "fp8_e4m3fn": torch.float8_e4m3fn,
                        "fp8_e5m2": torch.float8_e5m2}[weight_dtype]
        unet_path = folder_paths.get_full_path("unet", unet_name)
        model = comfy.sd.load_unet(unet_path, dtype=weight_dtype)
        return (model,)
```
Can you explain how to prune it? Where do I add this? Sorry if this is a noob question.
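For reference, "pruning" here appears to mean re-saving the checkpoint at lower precision; a minimal sketch, with placeholder file names:

```python
import torch
from safetensors.torch import load_file, save_file

# Re-save every tensor as fp16, roughly halving the on-disk size.
# Paths are placeholders; point them at your local files.
state = load_file("flux1-dev.safetensors")
state = {k: v.to(torch.float16) for k, v in state.items()}
save_file(state, "flux1-dev-fp16.safetensors")
```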
Sorry, I git pulled and checked the code. All clear. Thanks!
Well, the first shot did not work. I am on torch 2.3.1, Mac M2, 24GB. I loaded the Schnell model as fp8_e4m3fn. As seen below, it cannot use fp8 on MPS and it triggered a 5GB swap. I think I will wait for fixes to flow in.
```
Requested to load Flux
Loading 1 new model
python(4803) MallocStackLogging: can't turn off malloc stack logging because it was not enabled.
  0%|          | 0/4 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
  0%|          | 0/4 [00:04<?, ?it/s]
!!! Exception during processing!!! Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype.
Traceback (most recent call last):
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy_extras/nodes_custom_sampler.py", line 612, in sample
samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/samplers.py", line 716, in sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/samplers.py", line 695, in inner_sample
samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/samplers.py", line 600, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/comfyui/lib/python3.11/site-packages/torch/utils/contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/k_diffusion/sampling.py", line 143, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/samplers.py", line 299, in call
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/samplers.py", line 682, in call
return self.predict_noise(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/samplers.py", line 685, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/samplers.py", line 279, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/samplers.py", line 228, in calc_cond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/model_base.py", line 121, in apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/ldm/flux/model.py", line 143, in forward
out = self.forward_orig(img, img_ids, context, txt_ids, timestep, y, guidance)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/ldm/flux/model.py", line 101, in forward_orig
img = self.img_in(img)
^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/comfyui/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/ops.py", line 63, in forward
return self.forward_comfy_cast_weights(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/ops.py", line 58, in forward_comfy_cast_weights
weight, bias = cast_bias_weight(self, input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/ops.py", line 39, in cast_bias_weight
bias = cast_to(s.bias, dtype, device, non_blocking=non_blocking)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Volumes/d/apps/sdxl/comfi/ComfyUI/comfy/ops.py", line 24, in cast_to
return weight.to(device=device, dtype=dtype, non_blocking=non_blocking)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype.
Prompt executed in 218.28 seconds
```
> If you're trying to run this model on an Apple Silicon Mac and having issues with broken image outputs, try downgrading torch with `pip install torch==2.3.1 torchaudio==2.3.1 torchvision==0.18.1`, as it seems that the latest stable version of torch has some bugs that break image generation.
Yep. This is the way. Downgrading to these versions fixes generation for me on my M3 Max-based MacBook.
~~Still no luck yet on my M1 Max even after the torch downgrades.~~ I take that back. Just pulled the latest from this morning (just the clip_l encoder change?), and that combined with the earlier torch downgrade did fix it.
The latest MPS nightly is working for me.
Unless PyTorch supports the Float8_e4m3fn dtype on the MPS backend, people with less than 32GB of unified memory can forget about running these locally on Apple Silicon.
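Anyone wanting to verify what their own torch build accepts can probe the backend directly; a small sketch:

```python
import torch

# Try to materialize a tiny tensor of each dtype on the MPS device.
for dt in (torch.float16, torch.bfloat16, torch.float32,
           torch.float8_e4m3fn, torch.float8_e5m2, torch.float64):
    try:
        torch.zeros(1, dtype=dt, device="mps")
        print(f"{dt}: supported")
    except (TypeError, RuntimeError) as err:
        print(f"{dt}: unsupported ({err})")
```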
> The latest MPS nightly is working for me.
Nightly is still broken for me. The 2.3 downgrade works.
I tried the latest nightly. It "works" when using the normal CFGGuider node, but the output is extremely blurry. Using BasicGuider + the FluxGuidance node leads to noise.
[Edit]
Confirmed that the downgraded torch does work, though you need BasicGuider + the FluxGuidance node. The CFGGuider node still produces blurry output.
(Image pairs differ in scheduler between euler and bosh3 (custom ODE scheduler).)
Has anyone seen value in the new guider for Flux? If so, I will downgrade to try it. With the nightly I'm getting nice output with a guidance of 1.
> Unless PyTorch supports the Float8_e4m3fn dtype on the MPS backend, people with less than 32GB of unified memory can forget about running these locally on Apple Silicon.
Can't manage to run it even on a 32GB M1 Max. Has anyone succeeded?
@twalderman ~~I just tested and there might be something wrong with the guidance. I'm not seeing any difference between scale 1.0 and scale 4.5. Literally zero, when I subtract one image from the other.~~ Never mind, Comfy messed up somehow. How exactly did you get things working with the torch nightlies?
I didn't do anything unusual. I tested with the nightly and had no issues, so I didn't revert back. I have been generating images all day.
@twalderman Weird. What OS version are you using?
Here is an example of the differences you could expect from changing the guidance scale (1.0 to 4.0 in steps of 0.5; 4.5 is above; all using the bosh3 sampler).
Can you share your workflow? On my M1 Max it runs for 10 min and the pic is noisy.
> Can you share your workflow? On my M1 Max it runs for 10 min and the pic is noisy.
I used the workflow from the previous picture. I get around 90-100 s/it, probably because bf16 is not supported directly and the model uses much more RAM (and swap) than it should.
> Unless PyTorch supports the Float8_e4m3fn dtype on the MPS backend, people with less than 32GB of unified memory can forget about running these locally on Apple Silicon.
>
> Can't manage to run it even on a 32GB M1 Max. Has anyone succeeded?
It is a bit of a bad situation for us. I am at 24GB and cannot even dream of it.
@Adreitz I am using the latest Sequoia beta.
> Unless PyTorch supports the Float8_e4m3fn dtype on the MPS backend, people with less than 32GB of unified memory can forget about running these locally on Apple Silicon.
>
> Can't manage to run it even on a 32GB M1 Max. Has anyone succeeded?
>
> It is a bit of a bad situation for us. I am at 24GB and cannot even dream of it.
Looks like the RAM issue arises because the text encoders aren't unloaded from RAM on MPS. I opened an issue: https://github.com/comfyanonymous/ComfyUI/issues/4201
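Until that is fixed upstream, one hedged workaround is to flush the MPS allocator between the text-encoder and sampling stages:

```python
import gc
import torch

# Drop Python references first, then release cached MPS blocks back to
# the OS. This only helps if nothing still holds the encoder weights,
# which is exactly what the linked issue is about.
gc.collect()
if torch.backends.mps.is_available():
    torch.mps.empty_cache()
```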
> If you're trying to run this model on an Apple Silicon Mac and having issues with broken image outputs, try downgrading torch with `pip install torch==2.3.1 torchaudio==2.3.1 torchvision==0.18.1`, as it seems that the latest stable version of torch has some bugs that break image generation.
Perfect solution!
How long does it take to generate one image? Mine takes 10 min.
> How long does it take to generate one image? Mine takes 10 min.
M3 Max 64GB takes 210s (1024x1024, 30 steps).
> How long does it take to generate one image? Mine takes 10 min.
Just under 5 min on an M2 Max 64GB at 1024 with 20 steps.
I have an M1 Max 32GB but it still takes 10 min.
@dreamrec Can you share the workflow again? It's private, so I can't see it.
