nunchaku [Bug] Fail to load "fp4-shuttle-jaguar/fp4-flux.1-schnell"

Checklist

[x] 1. I have searched for related issues and FAQs (https://github.com/mit-han-lab/nunchaku/blob/main/docs/faq.md) but was unable to find a solution.
[x] 2. The issue persists in the latest version.
[x] 3. Please note that without environment information and a minimal reproducible example, it will be difficult for us to reproduce and address the issue, which may delay our response.
[ ] 4. If your report is a question rather than a bug, please submit it as a discussion at https://github.com/mit-han-lab/nunchaku/discussions/new/choose. Otherwise, this issue will be closed.
[ ] 5. If this is related to ComfyUI, please report it at https://github.com/mit-han-lab/ComfyUI-nunchaku/issues.
[x] 6. I will do my best to describe the issue in English.

Describe the Bug

Using the default example workflow, loading works for svdq-fp4-flux-dev/flux-fill but fails for fp4-shuttle-jaguar/fp4-flux.1-schnell with the following error:

GPU 0 (NVIDIA GeForce RTX 5060 Ti) Memory: 16310.5625 MiB
VRAM > 14GiB，disable CPU offload
[2025-05-08 13:47:49.674] [info] Initializing QuantizedFluxModel on device 0
[2025-05-08 13:47:49.719] [info] Loading weights from E:\comfy\ComfyUI\models\diffusion_models\svdq-fp4-shuttle-jaguar\transformer_blocks.safetensors
[2025-05-08 13:47:49.724] [warning] Failed to load safetensors using method READ: CUDA error: invalid argument (at E:\comfy\nunchaku\src\Tensor.h:77)
!!! Exception during processing !!! Failed to load safetensors
Traceback (most recent call last):
  File "E:\comfy\ComfyUI\execution.py", line 347, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\comfy\ComfyUI\execution.py", line 222, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\comfy\ComfyUI\execution.py", line 194, in _map_node_over_list
    process_inputs(input_dict, i)
  File "E:\comfy\ComfyUI\execution.py", line 183, in process_inputs
    results.append(getattr(obj, func)(**inputs))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\comfy\ComfyUI\custom_nodes\ComfyUI-nunchaku\nodes\models\flux.py", line 313, in load_model
    self.transformer = NunchakuFluxTransformer2dModel.from_pretrained(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\comfy\python\Lib\site-packages\huggingface_hub\utils_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "e:\comfy\nunchaku\nunchaku\models\transformers\transformer_flux.py", line 350, in from_pretrained
    m = load_quantized_module(
        ^^^^^^^^^^^^^^^^^^^^^^
  File "e:\comfy\nunchaku\nunchaku\models\transformers\transformer_flux.py", line 270, in load_quantized_module
    m.load(path)
RuntimeError: Failed to load safetensors

Prompt executed in 1.03 seconds

Environment

Windows 10, Python 3.12, PyTorch 2.7, CUDA 12.8, RTX 5060 Ti

Reproduction Steps

Run the default example workflow. It works for svdq-fp4-flux-dev/flux-fill but fails to load fp4-shuttle-jaguar/mit-han-lab/svdq-fp4-flux.1-schnell.

May 08 '25 06:05 zhgu-dev

Did you download the entire folder for fp4-shuttle-jaguar?

May 20 '25 05:05 lmxyy
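
One way to answer the question above is to check the local model folder for missing or truncated files. The following is only a minimal sketch: `find_missing_files` is a hypothetical helper, and the list of required file names is an assumption. Only `transformer_blocks.safetensors` appears in the log; any other names should be taken from the model repository's file listing.

```python
from pathlib import Path


def find_missing_files(model_dir, required):
    """Return the names in `required` that are absent or zero bytes in `model_dir`.

    `required` is supplied by the caller; this helper does not know which
    files a given nunchaku model actually needs.
    """
    root = Path(model_dir)
    missing = []
    for name in required:
        path = root / name
        # A partially downloaded file often shows up as missing or empty.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

For the folder in the log this could be called as `find_missing_files(r"E:\comfy\ComfyUI\models\diffusion_models\svdq-fp4-shuttle-jaguar", ["transformer_blocks.safetensors"])`. An empty list only means the listed files exist and are non-empty; it does not validate their contents.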

Starting from v0.3.0, you can simply use the safetensors in https://huggingface.co/mit-han-lab/nunchaku-shuttle-jaguar. FP4 is for 50-series GPUs. For other GPUs, please use INT4.

Jun 07 '25 00:06 lmxyy
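
The FP4 vs. INT4 rule above can be sketched as a small check on the GPU's CUDA compute capability. This is an illustration, not nunchaku's own logic: `choose_precision` and `detect_precision` are hypothetical names, and the mapping assumes that "50-series" corresponds to compute capability 12.0 and that every other GPU should use INT4.

```python
def choose_precision(capability):
    """Map a (major, minor) compute capability to "fp4" or "int4".

    Assumption: RTX 50-series GPUs report capability (12, 0);
    everything else falls back to INT4, as the comment above suggests.
    """
    major, minor = capability
    return "fp4" if (major, minor) == (12, 0) else "int4"


def detect_precision(device=0):
    """Query the local GPU through PyTorch and pick a precision for it."""
    import torch  # imported lazily so choose_precision works without PyTorch

    return choose_precision(torch.cuda.get_device_capability(device))
```

For example, `choose_precision((8, 9))` returns `"int4"`, while the RTX 5060 Ti from this report would be expected to select `"fp4"` under the stated assumption.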