`NotImplementedError: Cannot copy out of meta tensor; no data!`
First off, thanks to @aredden and all the contributors to this repo. Open-source work like this is so valuable. @prodialabs would love to support it in any way possible.
Steps to Reproduce

- Setup torch `2.4.0` with cuda `12.4` (our box has an H100)
- Clone `flux-fp8-api` + pip install dependencies
- Use `huggingface-cli` to download the official BFL Flux Dev + VAE:

  ```shell
  huggingface-cli download black-forest-labs/FLUX.1-dev flux1-dev.safetensors
  huggingface-cli download black-forest-labs/FLUX.1-dev vae/diffusion_pytorch_model.safetensors
  ```

- Edit config `configs/config-dev-1-RTX6000ADA.json` to use `ckpt_path` and `ae_path` from the downloaded BFL models
- Run a simple pipeline load:

  ```python
  from flux_pipeline import FluxPipeline

  pipeline = FluxPipeline.load_pipeline_from_config_path("configs/config-dev-1-RTX6000ADA.json")
  ```
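For the config edit step, the fields in question look like this (the key names `ckpt_path` and `ae_path` come from the repo's configs; the paths are illustrative placeholders for wherever `huggingface-cli` put the files):

```json
{
  "ckpt_path": "/path/to/flux1-dev.safetensors",
  "ae_path": "/path/to/vae/diffusion_pytorch_model.safetensors"
}
```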
```
decoder.up_blocks.3.resnets.0.norm2.bias
decoder.up_blocks.3.resnets.0.norm2.weight
decoder.up_blocks.3.resnets.1.conv1.bias
decoder.up_blocks.3.resnets.1.conv1.weight
decoder.up_blocks.3.resnets.1.conv2.bias
decoder.up_blocks.3.resnets.1.conv2.weight
decoder.up_blocks.3.resnets.1.norm1.bias
decoder.up_blocks.3.resnets.1.norm1.weight
decoder.up_blocks.3.resnets.1.norm2.bias
decoder.up_blocks.3.resnets.1.norm2.weight
decoder.up_blocks.3.resnets.2.conv1.bias
decoder.up_blocks.3.resnets.2.conv1.weight
decoder.up_blocks.3.resnets.2.conv2.bias
decoder.up_blocks.3.resnets.2.conv2.weight
decoder.up_blocks.3.resnets.2.norm1.bias
decoder.up_blocks.3.resnets.2.norm1.weight
decoder.up_blocks.3.resnets.2.norm2.bias
decoder.up_blocks.3.resnets.2.norm2.weight
Traceback (most recent call last):
  File "/root/flux-fp8-api/monty.py", line 3, in <module>
    pipeline = FluxPipeline.load_pipeline_from_config_path("configs/config-dev-1-RTX6000ADA.json")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/flux-fp8-api/flux_pipeline.py", line 665, in load_pipeline_from_config_path
    return cls.load_pipeline_from_config(config, debug=debug)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/flux-fp8-api/flux_pipeline.py", line 679, in load_pipeline_from_config
    models = load_models_from_config(config)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/flux-fp8-api/util.py", line 324, in load_models_from_config
    ae=load_autoencoder(config),
       ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/flux-fp8-api/util.py", line 282, in load_autoencoder
    ae.to(device=into_device(config.ae_device), dtype=into_dtype(config.ae_dtype))
  File "/root/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1174, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/root/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  File "/root/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  File "/root/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "/root/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 805, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/root/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1167, in convert
    raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
```
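The final error can be reproduced in isolation, independent of this repo. A minimal sketch: build a module whose parameters live on the `meta` device (which is what happens to any parameter whose state-dict key never matched during loading) and watch `.to()` fail where `to_empty()` does not:

```python
import torch
import torch.nn as nn

# Parameters created under the "meta" device have shapes but no data,
# mimicking weights that were never materialized from a checkpoint.
with torch.device("meta"):
    layer = nn.Linear(4, 4)

# Moving a meta module with .to() raises the same NotImplementedError,
# since there is no data to copy.
try:
    layer.to("cpu")
except NotImplementedError as err:
    print(type(err).__name__)  # NotImplementedError

# to_empty() allocates fresh (uninitialized) storage instead of copying.
layer = layer.to_empty(device="cpu")
print(layer.weight.device)  # cpu
```

So the traceback is a symptom: something upstream left part of the autoencoder unmaterialized.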
I used the following links to get the files with wget and had no issues:

```shell
urls=(
  "https://huggingface.co/Kijai/flux-fp8/resolve/main/flux1-schnell-fp8.safetensors"
  "https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors"
  "https://huggingface.co/Kijai/flux-fp8/resolve/main/flux1-dev-fp8.safetensors"
)
```
What's the benefit of using these weights vs. the full model file? Just speed?
The error you were getting is the result of some state-dict keys not matching. It could be that the guidance embed is being created based on the given config, but the guidance embed doesn't exist for schnell, so it would remain on the "meta" device and throw that error. If that's not it, diffusers weights won't work either, since that's a different state dict.

Also, if you're using fp8 checkpoints from Civitai etc., those are quantized differently than in my repo, so you probably won't get as much precision: I scale the weights into fp8 range rather than directly converting to fp8. That also means they won't work with the "prequantized_flow" config; you have to save the state dict from the flow model in this repo after running at least ~30 steps so it can configure the input scales. @montyanderson
edit: Just realized it's because you're using diffusers weights, which do have a separate state dict, so none of the keys will match and it throws that error.
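For anyone hitting this, a quick way to confirm a key mismatch before the load ever reaches `.to()` is to diff the checkpoint's keys against the model's expected keys. This is a hypothetical helper, not part of this repo; the diffusers-style key below is taken from the log above:

```python
import torch.nn as nn

def report_key_mismatch(model: nn.Module, state_dict: dict):
    """Return (missing, unexpected) key sets without loading anything."""
    model_keys = set(model.state_dict().keys())
    ckpt_keys = set(state_dict.keys())
    return model_keys - ckpt_keys, ckpt_keys - model_keys

# Toy example: a diffusers-style key doesn't exist in a differently
# structured model, so every key lands in one of the two diff sets.
model = nn.Sequential(nn.Linear(2, 2))
fake_ckpt = {"decoder.up_blocks.3.resnets.0.norm2.bias": None}
missing, unexpected = report_key_mismatch(model, fake_ckpt)
print(sorted(missing))     # ['0.bias', '0.weight']
print(sorted(unexpected))  # ['decoder.up_blocks.3.resnets.0.norm2.bias']
```

Every `missing` parameter is one that would stay on the meta device and later trigger the `Cannot copy out of meta tensor` error.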