Add GGUF loader for FluxTransformer2DModel
GGUF is becoming a preferred means of distribution for FLUX fine-tunes.
Transformers recently added general support for GGUF and is slowly adding support for additional model types (the implementation adds a gguf_file param to the from_pretrained method). That PR adds support for loading GGUF files into T5EncoderModel.
I've tested the code with the quants available at https://huggingface.co/city96/t5-v1_1-xxl-encoder-gguf/tree/main and it's working with the current Flux implementation in diffusers (a minimal usage sketch follows the examples below).
However, as FluxTransformer2DModel is defined in the diffusers library, support has to be added here to be able to load the actual transformer model, which is what most (if not all) FLUX fine-tunes ship.
Examples that can be used:
- https://civitai.com/models/657607/gguf-fastflux-flux1-schnell-merged-with-flux1-dev with weights quantized as q4_0, q4_1, q5_0, q5_1
- https://civitai.com/models/662958/flux1-dev-gguf-f16 with weights simply converted from f16
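For reference on the text-encoder side, loading one of those T5 quants via the new transformers support looks roughly like this; a minimal sketch, assuming a recent transformers version, and the quant filename is an assumption (pick any file from the repo above):

from transformers import T5EncoderModel

# Sketch of the transformers-side GGUF load via the new gguf_file param.
# The filename below is assumed; use any quant from the linked repo.
text_encoder_2 = T5EncoderModel.from_pretrained(
    "city96/t5-v1_1-xxl-encoder-gguf",
    gguf_file="t5-v1_1-xxl-encoder-Q8_0.gguf",
)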
cc: @yiyixuxu @sayakpaul @DN6
Perhaps after #9213.
Note that exotic FPX schemes are already supported (FP6, FP5, FP4) with torchao. Check out this repo for that: https://github.com/sayakpaul/diffusers-torchao
yes, i'm following that pr closely :)
Also, the torchao work makes all this easier. The request here is not to reimplement any of the quantization work done so far, but to add a diffusers equivalent of transformers.modeling_gguf_pytorch_utils.load_gguf_checkpoint(), which returns a state_dict (with key re-mapping as needed); the rest of the load can then proceed as-is. A minimal sketch of what such a helper could look like follows.
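This sketch assumes gguf-py's GGUFReader and dequantize helpers, and a caller-supplied remap_key callable (hypothetical) for the key re-mapping; it eagerly dequantizes everything, whereas a real integration would likely keep weights quantized and dequantize per-op:

import torch
from gguf import GGUFReader
from gguf.quants import dequantize

def load_gguf_state_dict(path, compute_dtype=torch.float16, remap_key=None):
    # remap_key is a hypothetical callable translating GGUF tensor names to
    # the target model's parameter names, analogous to the re-mapping done in
    # transformers.modeling_gguf_pytorch_utils.load_gguf_checkpoint().
    reader = GGUFReader(path)
    state_dict = {}
    for tensor in reader.tensors:
        name = remap_key(tensor.name) if remap_key else tensor.name
        data = dequantize(tensor.data, tensor.tensor_type)  # unpack blocks to float32
        # GGUF records dimensions in reverse order relative to torch
        shape = tuple(int(d) for d in reversed(tensor.shape))
        state_dict[name] = torch.from_numpy(data.copy()).reshape(shape).to(compute_dtype)
    return state_dict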
Yeah for sure. Thanks for following along!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Right up our alley. Cc: @DN6
@sayakpaul @DN6 if you want to take a look...
A simple implementation of a generic GGUF loader that loads a state_dict:
https://github.com/vladmandic/automatic/blob/dev/modules/ggml/__init__.py
From there it's simple to create the diffusers class; I later use it to create a FluxTransformer2DModel in https://github.com/vladmandic/automatic/blob/56ec09fac8db9fa01f2eeff8f955ef6c91f85451/modules/model_flux.py#L111 (see the sketch below).
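In outline, that second step amounts to something like this; a sketch reusing the hypothetical load_gguf_state_dict from above, with the config location an assumption:

from diffusers import FluxTransformer2DModel
from diffusers.loaders.single_file_utils import convert_flux_transformer_checkpoint_to_diffusers

# Dequantized GGUF state_dict in, diffusers model out.
state_dict = load_gguf_state_dict("flux1-dev-Q8_0.gguf")
state_dict = convert_flux_transformer_checkpoint_to_diffusers(state_dict)
config = FluxTransformer2DModel.load_config("black-forest-labs/FLUX.1-dev", subfolder="transformer")
transformer = FluxTransformer2DModel.from_config(config)
transformer.load_state_dict(state_dict, strict=False)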
Can you provide a simple demo of using a GGUF-format model in diffusers? I don't know exactly how to use the GGUF model.
from modules.model_flux import load_flux_gguf
import torch, pdb
from diffusers import FluxPipeline
file_path = '/maindata/data/shared/public/yang.zhang/models/flux/flux-schnell-dev-merge-q4-1.gguf'
transformer, _ = load_flux_gguf(file_path)
dtype = torch.float16
bfl_repo = '/maindata/data/shared/public/yang.zhang/models/flux/FLUX.1-dev'
pipe = FluxPipeline.from_pretrained(bfl_repo, torch_dtype=dtype, transformer=transformer).to('cuda')
prompt = 'a cat'
cfg = 3.5
step = 30
image = pipe(
prompt,
height=1024,
width=1024,
guidance_scale=cfg,
num_inference_steps=step,
max_sequence_length=512,
generator=torch.Generator("cpu").manual_seed(100),
).images[0]
image.save(f"res/flux-gguf_cfg{cfg}_step{step}.png")
I use the above code but get this error:
06:52:09-580434 INFO Device detect: memory=79.3 optimization=none
06:52:09-592030 INFO Engine: backend=Backend.DIFFUSERS compute=cuda device=cuda attention="Scaled-Dot-Product" mode=no_grad
06:52:09-597271 ERROR styles failed to migrate: file="styles.csv" error=partially initialized module 'modules.shared' has no attribute 'max_workers' (most likely due to a circular import)
06:52:09-610222 INFO Torch parameters: backend=cuda device=cuda config=Auto dtype=torch.bfloat16 vae=torch.bfloat16 unet=torch.bfloat16 context=no_grad nohalf=False nohalfvae=False
upscast=False deterministic=False test-fp16=True test-bf16=True optimization="Scaled-Dot-Product"
06:52:09-613630 ERROR Package: ['onnx'] 'NoneType' object has no attribute 'working_set'
06:52:09-643598 INFO Device: device=NVIDIA A800-SXM4-80GB n=2 arch=sm_90 capability=(8, 0) cuda=12.1 cudnn=90100 driver=470.161.03
06:52:09-661209 ERROR Package: ['gguf'] 'NoneType' object has no attribute 'working_set'
06:52:09-662821 INFO Install: package="gguf" mode=pip
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:22<00:00, 11.45s/it]
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:24<00:00, 3.52s/it]
0%| | 0/30 [00:00<?, ?it/s]
╭──────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ────────────────────────────────────────────────────────────────────────────────╮
│ /maindata/data/shared/public/songtao.tian/flux/gguf/automatic/test_gguf.py:18 in <module> │
│ │
│ 17 step = 30 │
│ ❱ 18 image = pipe( │
│ 19 prompt, │
│ │
│ /home/songtao.tian/anaconda3/envs/gguf/lib/python3.10/site-packages/torch/utils/_contextlib.py:116 in decorate_context │
│ │
│ 115 with ctx_factory(): │
│ ❱ 116 return func(*args, **kwargs) │
│ 117 │
│ │
│ /home/songtao.tian/anaconda3/envs/gguf/lib/python3.10/site-packages/diffusers/pipelines/flux/pipeline_flux.py:730 in __call__ │
│ │
│ 729 │
│ ❱ 730 noise_pred = self.transformer( │
│ 731 hidden_states=latents, │
│ │
│ /home/songtao.tian/anaconda3/envs/gguf/lib/python3.10/site-packages/torch/nn/modules/module.py:1553 in _wrapped_call_impl │
│ │
│ 1552 else: │
│ ❱ 1553 return self._call_impl(*args, **kwargs) │
│ 1554 │
│ │
│ /home/songtao.tian/anaconda3/envs/gguf/lib/python3.10/site-packages/torch/nn/modules/module.py:1562 in _call_impl │
│ │
│ 1561 or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1562 return forward_call(*args, **kwargs) │
│ 1563 │
│ │
│ /home/songtao.tian/anaconda3/envs/gguf/lib/python3.10/site-packages/diffusers/models/transformers/transformer_flux.py:447 in forward │
│ │
│ 446 ) │
│ ❱ 447 hidden_states = self.x_embedder(hidden_states) │
│ 448 │
│ │
│ /home/songtao.tian/anaconda3/envs/gguf/lib/python3.10/site-packages/torch/nn/modules/module.py:1553 in _wrapped_call_impl │
│ │
│ 1552 else: │
│ ❱ 1553 return self._call_impl(*args, **kwargs) │
│ 1554 │
│ │
│ /home/songtao.tian/anaconda3/envs/gguf/lib/python3.10/site-packages/torch/nn/modules/module.py:1562 in _call_impl │
│ │
│ 1561 or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1562 return forward_call(*args, **kwargs) │
│ 1563 │
│ │
│ /home/songtao.tian/anaconda3/envs/gguf/lib/python3.10/site-packages/torch/nn/modules/linear.py:117 in forward │
│ │
│ 116 def forward(self, input: Tensor) -> Tensor: │
│ ❱ 117 return F.linear(input, self.weight, self.bias) │
│ 118 │
│ │
│ /maindata/data/shared/public/songtao.tian/flux/gguf/automatic/modules/ggml/gguf_tensor.py:148 in __torch_dispatch__ │
│ │
│ 147 if func in GGML_TENSOR_OP_TABLE: │
│ ❱ 148 return GGML_TENSOR_OP_TABLE[func](func, args, kwargs) │
│ 149 else: │
│ │
│ /maindata/data/shared/public/songtao.tian/flux/gguf/automatic/modules/ggml/gguf_tensor.py:18 in dequantize_and_run │
│ │
│ 17 } │
│ ❱ 18 return func(*dequantized_args, **dequantized_kwargs) │
│ 19 │
│ │
│ /home/songtao.tian/anaconda3/envs/gguf/lib/python3.10/site-packages/torch/_ops.py:667 in __call__ │
│ │
│ 666 # are named "self". This way, all the aten ops can be called by kwargs. │
│ ❱ 667 return self_._op(*args, **kwargs) │
│ 668 │
RuntimeError: mat1 and mat2 must have the same dtype, but got Half and BFloat16
I get the same error, did you solve it?
Set dtype = torch.bfloat16 in this demo, then run it again. If a new error appears, locate it and set q, k, v to the same dtype.
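That is, in the demo above it is a one-line change (the rest stays the same):

# Match the pipeline dtype to the dequantized GGUF weights (bfloat16)
dtype = torch.bfloat16
pipe = FluxPipeline.from_pretrained(bfl_repo, torch_dtype=dtype, transformer=transformer).to('cuda')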
Hello, I use the GGUF q5 model, but GPU memory usage is higher. Did your GPU memory usage go down?
Any progress here?
I successfully loaded the weights in GGUF format, but only models with 0/1 suffixes work; those with K/S suffixes do not. (diffusers-0.31.0-dev)
import ggml  # the ggml helper module from vladmandic/automatic (modules/ggml)
import torch

def load_flux_gguf(file_path, transformer_config, dtype, device):
    from accelerate import init_empty_weights
    from diffusers import FluxTransformer2DModel
    from diffusers.loaders.single_file_utils import convert_flux_transformer_checkpoint_to_diffusers
    # Build an empty (meta-device) transformer from the config, then fill it
    # with tensors dequantized from the GGUF file.
    with init_empty_weights():
        config = FluxTransformer2DModel.load_config(transformer_config)
        transformer = FluxTransformer2DModel.from_config(config).to(dtype)
        expected_state_dict_keys = list(transformer.state_dict().keys())
    state_dict, stats = ggml.load_gguf_state_dict(file_path, dtype)
    # Remap keys from the original BFL checkpoint layout to the diffusers layout
    state_dict = convert_flux_transformer_checkpoint_to_diffusers(state_dict)
    applied, skipped = 0, 0
    for param_name, param in state_dict.items():
        if param_name not in expected_state_dict_keys:
            skipped += 1
            continue
        applied += 1
        hijack_set_module_tensor_simple(transformer, tensor_name=param_name, value=param, device=device)
        state_dict[param_name] = None  # free the source tensor as we go
    return transformer, None

def hijack_set_module_tensor_simple(module, tensor_name, device, value):
    # Walk the module tree to the direct parent of the target tensor
    if "." in tensor_name:
        splits = tensor_name.split(".")
        for split in splits[:-1]:
            module = getattr(module, split)
        tensor_name = splits[-1]
    old_value = getattr(module, tensor_name)  # raises early if the name is wrong
    with torch.no_grad():
        if tensor_name in module._buffers:
            module._buffers[tensor_name] = value.to(device, non_blocking=True)
        elif value is not None:
            # Preserve the parameter class so tensor subclasses survive the swap
            param_cls = type(module._parameters[tensor_name])
            module._parameters[tensor_name] = param_cls(value, requires_grad=False).to(device, non_blocking=True)

unet_path = '/yourpath/flux1-dev-Q8_0.gguf'
transformer_config = '/yourpath/flux-dev/transformer'
dtype = torch.float16
device = 'cuda:0'
gguf_transformer, _ = load_flux_gguf(unet_path, transformer_config, dtype, device)
import torch
from diffusers import FluxPipeline

dtype = torch.float16
pipe = FluxPipeline.from_pretrained("/yourpath/flux-dev", torch_dtype=dtype)
pipe.transformer = gguf_transformer  # swap in the GGUF-loaded transformer
pipe.to('cuda:0')
prompt = "minimalism,Chinese ink painting,ink painting,close-up,1girl,solo,portrait,closed_eyes,eyeshadow,gloves,makeup,lipstick jewelry,earrings,necklace,hat,long hair,dress,high qulity,extremely detaile,offcial art,Uniform 8K wallpaper,super detailing,32K,"
image = pipe(
    prompt,
    guidance_scale=3.5,
    output_type="pil",
    num_inference_steps=20,
    generator=torch.Generator("cpu").manual_seed(1024)
).images[0]
@zhaowendao30 thanks for this!
Could you maybe modify your comment to include ggml installation instructions and the checkpoint you used?
The ggml module lives at https://github.com/vladmandic/automatic/blob/dev/modules/ggml; copy it to your local path.
Thanks! And the checkpoint you used?
https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main. Only the models with 0/1 suffixes (q4_0, q4_1, q5_0, q5_1, q8_0) work; those with K/S suffixes do not, presumably because k-quants use a super-block layout that needs its own dequantization path.
Just one word of caution: the code relies on the gguf package, which has a really bad installer; see https://github.com/ggerganov/llama.cpp/issues/9566
Being worked on in https://github.com/huggingface/diffusers/pull/9964
Closing since #9964 was merged. Feel free to reopen if there are any issues.
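For reference, the API that landed in #9964 looks roughly like this; a sketch from memory, so check the current diffusers GGUF docs for the exact signature (the checkpoint URL is one of the city96 quants mentioned above):

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Load the GGUF transformer directly, keeping weights quantized and
# dequantizing on the fly at the given compute dtype.
transformer = FluxTransformer2DModel.from_single_file(
    "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q2_K.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)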