hlky
I've added a `# Copied from` statement and removed the imports from the z-image transformer.
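For context, a minimal illustration of the `# Copied from` convention; the class name and module path below are hypothetical, not the ones touched in this PR:

```python
import torch.nn as nn


# The marker below tells diffusers' `make fix-copies` tooling to keep this block
# in sync with the referenced implementation instead of importing it.
# The referenced path is hypothetical, for illustration only.
# Copied from diffusers.models.transformers.transformer_z_image.ZImageFeedForward
class ZImageControlNetFeedForward(nn.Module):
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, dim))

    def forward(self, x):
        return self.net(x)
```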
[`alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.0`](https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.0) has been released. https://github.com/huggingface/diffusers/pull/12792/commits/04388f4698b303785b26bec6179a55aea652a388 should be ok for the [modeling changes](https://github.com/aigc-apps/VideoX-Fun/commit/3c0f159e016c96fad9b4cab9967aaf627ffdfedd); I will add the inpaint pipeline and test inference later. Loading the v2 checkpoint is tested:

```python
import torch
from diffusers import...
```
1.0:

```python
import torch
from diffusers import ZImageControlNetPipeline
from diffusers import ZImageControlNetModel
from diffusers.utils import load_image
from huggingface_hub import hf_hub_download

controlnet = ZImageControlNetModel.from_single_file(
    hf_hub_download(
        "alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union",
        filename="Z-Image-Turbo-Fun-Controlnet-Union.safetensors",
    ),
    torch_dtype=torch.bfloat16,
)
pipe...
```
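For anyone who wants to try it before the docs land, a hedged sketch of how the rest of that snippet might continue, assuming the pipeline follows the usual diffusers ControlNet pattern; the base repo id (`Tongyi-MAI/Z-Image-Turbo`), the call arguments (`control_image`, `num_inference_steps`, `guidance_scale`), and the seed are assumptions, not the final API of this PR:

```python
import torch
from diffusers import ZImageControlNetModel, ZImageControlNetPipeline
from diffusers.utils import load_image
from huggingface_hub import hf_hub_download

# Load the ControlNet from the single-file checkpoint, as in the snippet above.
controlnet = ZImageControlNetModel.from_single_file(
    hf_hub_download(
        "alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union",
        filename="Z-Image-Turbo-Fun-Controlnet-Union.safetensors",
    ),
    torch_dtype=torch.bfloat16,
)

# Hypothetical continuation: plug the ControlNet into the pipeline and run it.
# The base repo id and all call parameters below are assumptions for the sketch.
pipe = ZImageControlNetPipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo", controlnet=controlnet, torch_dtype=torch.bfloat16
)
pipe.to("cuda")

control_image = load_image(
    "https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.0/resolve/main/asset/pose.jpg"
)
image = pipe(
    prompt="a young woman standing on a sunny coastline",
    control_image=control_image,
    num_inference_steps=8,
    guidance_scale=1.0,
    generator=torch.Generator("cuda").manual_seed(43),
).images[0]
image.save("output.png")
```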
We can remove the `control_noise_refiner.` keys from the 2.0 state_dict and from the model, since they are unused, to save some VRAM. This did require changing `_should_convert_state_dict_to_diffusers`, though, because `set(model_state_dict.keys())` is a subset of `set(checkpoint_state_dict.keys())`, ...
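To make the key-comparison issue concrete, here's a minimal sketch of the subset check, with hypothetical function and variable names rather than the actual `_should_convert_state_dict_to_diffusers` implementation:

```python
import torch


def should_convert_to_diffusers(model_state_dict, checkpoint_state_dict):
    # Hypothetical sketch: once the unused `control_noise_refiner.` keys are
    # dropped from the model, the model's keys become a strict subset of the
    # checkpoint's keys, so an exact-equality comparison would wrongly report
    # that the checkpoint still needs conversion. A subset check avoids that.
    model_keys = set(model_state_dict.keys())
    checkpoint_keys = set(checkpoint_state_dict.keys())
    return not model_keys.issubset(checkpoint_keys)


# Tiny usage example with dummy tensors.
model_sd = {"control_layers.0.weight": torch.zeros(1)}
ckpt_sd = {
    "control_layers.0.weight": torch.zeros(1),
    "control_noise_refiner.0.weight": torch.zeros(1),
}
print(should_convert_to_diffusers(model_sd, ckpt_sd))  # False: every model key is already present
```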
Thanks @elismasilva @iwr-redmond, I saw that. We will have to see what happens; for now 2.0 works. If another version is released I'll update this PR or make another one.
[24f454c](https://github.com/huggingface/diffusers/pull/12792/commits/24f454ced2ca7366146d269b7ff0fd2bb03f744a) I've changed `add_control_noise_refiner` from `bool` to `Literal["control_layers", "control_noise_refiner"]`, where `control_layers` is `2.0` and `control_noise_refiner` is `2.1`. The keys in the weights are the same between 2.0 and 2.1; we have...
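A minimal sketch of that kind of `Literal`-typed config switch (class name, layer types, and branch contents are purely illustrative, not the code in the commit):

```python
from typing import Literal

import torch.nn as nn


class ControlNetSketch(nn.Module):
    # Illustrative only: the same flag, typed as a Literal instead of a bare
    # bool, selects which variant of the refiner stack gets built.
    def __init__(
        self,
        dim: int = 64,
        num_refiner_layers: int = 2,
        add_control_noise_refiner: Literal["control_layers", "control_noise_refiner"] = "control_layers",
    ):
        super().__init__()
        if add_control_noise_refiner == "control_layers":
            # "2.0"-style branch (contents are placeholders).
            self.refiner = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_refiner_layers))
        elif add_control_noise_refiner == "control_noise_refiner":
            # "2.1"-style branch (contents are placeholders).
            self.refiner = nn.ModuleList(
                nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))
                for _ in range(num_refiner_layers)
            )
        else:
            raise ValueError(f"Unknown add_control_noise_refiner: {add_control_noise_refiner!r}")
```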
@elismasilva I don't think there's any need for them to make changes to their own repo; at most we should open a PR with only the config to their Hub...
Prompt: `一位年轻女子站在阳光明媚的海岸线上,白裙在轻拂的海风中微微飘动。她拥有一头鲜艳的紫色长发,在风中轻盈舞动,发间系着一个精致的黑色蝴蝶结,与身后柔和的蔚蓝天空形成鲜明对比。她面容清秀,眉目精致,透着一股甜美的青春气息;神情柔和,略带羞涩,目光静静地凝望着远方的地平线,双手自然交叠于身前,仿佛沉浸在思绪之中。在她身后,是辽阔无垠、波光粼粼的大海,阳光洒在海面上,映出温暖的金色光晕。` (English: "A young woman stands on a sunlit coastline, her white dress fluttering gently in the light sea breeze. She has long, vivid purple hair dancing lightly in the wind, tied with a delicate black bow that contrasts sharply with the soft azure sky behind her. Her features are delicate and refined, with a sweet, youthful air; her expression is gentle and slightly shy, her gaze quietly fixed on the distant horizon, hands folded naturally in front of her as if lost in thought. Behind her stretches the vast, shimmering sea, sunlight spilling across the water in a warm golden glow.")

[Control image](https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.0/resolve/main/asset/pose.jpg?download=true)

Note: Parameters (and seed) match the original code, but the same prompt is used for both versions here; in the original code examples the prompt changes between versions.

1.0:...
This is incorrect. Minimal reproduction:

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def get_freqs(dim, max_period=10000.0):
    freqs = torch.exp(
        -math.log(max_period)
        * torch.arange(start=0, end=dim, dtype=torch.float32)
        /...
```
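For context, a hedged sketch of the standard sinusoidal timestep-frequency computation that `get_freqs` appears to be building (the divisor and the cos/sin concatenation are assumptions based on the usual formulation, not the exact reproduction code):

```python
import math

import torch


def sinusoidal_timestep_embedding(timesteps: torch.Tensor, dim: int, max_period: float = 10000.0) -> torch.Tensor:
    # Frequencies follow the usual exp(-ln(max_period) * i / half_dim) schedule;
    # whether the reproduction divides by `dim` or `dim // 2` is an assumption here.
    half_dim = dim // 2
    freqs = torch.exp(
        -math.log(max_period)
        * torch.arange(start=0, end=half_dim, dtype=torch.float32)
        / half_dim
    )
    args = timesteps.float()[:, None] * freqs[None, :]
    return torch.cat([torch.cos(args), torch.sin(args)], dim=-1)


emb = sinusoidal_timestep_embedding(torch.tensor([0, 250, 999]), dim=256)
print(emb.shape)  # torch.Size([3, 256])
```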
@ddpasa `encoder.embed_tokens.weight` is probably the only affected key, see: https://github.com/huggingface/transformers/blob/6db4332171df2b4099c44c7a5c01258b91f7394a/src/transformers/models/t5/modeling_t5.py#L1181-L1186

This should be a fairly simple fix to `convert_sd3_t5_checkpoint_to_diffusers`, something like:

```diff
diff --git a/src/diffusers/loaders/single_file_utils.py b/src/diffusers/loaders/single_file_utils.py
index d4676ba25..2678976e9 100644
---...
```
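The general shape of such a fix, as a hedged Python sketch of the key-mirroring idea only (the actual diff to `convert_sd3_t5_checkpoint_to_diffusers` and the direction of the copy are not reproduced here):

```python
import torch


def remap_t5_shared_embedding(checkpoint: dict) -> dict:
    # In transformers' T5, the encoder's embed_tokens is tied to the shared
    # embedding, so a checkpoint may carry only one of the two keys. This sketch
    # mirrors whichever key is present onto the other; whether the real fix goes
    # `shared.weight` -> `encoder.embed_tokens.weight` or the reverse is an
    # assumption here.
    if "shared.weight" in checkpoint and "encoder.embed_tokens.weight" not in checkpoint:
        checkpoint["encoder.embed_tokens.weight"] = checkpoint["shared.weight"]
    elif "encoder.embed_tokens.weight" in checkpoint and "shared.weight" not in checkpoint:
        checkpoint["shared.weight"] = checkpoint["encoder.embed_tokens.weight"]
    return checkpoint


# Tiny usage example with a dummy (not full-size) embedding tensor.
ckpt = {"encoder.embed_tokens.weight": torch.zeros(8, 4)}
ckpt = remap_t5_shared_embedding(ckpt)
print(sorted(ckpt.keys()))  # ['encoder.embed_tokens.weight', 'shared.weight']
```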