flux support with fp8 freeze model
FLUX1 support
- minimal Flux1 support
based on https://github.com/wkpark/stable-diffusion-webui/pull/3 (misc optimization fixes are excluded for review)
- support FP8 freeze model loading
- devices.autocast() revised to support fp8 freeze models with mixed storage dtypes (e.g. fp8 text encoder + fp8 unet + bf16 vae; see the sketch after this list)
- FLUX1 support patches added, based on the previous SD3 work done by A1111 - https://github.com/AUTOMATIC1111/stable-diffusion-webui/commits/master/modules/models/sd3
- and comfyui's flux fixes - https://github.com/comfyanonymous/ComfyUI/commits/master/comfy/ldm/flux
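For reference, the basic idea of an fp8 "freeze" model is to keep the weights stored in float8 and only upcast them to the compute dtype at forward time. A minimal sketch of that pattern (plain PyTorch >= 2.1; FrozenFP8Linear is a made-up name for illustration, not the actual patched devices.autocast() code):

import torch

class FrozenFP8Linear(torch.nn.Module):
    def __init__(self, weight_fp8, bias=None, compute_dtype=torch.bfloat16):
        super().__init__()
        # storage stays float8_e4m3fn; nothing is converted at load time
        self.register_buffer("weight", weight_fp8)
        self.register_buffer("bias", bias)
        self.compute_dtype = compute_dtype

    def forward(self, x):
        # upcast just-in-time, so only the stored copy is fp8
        w = self.weight.to(self.compute_dtype)
        b = self.bias.to(self.compute_dtype) if self.bias is not None else None
        return torch.nn.functional.linear(x.to(self.compute_dtype), w, b)

layer = FrozenFP8Linear(torch.randn(16, 8).to(torch.float8_e4m3fn))
print(layer(torch.randn(2, 8)).dtype)  # torch.bfloat16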
Usage
- requires 13~14GB of GPU VRAM (with a float8 freeze model)
- use the --medvram cmd option (example launch command below)
- minimal pytorch version: v2.1.2
- download a FLUX1 FP8 safetensors model file from civitai or the original huggingface repo (a ~16GB model normally has an fp8 unet + fp8 text encoder + bf16 vae)
- use Euler
- CFG 1 + set Ignore negative prompt during early sampling = 1 to ignore the negative prompt and get a speed boost
- TAESD works // VAE approx does not work (there is no VAE approx for flux)
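For example, with the standard launch scripts (paths are illustrative):

./webui.sh --medvram

or, on Windows, set COMMANDLINE_ARGS=--medvram in webui-user.bat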
Troubleshooting
- if you get grayed-out image results, download the fp32 VAE from https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/ae.safetensors
ChangeLog
- [x] LoRA support added. (~~09/09~~ 9/20 ai-toolkit lora, 09/19 - black forest lab lora)
- [x] baked VAE supported.
- [x] one-time-pass lora support to use LoRA with less memory. (Option -> Optimization -> use LoRA without backup weight)
- [x] use empty_likes() in sd_disable_initialize.py to speed up model loading
- [x] load_state_dict() with the assign=True option to reduce RAM usage and first startup time. (see also https://pytorch.org/tutorials/recipes/recipes/module_load_state_dict_tips.html#using-load-state-dict-assign-true ) (9/16) -> partially reverted; apply assign=True only for some nn layers (9/18)
- [x] unet-only checkpoint supported. in this case, the default clip_l and t5xxl encoders will be loaded (the vae also needs to be loaded manually)
- [x] to use lora without memory issues, use the lora_without_backup_weights option found in the Optimization settings.
- [x] task manager added to make gc.collect() work as expected, based on webui-forge's work and simplified https://github.com/lllyasviel/stable-diffusion-webui-forge/commit/f06ba8e60b7a6c0a18979b2b7f2bc4aed68715d0 (09/29)
- [x] fix speed issue with LoRA. (10/08)
Checklist:
- [x] I have read contributing wiki page
- [x] I have performed a self-review of my own code
- [x] My code follows the style guidelines
- [x] My code passes tests
Screenshots
SDXL loras don't work with this patch:
File "/home/rkfg/stable-diffusion-webui/extensions-builtin/Lora/networks.py", line 671, in network_MultiheadAttention_forward
network_apply_weights(self)
File "/home/rkfg/stable-diffusion-webui/extensions-builtin/Lora/networks.py", line 459, in network_apply_weights
network_restore_weights_from_backup(self)
File "/home/rkfg/stable-diffusion-webui/extensions-builtin/Lora/networks.py", line 400, in network_restore_weights_from_backup
if weights_backup is True or weights_backup == (True, True): # fake backup
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
The commit above fixes the error, thanks!
Lora support seems incomplete, I don't know what key format is the current consensus (AFAIK there was at least one from Xlabs and one that ComfyUI understands) but this lora doesn't load: https://civitai.com/models/652699?modelVersionId=791430
File "/home/rkfg/stable-diffusion-webui/extensions-builtin/Lora/networks.py", line 321, in load_networks
net = load_network(name, network_on_disk)
File "/home/rkfg/stable-diffusion-webui/extensions-builtin/Lora/networks.py", line 186, in load_network
key_network_without_network_parts, network_name, network_weight = key_network.rsplit(".", 2)
ValueError: not enough values to unpack (expected 3, got 2)
I tried another lora and it worked: https://civitai.com/models/639937?modelVersionId=810340
Sad that we don't have an "official" lora format but I'd take a community consensus over a centralized decision any day
Also, while we're at it, I can't even load the model with Flux T5 enabled without --medvram because apparently they're loaded together and 24 GB isn't enough for that. So I made this little patch specifically for SD3/Flux models (I don't need it for SDXL); I noticed they're the only ones with more than 4 VAE channels, so I used that as an indicator. Since 24 GB is the current consumer-grade limit, maybe it makes sense to make this behavior the default and allow disabling it explicitly?
commit c24c53097d4f85f565cd409b162f5596d516d69e
Author: rkfg <[email protected]>
Date: Mon Sep 2 18:45:53 2024 +0300
Add --medvram-mdit
diff --git a/modules/cmd_args.py b/modules/cmd_args.py
index 38e8b5ba..8d555f42 100644
--- a/modules/cmd_args.py
+++ b/modules/cmd_args.py
@@ -38,6 +38,7 @@ parser.add_argument("--localizations-dir", type=normalized_filepath, default=os.
parser.add_argument("--allow-code", action='store_true', help="allow custom script execution from webui")
parser.add_argument("--medvram", action='store_true', help="enable stable diffusion model optimizations for sacrificing a little speed for low VRM usage")
parser.add_argument("--medvram-sdxl", action='store_true', help="enable --medvram optimization just for SDXL models")
+parser.add_argument("--medvram-mdit", action='store_true', help="enable --medvram optimization just for MDiT-based models (SD3/Flux)")
parser.add_argument("--lowvram", action='store_true', help="enable stable diffusion model optimizations for sacrificing a lot of speed for very low VRM usage")
parser.add_argument("--lowram", action='store_true', help="load stable diffusion checkpoint weights to VRAM instead of RAM")
parser.add_argument("--always-batch-cond-uncond", action='store_true', help="does not do anything")
diff --git a/modules/lowvram.py b/modules/lowvram.py
index 6728c337..0530c1af 100644
--- a/modules/lowvram.py
+++ b/modules/lowvram.py
@@ -18,7 +18,7 @@ def send_everything_to_cpu():
def is_needed(sd_model):
- return shared.cmd_opts.lowvram or shared.cmd_opts.medvram or shared.cmd_opts.medvram_sdxl and hasattr(sd_model, 'conditioner')
+ return shared.cmd_opts.lowvram or shared.cmd_opts.medvram or shared.cmd_opts.medvram_sdxl and hasattr(sd_model, 'conditioner') or shared.cmd_opts.medvram_mdit and hasattr(sd_model, 'latent_channels') and sd_model.latent_channels > 4
def apply(sd_model):
there are two known LoRA formats for FLUX currently: ai-toolkit, and the other one from black forest labs. currently only ai-toolkit's lora format has been added. (https://github.com/kohya-ss/sd-scripts/blob/a61cf73a5cb5209c3f4d1a3688dd276a4dfd1ecb/networks/convert_flux_lora.py)
Some SDXL models are broken with this PR applied. For example, https://civitai.com/models/194768?modelVersionId=839642 results in this:
File "/home/rkfg/stable-diffusion-webui/modules/processing.py", line 1002, in process_images_inner
x_samples_ddim = decode_latent_batch(p.sd_model, samples_ddim, target_device=devices.cpu, check_for_nans=True)
File "/home/rkfg/stable-diffusion-webui/modules/processing.py", line 653, in decode_latent_batch
raise e
File "/home/rkfg/stable-diffusion-webui/modules/processing.py", line 637, in decode_latent_batch
devices.test_for_nans(sample, "vae")
File "/home/rkfg/stable-diffusion-webui/modules/devices.py", line 321, in test_for_nans
raise NansException(message)
modules.devices.NansException: A tensor with NaNs was produced in VAE. Use --disable-nan-check commandline argument to disable this check.
Either there's an exception or a message like this:
20/20 [00:03<00:00, 5.85it/s]
======================================================================================
A tensor with all NaNs was produced in VAE.
Web UI will now convert VAE into bfloat16 and retry.
To disable this behavior, disable the 'Automatically convert VAE to bfloat16' setting.
======================================================================================
Either way the output is black. AutismMix and other anime models seem to work though. The models that work print Use VAE dtype torch.bfloat16 on load, those that are broken show torch.float32.
VAE now works fine, thank you! I noticed that T5 isn't actually used for some reason even when it's enabled. I run the UI with --medvram, without it I get OOM. The most obvious indicator is that even short text doesn't come out right. Either it's missing completely or at least one word is wrong. So to debug I added a print to https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/82a973c04367123ae98bd9abdf80d9eda9b910e2/modules/models/sd3/other_impls.py#L509
and it's indeed triggered if T5 is enabled (and not triggered if it's not) but the resulting image is exactly the same. So something must be wrong somewhere else.
Added debug output to FluxCond.forward and it looks like T5 returns just zeroes?..
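For reference, the debug output was just something along these lines inside FluxCond.forward (illustrative only, not the upstream code; t5_out and t5_tokens are placeholder names):

t5_out = self.t5xxl(t5_tokens)
print("t5xxl output abs mean:", t5_out.abs().mean().item())  # 0.0 here means T5 really returned all zeroes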
Okay, a quick&dirty fix would be like this:
diff --git a/modules/models/flux/flux.py b/modules/models/flux/flux.py
index 46fd568a..f1f1cc72 100644
--- a/modules/models/flux/flux.py
+++ b/modules/models/flux/flux.py
@@ -107,7 +107,7 @@ class FluxCond(torch.nn.Module):
with safetensors.safe_open(clip_l_file, framework="pt") as file:
self.clip_l.transformer.load_state_dict(SafetensorsMapping(file), strict=False)
- if self.t5xxl and 'text_encoders.t5xxl.transformer.encoder.block.0.layer.0.SelfAttention.k.weight' not in state_dict:
+ if self.t5xxl:
t5_file = modelloader.load_file_from_url(T5_URL, model_dir=clip_path, file_name="t5xxl_fp16.safetensors")
with safetensors.safe_open(t5_file, framework="pt") as file:
self.t5xxl.transformer.load_state_dict(SafetensorsMapping(file), strict=False)
The dev model I use includes T5 but I think it's not loaded properly from the model itself. Loading it explicitly from a separate file works well and the results are as expected.
previously, A1111 tested for text_encoders.t5xxl.transformer.encoder.embed_tokens.weight
but in my case, the flux safetensors does not have text_encoders.t5xxl.transformer.encoder.embed_tokens.weight
so the following line has been changed to work with my t5xxl encoder.
@@ -107,7 +107,7 @@ class FluxCond(torch.nn.Module):
with safetensors.safe_open(clip_l_file, framework="pt") as file:
self.clip_l.transformer.load_state_dict(SafetensorsMapping(file), strict=False)
- if self.t5xxl and 'text_encoders.t5xxl.transformer.encoder.embed_tokens.weight' not in state_dict:
+ if self.t5xxl and 'text_encoders.t5xxl.transformer.encoder.block.0.layer.0.SelfAttention.k.weight' not in state_dict:
t5_file = modelloader.load_file_from_url(T5_URL, model_dir=clip_path, file_name="t5xxl_fp16.safetensors")
with safetensors.safe_open(t5_file, framework="pt") as file:
self.t5xxl.transformer.load_state_dict(SafetensorsMapping(file), strict=False)
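A more tolerant check is also possible; this is only a sketch (has_t5_weights is a made-up name, not what was committed): treat the embedded T5 as present if any t5xxl transformer key exists in the state dict, instead of testing one specific key.

has_t5_weights = any(k.startswith('text_encoders.t5xxl.transformer.') for k in state_dict)
if self.t5xxl and not has_t5_weights:
    # only then fall back to downloading the standalone t5xxl_fp16.safetensors
    t5_file = modelloader.load_file_from_url(T5_URL, model_dir=clip_path, file_name="t5xxl_fp16.safetensors")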
The latest commit breaks loading some SDXL models such as autismmix:
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.HalfTensor instead (while checking arguments for embedding)
thanks for your report.
and this is a note about why we need to use fix_positions() when using load_state_dict(..., assign=True):
i_pos_ids = torch.tensor([list(range(77))], dtype=torch.int64) # correct position_ids
f_pos_ids = torch.tensor([list(range(77))], dtype=torch.float16) # some checkpoints with bad position_ids
i_pos_ids.copy_(f_pos_ids) # dtype stays int64 - dtype preserved. (old method; copy_() will increase ram usage)
load_state_dict(...) with assign=False (the default, old method) does model.property.weight.copy_(state_dict[...]) with the state_dict value; in this case the original dtype is preserved.
load_state_dict(..., assign=True) will "assign" model.property.weight = nn.Parameter(state_dict[...]), and in this case the dtype can be changed by a bad checkpoint dtype.
see also https://github.com/pytorch/pytorch/blob/main/torch/nn/modules/module.py#L2430-L2441
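Here is a small self-contained repro of that difference (a sketch only; Emb and bad_sd are made-up names):

import torch

class Emb(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # correct position_ids buffer: int64
        self.register_buffer("position_ids", torch.arange(77).unsqueeze(0))

m = Emb()
# a "bad" checkpoint that stored position_ids as float16
bad_sd = {"position_ids": torch.arange(77, dtype=torch.float16).unsqueeze(0)}

m.load_state_dict(bad_sd)                # default: copy_() keeps the module's dtype
print(m.position_ids.dtype)              # torch.int64

m.load_state_dict(bad_sd, assign=True)   # assign: takes the checkpoint tensor as-is
print(m.position_ids.dtype)              # torch.float16 -> this is what fix_positions() repairs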
Getting RuntimeError: "index_select_cuda" not implemented for 'Float8_e4m3fn' when loading Flux-dev now. Torch version is 2.4.0. It loads fine with edf4b9e9.
something strange. if you don't mind, let me know which checkpoint you used.
the vae must be one of BF16, F16 or F32.
a full log including the initial log lines would also be helpful, for example:
Loading weights [9baef12772] from F:\webui\webui\stable-diffusion-webui\models\Stable-diffusion\xxxxxxxx.safetensors
Creating model from config: F:\webui\webui\stable-diffusion-webui\configs\flux1-inference.yaml
Detected dtypes: {'model.diffusion_model.': {torch.float8_e4m3fn: 780}, 'text_encoders.': {torch.float16: 198, torch.float8_e4m3fn: 168, torch.float32: 52}, 'vae.': {torch.bfloat16: 244}}
VAE dtype torch.bfloat16 detected.
Detected dtypes: {'model.diffusion_model.': {torch.float8_e4m3fn: 780}, 'text_encoders.': {torch.float16: 198, torch.float8_e4m3fn: 168, torch.float32: 52}, 'vae.': {torch.bfloat16: 244}}
load Unet torch.float8_e4m3fn as is ...
VAE dtype torch.bfloat16 detected. load as is.
or you can dump all keys of the checkpoint with the following script:
#!/usr/bin/python3
import os
import sys
import json

def get_safetensors_header(filename):
    # read the 8-byte little-endian header length, then the JSON header itself
    with open(filename, mode="rb") as file:
        metadata_len = file.read(8)
        metadata_len = int.from_bytes(metadata_len, "little")
        json_start = file.read(2)
        if metadata_len > 2 and json_start in (b'{"', b"{'"):
            json_data = json_start + file.read(metadata_len - 2)
            return json.loads(json_data)
    # invalid safetensors
    return {}

args = sys.argv[1:]
if len(args) >= 1 and os.path.isfile(args[0]):
    file = args[0]
    res = get_safetensors_header(file)
    res.pop("__metadata__", None)  # drop the non-tensor metadata entry
    for k in res.keys():
        print(k, res[k]['dtype'], res[k]['shape'])
    exit(0)
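For example, save it as dump_keys.py (any name works) and run it against the checkpoint (the path here is just illustrative):

python3 dump_keys.py models/Stable-diffusion/flux1-dev-fp8.safetensors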
I don't remember where exactly I got this model but it's dated Aug 31, probably one of the first 8 bit quants. Here are the keys:
text_encoders.clip_l.logit_scale F32 []
text_encoders.t5xxl.logit_scale F32 []
vae.decoder.conv_in.bias F32 [512]
vae.decoder.conv_in.weight F32 [512, 16, 3, 3]
vae.decoder.conv_out.bias F32 [3]
vae.decoder.conv_out.weight F32 [3, 128, 3, 3]
vae.decoder.mid.attn_1.k.bias F32 [512]
vae.decoder.mid.attn_1.k.weight F32 [512, 512, 1, 1]
vae.decoder.mid.attn_1.norm.bias F32 [512]
vae.decoder.mid.attn_1.norm.weight F32 [512]
vae.decoder.mid.attn_1.proj_out.bias F32 [512]
vae.decoder.mid.attn_1.proj_out.weight F32 [512, 512, 1, 1]
vae.decoder.mid.attn_1.q.bias F32 [512]
vae.decoder.mid.attn_1.q.weight F32 [512, 512, 1, 1]
vae.decoder.mid.attn_1.v.bias F32 [512]
vae.decoder.mid.attn_1.v.weight F32 [512, 512, 1, 1]
vae.decoder.mid.block_1.conv1.bias F32 [512]
vae.decoder.mid.block_1.conv1.weight F32 [512, 512, 3, 3]
vae.decoder.mid.block_1.conv2.bias F32 [512]
vae.decoder.mid.block_1.conv2.weight F32 [512, 512, 3, 3]
vae.decoder.mid.block_1.norm1.bias F32 [512]
vae.decoder.mid.block_1.norm1.weight F32 [512]
vae.decoder.mid.block_1.norm2.bias F32 [512]
vae.decoder.mid.block_1.norm2.weight F32 [512]
vae.decoder.mid.block_2.conv1.bias F32 [512]
vae.decoder.mid.block_2.conv1.weight F32 [512, 512, 3, 3]
vae.decoder.mid.block_2.conv2.bias F32 [512]
vae.decoder.mid.block_2.conv2.weight F32 [512, 512, 3, 3]
vae.decoder.mid.block_2.norm1.bias F32 [512]
vae.decoder.mid.block_2.norm1.weight F32 [512]
vae.decoder.mid.block_2.norm2.bias F32 [512]
vae.decoder.mid.block_2.norm2.weight F32 [512]
vae.decoder.norm_out.bias F32 [128]
vae.decoder.norm_out.weight F32 [128]
vae.decoder.up.0.block.0.conv1.bias F32 [128]
vae.decoder.up.0.block.0.conv1.weight F32 [128, 256, 3, 3]
vae.decoder.up.0.block.0.conv2.bias F32 [128]
vae.decoder.up.0.block.0.conv2.weight F32 [128, 128, 3, 3]
vae.decoder.up.0.block.0.nin_shortcut.bias F32 [128]
vae.decoder.up.0.block.0.nin_shortcut.weight F32 [128, 256, 1, 1]
vae.decoder.up.0.block.0.norm1.bias F32 [256]
vae.decoder.up.0.block.0.norm1.weight F32 [256]
vae.decoder.up.0.block.0.norm2.bias F32 [128]
vae.decoder.up.0.block.0.norm2.weight F32 [128]
vae.decoder.up.0.block.1.conv1.bias F32 [128]
vae.decoder.up.0.block.1.conv1.weight F32 [128, 128, 3, 3]
vae.decoder.up.0.block.1.conv2.bias F32 [128]
vae.decoder.up.0.block.1.conv2.weight F32 [128, 128, 3, 3]
vae.decoder.up.0.block.1.norm1.bias F32 [128]
vae.decoder.up.0.block.1.norm1.weight F32 [128]
vae.decoder.up.0.block.1.norm2.bias F32 [128]
vae.decoder.up.0.block.1.norm2.weight F32 [128]
vae.decoder.up.0.block.2.conv1.bias F32 [128]
vae.decoder.up.0.block.2.conv1.weight F32 [128, 128, 3, 3]
vae.decoder.up.0.block.2.conv2.bias F32 [128]
vae.decoder.up.0.block.2.conv2.weight F32 [128, 128, 3, 3]
vae.decoder.up.0.block.2.norm1.bias F32 [128]
vae.decoder.up.0.block.2.norm1.weight F32 [128]
vae.decoder.up.0.block.2.norm2.bias F32 [128]
vae.decoder.up.0.block.2.norm2.weight F32 [128]
vae.decoder.up.1.block.0.conv1.bias F32 [256]
vae.decoder.up.1.block.0.conv1.weight F32 [256, 512, 3, 3]
vae.decoder.up.1.block.0.conv2.bias F32 [256]
vae.decoder.up.1.block.0.conv2.weight F32 [256, 256, 3, 3]
vae.decoder.up.1.block.0.nin_shortcut.bias F32 [256]
vae.decoder.up.1.block.0.nin_shortcut.weight F32 [256, 512, 1, 1]
vae.decoder.up.1.block.0.norm1.bias F32 [512]
vae.decoder.up.1.block.0.norm1.weight F32 [512]
vae.decoder.up.1.block.0.norm2.bias F32 [256]
vae.decoder.up.1.block.0.norm2.weight F32 [256]
vae.decoder.up.1.block.1.conv1.bias F32 [256]
vae.decoder.up.1.block.1.conv1.weight F32 [256, 256, 3, 3]
vae.decoder.up.1.block.1.conv2.bias F32 [256]
vae.decoder.up.1.block.1.conv2.weight F32 [256, 256, 3, 3]
vae.decoder.up.1.block.1.norm1.bias F32 [256]
vae.decoder.up.1.block.1.norm1.weight F32 [256]
vae.decoder.up.1.block.1.norm2.bias F32 [256]
vae.decoder.up.1.block.1.norm2.weight F32 [256]
vae.decoder.up.1.block.2.conv1.bias F32 [256]
vae.decoder.up.1.block.2.conv1.weight F32 [256, 256, 3, 3]
vae.decoder.up.1.block.2.conv2.bias F32 [256]
vae.decoder.up.1.block.2.conv2.weight F32 [256, 256, 3, 3]
vae.decoder.up.1.block.2.norm1.bias F32 [256]
vae.decoder.up.1.block.2.norm1.weight F32 [256]
vae.decoder.up.1.block.2.norm2.bias F32 [256]
vae.decoder.up.1.block.2.norm2.weight F32 [256]
vae.decoder.up.1.upsample.conv.bias F32 [256]
vae.decoder.up.1.upsample.conv.weight F32 [256, 256, 3, 3]
vae.decoder.up.2.block.0.conv1.bias F32 [512]
vae.decoder.up.2.block.0.conv1.weight F32 [512, 512, 3, 3]
vae.decoder.up.2.block.0.conv2.bias F32 [512]
vae.decoder.up.2.block.0.conv2.weight F32 [512, 512, 3, 3]
vae.decoder.up.2.block.0.norm1.bias F32 [512]
vae.decoder.up.2.block.0.norm1.weight F32 [512]
vae.decoder.up.2.block.0.norm2.bias F32 [512]
vae.decoder.up.2.block.0.norm2.weight F32 [512]
vae.decoder.up.2.block.1.conv1.bias F32 [512]
vae.decoder.up.2.block.1.conv1.weight F32 [512, 512, 3, 3]
vae.decoder.up.2.block.1.conv2.bias F32 [512]
vae.decoder.up.2.block.1.conv2.weight F32 [512, 512, 3, 3]
vae.decoder.up.2.block.1.norm1.bias F32 [512]
vae.decoder.up.2.block.1.norm1.weight F32 [512]
vae.decoder.up.2.block.1.norm2.bias F32 [512]
vae.decoder.up.2.block.1.norm2.weight F32 [512]
vae.decoder.up.2.block.2.conv1.bias F32 [512]
vae.decoder.up.2.block.2.conv1.weight F32 [512, 512, 3, 3]
vae.decoder.up.2.block.2.conv2.bias F32 [512]
vae.decoder.up.2.block.2.conv2.weight F32 [512, 512, 3, 3]
vae.decoder.up.2.block.2.norm1.bias F32 [512]
vae.decoder.up.2.block.2.norm1.weight F32 [512]
vae.decoder.up.2.block.2.norm2.bias F32 [512]
vae.decoder.up.2.block.2.norm2.weight F32 [512]
vae.decoder.up.2.upsample.conv.bias F32 [512]
vae.decoder.up.2.upsample.conv.weight F32 [512, 512, 3, 3]
vae.decoder.up.3.block.0.conv1.bias F32 [512]
vae.decoder.up.3.block.0.conv1.weight F32 [512, 512, 3, 3]
vae.decoder.up.3.block.0.conv2.bias F32 [512]
vae.decoder.up.3.block.0.conv2.weight F32 [512, 512, 3, 3]
vae.decoder.up.3.block.0.norm1.bias F32 [512]
vae.decoder.up.3.block.0.norm1.weight F32 [512]
vae.decoder.up.3.block.0.norm2.bias F32 [512]
vae.decoder.up.3.block.0.norm2.weight F32 [512]
vae.decoder.up.3.block.1.conv1.bias F32 [512]
vae.decoder.up.3.block.1.conv1.weight F32 [512, 512, 3, 3]
vae.decoder.up.3.block.1.conv2.bias F32 [512]
vae.decoder.up.3.block.1.conv2.weight F32 [512, 512, 3, 3]
vae.decoder.up.3.block.1.norm1.bias F32 [512]
vae.decoder.up.3.block.1.norm1.weight F32 [512]
vae.decoder.up.3.block.1.norm2.bias F32 [512]
vae.decoder.up.3.block.1.norm2.weight F32 [512]
vae.decoder.up.3.block.2.conv1.bias F32 [512]
vae.decoder.up.3.block.2.conv1.weight F32 [512, 512, 3, 3]
vae.decoder.up.3.block.2.conv2.bias F32 [512]
vae.decoder.up.3.block.2.conv2.weight F32 [512, 512, 3, 3]
vae.decoder.up.3.block.2.norm1.bias F32 [512]
vae.decoder.up.3.block.2.norm1.weight F32 [512]
vae.decoder.up.3.block.2.norm2.bias F32 [512]
vae.decoder.up.3.block.2.norm2.weight F32 [512]
vae.decoder.up.3.upsample.conv.bias F32 [512]
vae.decoder.up.3.upsample.conv.weight F32 [512, 512, 3, 3]
vae.encoder.conv_in.bias F32 [128]
vae.encoder.conv_in.weight F32 [128, 3, 3, 3]
vae.encoder.conv_out.bias F32 [32]
vae.encoder.conv_out.weight F32 [32, 512, 3, 3]
vae.encoder.down.0.block.0.conv1.bias F32 [128]
vae.encoder.down.0.block.0.conv1.weight F32 [128, 128, 3, 3]
vae.encoder.down.0.block.0.conv2.bias F32 [128]
vae.encoder.down.0.block.0.conv2.weight F32 [128, 128, 3, 3]
vae.encoder.down.0.block.0.norm1.bias F32 [128]
vae.encoder.down.0.block.0.norm1.weight F32 [128]
vae.encoder.down.0.block.0.norm2.bias F32 [128]
vae.encoder.down.0.block.0.norm2.weight F32 [128]
vae.encoder.down.0.block.1.conv1.bias F32 [128]
vae.encoder.down.0.block.1.conv1.weight F32 [128, 128, 3, 3]
vae.encoder.down.0.block.1.conv2.bias F32 [128]
vae.encoder.down.0.block.1.conv2.weight F32 [128, 128, 3, 3]
vae.encoder.down.0.block.1.norm1.bias F32 [128]
vae.encoder.down.0.block.1.norm1.weight F32 [128]
vae.encoder.down.0.block.1.norm2.bias F32 [128]
vae.encoder.down.0.block.1.norm2.weight F32 [128]
vae.encoder.down.0.downsample.conv.bias F32 [128]
vae.encoder.down.0.downsample.conv.weight F32 [128, 128, 3, 3]
vae.encoder.down.1.block.0.conv1.bias F32 [256]
vae.encoder.down.1.block.0.conv1.weight F32 [256, 128, 3, 3]
vae.encoder.down.1.block.0.conv2.bias F32 [256]
vae.encoder.down.1.block.0.conv2.weight F32 [256, 256, 3, 3]
vae.encoder.down.1.block.0.nin_shortcut.bias F32 [256]
vae.encoder.down.1.block.0.nin_shortcut.weight F32 [256, 128, 1, 1]
vae.encoder.down.1.block.0.norm1.bias F32 [128]
vae.encoder.down.1.block.0.norm1.weight F32 [128]
vae.encoder.down.1.block.0.norm2.bias F32 [256]
vae.encoder.down.1.block.0.norm2.weight F32 [256]
vae.encoder.down.1.block.1.conv1.bias F32 [256]
vae.encoder.down.1.block.1.conv1.weight F32 [256, 256, 3, 3]
vae.encoder.down.1.block.1.conv2.bias F32 [256]
vae.encoder.down.1.block.1.conv2.weight F32 [256, 256, 3, 3]
vae.encoder.down.1.block.1.norm1.bias F32 [256]
vae.encoder.down.1.block.1.norm1.weight F32 [256]
vae.encoder.down.1.block.1.norm2.bias F32 [256]
vae.encoder.down.1.block.1.norm2.weight F32 [256]
vae.encoder.down.1.downsample.conv.bias F32 [256]
vae.encoder.down.1.downsample.conv.weight F32 [256, 256, 3, 3]
vae.encoder.down.2.block.0.conv1.bias F32 [512]
vae.encoder.down.2.block.0.conv1.weight F32 [512, 256, 3, 3]
vae.encoder.down.2.block.0.conv2.bias F32 [512]
vae.encoder.down.2.block.0.conv2.weight F32 [512, 512, 3, 3]
vae.encoder.down.2.block.0.nin_shortcut.bias F32 [512]
vae.encoder.down.2.block.0.nin_shortcut.weight F32 [512, 256, 1, 1]
vae.encoder.down.2.block.0.norm1.bias F32 [256]
vae.encoder.down.2.block.0.norm1.weight F32 [256]
vae.encoder.down.2.block.0.norm2.bias F32 [512]
vae.encoder.down.2.block.0.norm2.weight F32 [512]
vae.encoder.down.2.block.1.conv1.bias F32 [512]
vae.encoder.down.2.block.1.conv1.weight F32 [512, 512, 3, 3]
vae.encoder.down.2.block.1.conv2.bias F32 [512]
vae.encoder.down.2.block.1.conv2.weight F32 [512, 512, 3, 3]
vae.encoder.down.2.block.1.norm1.bias F32 [512]
vae.encoder.down.2.block.1.norm1.weight F32 [512]
vae.encoder.down.2.block.1.norm2.bias F32 [512]
vae.encoder.down.2.block.1.norm2.weight F32 [512]
vae.encoder.down.2.downsample.conv.bias F32 [512]
vae.encoder.down.2.downsample.conv.weight F32 [512, 512, 3, 3]
vae.encoder.down.3.block.0.conv1.bias F32 [512]
vae.encoder.down.3.block.0.conv1.weight F32 [512, 512, 3, 3]
vae.encoder.down.3.block.0.conv2.bias F32 [512]
vae.encoder.down.3.block.0.conv2.weight F32 [512, 512, 3, 3]
vae.encoder.down.3.block.0.norm1.bias F32 [512]
vae.encoder.down.3.block.0.norm1.weight F32 [512]
vae.encoder.down.3.block.0.norm2.bias F32 [512]
vae.encoder.down.3.block.0.norm2.weight F32 [512]
vae.encoder.down.3.block.1.conv1.bias F32 [512]
vae.encoder.down.3.block.1.conv1.weight F32 [512, 512, 3, 3]
vae.encoder.down.3.block.1.conv2.bias F32 [512]
vae.encoder.down.3.block.1.conv2.weight F32 [512, 512, 3, 3]
vae.encoder.down.3.block.1.norm1.bias F32 [512]
vae.encoder.down.3.block.1.norm1.weight F32 [512]
vae.encoder.down.3.block.1.norm2.bias F32 [512]
vae.encoder.down.3.block.1.norm2.weight F32 [512]
vae.encoder.mid.attn_1.k.bias F32 [512]
vae.encoder.mid.attn_1.k.weight F32 [512, 512, 1, 1]
vae.encoder.mid.attn_1.norm.bias F32 [512]
vae.encoder.mid.attn_1.norm.weight F32 [512]
vae.encoder.mid.attn_1.proj_out.bias F32 [512]
vae.encoder.mid.attn_1.proj_out.weight F32 [512, 512, 1, 1]
vae.encoder.mid.attn_1.q.bias F32 [512]
vae.encoder.mid.attn_1.q.weight F32 [512, 512, 1, 1]
vae.encoder.mid.attn_1.v.bias F32 [512]
vae.encoder.mid.attn_1.v.weight F32 [512, 512, 1, 1]
vae.encoder.mid.block_1.conv1.bias F32 [512]
vae.encoder.mid.block_1.conv1.weight F32 [512, 512, 3, 3]
vae.encoder.mid.block_1.conv2.bias F32 [512]
vae.encoder.mid.block_1.conv2.weight F32 [512, 512, 3, 3]
vae.encoder.mid.block_1.norm1.bias F32 [512]
vae.encoder.mid.block_1.norm1.weight F32 [512]
vae.encoder.mid.block_1.norm2.bias F32 [512]
vae.encoder.mid.block_1.norm2.weight F32 [512]
vae.encoder.mid.block_2.conv1.bias F32 [512]
vae.encoder.mid.block_2.conv1.weight F32 [512, 512, 3, 3]
vae.encoder.mid.block_2.conv2.bias F32 [512]
vae.encoder.mid.block_2.conv2.weight F32 [512, 512, 3, 3]
vae.encoder.mid.block_2.norm1.bias F32 [512]
vae.encoder.mid.block_2.norm1.weight F32 [512]
vae.encoder.mid.block_2.norm2.bias F32 [512]
vae.encoder.mid.block_2.norm2.weight F32 [512]
vae.encoder.norm_out.bias F32 [512]
vae.encoder.norm_out.weight F32 [512]
text_encoders.clip_l.transformer.text_model.embeddings.position_embedding.weight F16 [77, 768]
text_encoders.clip_l.transformer.text_model.embeddings.token_embedding.weight F16 [49408, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.0.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.1.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.10.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.11.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.2.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.3.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.4.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.5.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.6.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.7.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.8.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.layer_norm1.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.layer_norm1.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.layer_norm2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.layer_norm2.weight F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.mlp.fc1.bias F16 [3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.mlp.fc1.weight F16 [3072, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.mlp.fc2.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.mlp.fc2.weight F16 [768, 3072]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.k_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.k_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.out_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.out_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.q_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.q_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.v_proj.bias F16 [768]
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.v_proj.weight F16 [768, 768]
text_encoders.clip_l.transformer.text_model.final_layer_norm.bias F16 [768]
text_encoders.clip_l.transformer.text_model.final_layer_norm.weight F16 [768]
text_encoders.clip_l.transformer.text_projection.weight F16 [768, 768]
model.diffusion_model.double_blocks.0.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.0.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.0.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.0.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.0.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.0.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.0.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.0.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.0.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.0.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.0.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.0.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.0.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.0.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.0.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.0.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.0.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.0.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.0.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.0.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.0.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.0.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.0.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.0.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.1.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.1.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.1.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.1.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.1.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.1.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.1.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.1.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.1.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.1.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.1.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.1.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.1.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.1.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.1.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.1.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.1.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.1.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.1.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.1.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.1.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.1.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.1.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.1.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.10.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.10.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.10.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.10.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.10.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.10.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.10.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.10.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.10.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.10.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.10.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.10.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.10.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.10.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.10.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.10.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.10.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.10.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.10.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.10.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.10.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.10.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.10.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.10.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.11.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.11.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.11.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.11.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.11.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.11.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.11.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.11.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.11.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.11.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.11.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.11.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.11.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.11.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.11.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.11.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.11.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.11.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.11.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.11.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.11.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.11.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.11.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.11.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.12.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.12.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.12.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.12.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.12.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.12.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.12.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.12.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.12.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.12.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.12.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.12.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.12.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.12.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.12.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.12.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.12.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.12.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.12.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.12.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.12.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.12.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.12.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.12.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.13.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.13.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.13.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.13.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.13.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.13.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.13.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.13.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.13.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.13.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.13.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.13.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.13.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.13.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.13.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.13.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.13.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.13.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.13.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.13.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.13.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.13.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.13.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.13.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.14.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.14.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.14.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.14.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.14.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.14.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.14.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.14.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.14.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.14.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.14.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.14.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.14.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.14.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.14.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.14.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.14.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.14.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.14.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.14.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.14.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.14.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.14.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.14.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.15.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.15.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.15.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.15.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.15.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.15.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.15.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.15.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.15.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.15.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.15.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.15.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.15.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.15.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.15.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.15.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.15.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.15.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.15.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.15.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.15.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.15.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.15.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.15.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.16.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.16.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.16.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.16.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.16.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.16.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.16.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.16.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.16.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.16.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.16.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.16.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.16.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.16.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.16.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.16.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.16.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.16.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.16.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.16.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.16.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.16.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.16.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.16.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.17.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.17.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.17.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.17.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.17.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.17.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.17.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.17.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.17.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.17.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.17.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.17.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.17.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.17.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.17.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.17.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.17.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.17.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.17.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.17.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.17.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.17.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.17.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.17.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.18.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.18.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.18.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.18.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.18.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.18.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.18.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.18.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.18.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.18.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.18.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.18.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.18.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.18.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.18.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.18.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.18.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.18.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.18.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.18.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.18.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.18.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.18.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.18.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.2.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.2.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.2.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.2.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.2.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.2.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.2.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.2.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.2.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.2.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.2.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.2.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.2.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.2.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.2.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.2.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.2.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.2.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.2.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.2.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.2.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.2.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.2.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.2.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.3.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.3.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.3.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.3.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.3.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.3.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.3.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.3.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.3.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.3.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.3.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.3.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.3.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.3.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.3.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.3.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.3.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.3.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.3.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.3.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.3.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.3.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.3.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.3.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.4.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.4.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.4.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.4.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.4.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.4.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.4.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.4.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.4.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.4.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.4.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.4.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.4.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.4.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.4.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.4.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.4.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.4.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.4.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.4.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.4.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.4.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.4.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.4.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.5.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.5.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.5.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.5.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.5.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.5.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.5.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.5.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.5.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.5.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.5.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.5.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.5.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.5.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.5.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.5.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.5.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.5.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.5.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.5.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.5.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.5.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.5.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.5.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.6.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.6.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.6.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.6.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.6.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.6.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.6.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.6.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.6.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.6.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.6.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.6.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.6.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.6.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.6.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.6.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.6.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.6.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.6.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.6.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.6.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.6.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.6.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.6.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.7.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.7.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.7.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.7.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.7.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.7.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.7.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.7.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.7.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.7.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.7.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.7.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.7.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.7.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.7.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.7.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.7.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.7.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.7.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.7.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.7.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.7.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.7.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.7.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.8.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.8.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.8.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.8.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.8.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.8.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.8.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.8.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.8.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.8.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.8.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.8.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.8.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.8.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.8.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.8.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.8.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.8.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.8.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.8.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.8.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.8.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.8.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.8.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.9.img_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.9.img_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.9.img_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.9.img_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.9.img_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.9.img_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.9.img_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.9.img_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.9.img_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.9.img_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.9.img_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.9.img_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.double_blocks.9.txt_attn.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.9.txt_attn.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.double_blocks.9.txt_attn.proj.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.9.txt_attn.proj.weight F8_E4M3 [3072, 3072]
model.diffusion_model.double_blocks.9.txt_attn.qkv.bias F8_E4M3 [9216]
model.diffusion_model.double_blocks.9.txt_attn.qkv.weight F8_E4M3 [9216, 3072]
model.diffusion_model.double_blocks.9.txt_mlp.0.bias F8_E4M3 [12288]
model.diffusion_model.double_blocks.9.txt_mlp.0.weight F8_E4M3 [12288, 3072]
model.diffusion_model.double_blocks.9.txt_mlp.2.bias F8_E4M3 [3072]
model.diffusion_model.double_blocks.9.txt_mlp.2.weight F8_E4M3 [3072, 12288]
model.diffusion_model.double_blocks.9.txt_mod.lin.bias F8_E4M3 [18432]
model.diffusion_model.double_blocks.9.txt_mod.lin.weight F8_E4M3 [18432, 3072]
model.diffusion_model.final_layer.adaLN_modulation.1.bias F8_E4M3 [6144]
model.diffusion_model.final_layer.adaLN_modulation.1.weight F8_E4M3 [6144, 3072]
model.diffusion_model.final_layer.linear.bias F8_E4M3 [64]
model.diffusion_model.final_layer.linear.weight F8_E4M3 [64, 3072]
model.diffusion_model.guidance_in.in_layer.bias F8_E4M3 [3072]
model.diffusion_model.guidance_in.in_layer.weight F8_E4M3 [3072, 256]
model.diffusion_model.guidance_in.out_layer.bias F8_E4M3 [3072]
model.diffusion_model.guidance_in.out_layer.weight F8_E4M3 [3072, 3072]
model.diffusion_model.img_in.bias F8_E4M3 [3072]
model.diffusion_model.img_in.weight F8_E4M3 [3072, 64]
model.diffusion_model.single_blocks.0.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.0.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.0.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.0.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.0.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.0.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.0.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.0.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.1.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.1.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.1.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.1.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.1.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.1.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.1.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.1.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.10.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.10.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.10.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.10.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.10.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.10.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.10.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.10.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.11.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.11.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.11.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.11.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.11.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.11.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.11.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.11.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.12.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.12.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.12.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.12.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.12.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.12.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.12.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.12.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.13.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.13.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.13.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.13.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.13.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.13.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.13.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.13.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.14.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.14.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.14.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.14.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.14.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.14.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.14.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.14.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.15.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.15.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.15.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.15.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.15.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.15.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.15.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.15.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.16.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.16.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.16.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.16.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.16.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.16.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.16.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.16.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.17.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.17.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.17.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.17.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.17.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.17.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.17.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.17.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.18.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.18.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.18.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.18.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.18.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.18.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.18.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.18.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.19.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.19.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.19.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.19.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.19.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.19.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.19.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.19.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.2.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.2.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.2.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.2.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.2.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.2.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.2.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.2.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.20.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.20.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.20.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.20.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.20.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.20.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.20.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.20.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.21.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.21.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.21.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.21.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.21.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.21.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.21.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.21.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.22.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.22.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.22.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.22.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.22.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.22.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.22.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.22.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.23.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.23.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.23.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.23.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.23.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.23.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.23.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.23.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.24.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.24.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.24.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.24.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.24.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.24.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.24.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.24.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.25.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.25.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.25.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.25.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.25.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.25.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.25.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.25.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.26.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.26.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.26.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.26.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.26.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.26.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.26.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.26.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.27.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.27.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.27.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.27.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.27.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.27.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.27.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.27.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.28.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.28.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.28.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.28.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.28.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.28.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.28.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.28.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.29.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.29.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.29.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.29.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.29.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.29.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.29.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.29.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.3.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.3.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.3.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.3.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.3.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.3.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.3.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.3.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.30.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.30.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.30.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.30.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.30.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.30.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.30.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.30.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.31.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.31.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.31.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.31.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.31.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.31.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.31.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.31.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.32.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.32.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.32.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.32.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.32.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.32.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.32.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.32.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.33.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.33.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.33.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.33.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.33.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.33.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.33.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.33.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.34.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.34.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.34.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.34.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.34.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.34.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.34.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.34.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.35.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.35.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.35.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.35.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.35.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.35.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.35.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.35.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.36.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.36.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.36.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.36.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.36.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.36.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.36.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.36.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.37.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.37.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.37.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.37.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.37.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.37.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.37.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.37.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.4.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.4.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.4.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.4.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.4.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.4.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.4.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.4.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.5.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.5.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.5.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.5.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.5.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.5.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.5.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.5.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.6.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.6.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.6.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.6.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.6.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.6.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.6.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.6.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.7.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.7.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.7.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.7.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.7.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.7.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.7.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.7.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.8.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.8.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.8.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.8.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.8.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.8.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.8.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.8.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.9.linear1.bias F8_E4M3 [21504]
model.diffusion_model.single_blocks.9.linear1.weight F8_E4M3 [21504, 3072]
model.diffusion_model.single_blocks.9.linear2.bias F8_E4M3 [3072]
model.diffusion_model.single_blocks.9.linear2.weight F8_E4M3 [3072, 15360]
model.diffusion_model.single_blocks.9.modulation.lin.bias F8_E4M3 [9216]
model.diffusion_model.single_blocks.9.modulation.lin.weight F8_E4M3 [9216, 3072]
model.diffusion_model.single_blocks.9.norm.key_norm.scale F8_E4M3 [128]
model.diffusion_model.single_blocks.9.norm.query_norm.scale F8_E4M3 [128]
model.diffusion_model.time_in.in_layer.bias F8_E4M3 [3072]
model.diffusion_model.time_in.in_layer.weight F8_E4M3 [3072, 256]
model.diffusion_model.time_in.out_layer.bias F8_E4M3 [3072]
model.diffusion_model.time_in.out_layer.weight F8_E4M3 [3072, 3072]
model.diffusion_model.txt_in.bias F8_E4M3 [3072]
model.diffusion_model.txt_in.weight F8_E4M3 [3072, 4096]
model.diffusion_model.vector_in.in_layer.bias F8_E4M3 [3072]
model.diffusion_model.vector_in.in_layer.weight F8_E4M3 [3072, 768]
model.diffusion_model.vector_in.out_layer.bias F8_E4M3 [3072]
model.diffusion_model.vector_in.out_layer.weight F8_E4M3 [3072, 3072]
text_encoders.t5xxl.transformer.encoder.block.0.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.0.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.0.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight F8_E4M3 [32, 64]
text_encoders.t5xxl.transformer.encoder.block.0.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.0.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.0.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.0.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.0.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.0.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.1.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.1.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.1.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.1.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.1.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.1.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.1.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.1.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.1.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.10.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.10.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.10.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.10.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.10.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.10.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.10.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.10.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.10.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.11.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.11.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.11.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.11.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.11.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.11.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.11.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.11.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.11.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.12.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.12.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.12.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.12.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.12.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.12.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.12.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.12.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.12.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.13.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.13.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.13.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.13.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.13.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.13.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.13.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.13.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.13.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.14.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.14.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.14.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.14.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.14.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.14.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.14.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.14.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.14.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.15.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.15.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.15.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.15.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.15.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.15.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.15.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.15.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.15.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.16.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.16.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.16.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.16.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.16.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.16.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.16.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.16.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.16.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.17.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.17.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.17.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.17.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.17.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.17.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.17.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.17.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.17.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.18.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.18.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.18.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.18.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.18.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.18.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.18.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.18.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.18.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.19.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.19.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.19.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.19.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.19.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.19.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.19.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.19.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.19.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.2.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.2.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.2.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.2.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.2.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.2.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.2.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.2.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.2.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.20.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.20.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.20.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.20.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.20.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.20.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.20.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.20.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.20.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.21.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.21.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.21.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.21.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.21.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.21.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.21.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.21.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.21.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.22.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.22.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.22.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.22.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.22.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.22.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.22.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.22.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.22.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.23.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.23.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.23.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.23.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.23.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.23.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.23.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.23.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.23.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.3.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.3.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.3.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.3.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.3.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.3.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.3.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.3.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.3.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.4.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.4.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.4.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.4.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.4.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.4.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.4.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.4.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.4.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.5.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.5.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.5.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.5.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.5.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.5.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.5.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.5.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.5.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.6.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.6.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.6.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.6.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.6.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.6.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.6.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.6.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.6.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.7.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.7.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.7.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.7.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.7.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.7.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.7.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.7.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.7.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.8.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.8.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.8.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.8.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.8.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.8.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.8.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.8.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.8.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.9.layer.0.SelfAttention.k.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.9.layer.0.SelfAttention.o.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.9.layer.0.SelfAttention.q.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.9.layer.0.SelfAttention.v.weight F8_E4M3 [4096, 4096]
text_encoders.t5xxl.transformer.encoder.block.9.layer.0.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.block.9.layer.1.DenseReluDense.wi_0.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.9.layer.1.DenseReluDense.wi_1.weight F8_E4M3 [10240, 4096]
text_encoders.t5xxl.transformer.encoder.block.9.layer.1.DenseReluDense.wo.weight F8_E4M3 [4096, 10240]
text_encoders.t5xxl.transformer.encoder.block.9.layer.1.layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.encoder.final_layer_norm.weight F8_E4M3 [4096]
text_encoders.t5xxl.transformer.shared.weight F8_E4M3 [32128, 4096]
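For reference, a key/dtype/shape listing like the one above can be produced with the safetensors API. This is only a generic sketch (the file name is a placeholder, not the exact script used in this thread):

from safetensors import safe_open
# print key / dtype / shape for every tensor in a safetensors file
with safe_open("flux1-dev-fp8.safetensors", framework="pt", device="cpu") as f:
    for key in sorted(f.keys()):
        t = f.get_tensor(key)  # loads one tensor at a time on CPU
        print(key, t.dtype, list(t.shape))

(The dtypes print as torch names such as torch.float8_e4m3fn rather than the raw F8_E4M3 tags.)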
The log:
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
Version: v1.10.1-32-ga11936e8
Commit hash: a11936e8734aff164f51a45a6f7b418d70be2e55
Installing requirements
Launching Web UI with arguments: --port=7863 --listen --xformers --no-half-vae --enable-insecure-extension-access --api --no-download-sd-model --medvram
2024-09-17 18:05:46.628584: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-09-17 18:05:46.655590: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-09-17 18:05:47.131738: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/mnt/2Tb/sd-local/lib/python3.10/site-packages/fairscale/experimental/nn/offload.py:19: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
return torch.cuda.amp.custom_fwd(orig_func) # type: ignore
/mnt/2Tb/sd-local/lib/python3.10/site-packages/fairscale/experimental/nn/offload.py:30: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
return torch.cuda.amp.custom_bwd(orig_func) # type: ignore
/mnt/2Tb/sd-local/lib/python3.10/site-packages/xformers/ops/fmha/flash.py:211: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch.
@torch.library.impl_abstract("xformers_flash::flash_fwd")
/mnt/2Tb/sd-local/lib/python3.10/site-packages/xformers/ops/fmha/flash.py:344: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch.
@torch.library.impl_abstract("xformers_flash::flash_bwd")
Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu.
[-] ADetailer initialized. version: 24.1.2, num models: 9
ControlNet preprocessor location: /home/rkfg/stable-diffusion-webui/extensions/sd-webui-controlnet/annotator/downloads
2024-09-17 18:05:52,820 - ControlNet - INFO - ControlNet v1.1.448
Loading weights [8e91b68084] from /home/rkfg/stable-diffusion-webui/models/Stable-diffusion/nvme/a1111/flux/flux1-dev-fp8.safetensors
Creating model from config: /home/rkfg/stable-diffusion-webui/configs/flux1-inference.yaml
Detected dtypes: {'model.diffusion_model.': {torch.float8_e4m3fn: 780}, 'text_encoders.': {torch.float8_e4m3fn: 219, torch.float16: 197, torch.float32: 2}, 'vae.': {torch.float32: 244}}
Detected dtypes: {'model.diffusion_model.': {torch.float8_e4m3fn: 780}, 'text_encoders.': {torch.float8_e4m3fn: 219, torch.float16: 197, torch.float32: 2}, 'vae.': {torch.float32: 244}}
load Unet torch.float8_e4m3fn as is ...
Use VAE dtype torch.float32
Loading VAE weights from user metadata: /home/rkfg/stable-diffusion-webui/models/VAE/ae.safetensors
Applying attention optimization: xformers... done.
loading stable diffusion model: RuntimeError
Traceback (most recent call last):
File "/mnt/2Tb/sd-local/lib/python3.10/threading.py", line 973, in _bootstrap
self._bootstrap_inner()
File "/mnt/2Tb/sd-local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/mnt/2Tb/sd-local/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/home/rkfg/stable-diffusion-webui/modules/initialize.py", line 149, in load_model
shared.sd_model # noqa: B018
File "/home/rkfg/stable-diffusion-webui/modules/shared_items.py", line 175, in sd_model
return modules.sd_models.model_data.get_sd_model()
File "/home/rkfg/stable-diffusion-webui/modules/sd_models.py", line 850, in get_sd_model
load_model()
File "/home/rkfg/stable-diffusion-webui/modules/sd_models.py", line 1040, in load_model
sd_model.cond_stage_model_empty_prompt = get_empty_cond(sd_model)
File "/home/rkfg/stable-diffusion-webui/modules/sd_models.py", line 885, in get_empty_cond
d = sd_model.get_learned_conditioning([""])
File "/home/rkfg/stable-diffusion-webui/modules/models/flux/flux.py", line 298, in get_learned_conditioning
return self.cond_stage_model(batch)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/rkfg/stable-diffusion-webui/modules/models/flux/flux.py", line 94, in forward
t5_out = self.model_t5(prompts, token_count=l_out.shape[1])
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/rkfg/stable-diffusion-webui/modules/models/sd3/sd3_cond.py", line 152, in forward
t5_out, t5_pooled = self.t5xxl(tokens_batch)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1603, in _call_impl
result = forward_call(*args, **kwargs)
File "/home/rkfg/stable-diffusion-webui/modules/models/sd3/other_impls.py", line 287, in forward
outputs = self.transformer(tokens, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/rkfg/stable-diffusion-webui/modules/models/sd3/other_impls.py", line 517, in forward
return self.encoder(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/rkfg/stable-diffusion-webui/modules/models/sd3/other_impls.py", line 494, in forward
x, past_bias = layer(x, past_bias)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/rkfg/stable-diffusion-webui/modules/models/sd3/other_impls.py", line 477, in forward
x, past_bias = self.layer[0](x, past_bias)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/rkfg/stable-diffusion-webui/modules/models/sd3/other_impls.py", line 464, in forward
output, past_bias = self.SelfAttention(self.layer_norm(x), past_bias=past_bias)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/rkfg/stable-diffusion-webui/modules/models/sd3/other_impls.py", line 446, in forward
past_bias = self.compute_bias(x.shape[1], x.shape[1], x.device)
File "/home/rkfg/stable-diffusion-webui/modules/models/sd3/other_impls.py", line 436, in compute_bias
values = self.relative_attention_bias(relative_position_bucket) # shape (query_length, key_length, num_heads)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 164, in forward
return F.embedding(
File "/mnt/2Tb/sd-local/lib/python3.10/site-packages/torch/nn/functional.py", line 2267, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: "index_select_cuda" not implemented for 'Float8_e4m3fn'
Stable diffusion model failed to load
Thank you for your report!
I guess the ffd15850 fix causes some trouble in your environment.
You can revert commit ffd15850 with the git revert ffd15850 command and see what you get.
I can reproduce your error with xformers optimization enabled!
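For reference, here is a minimal sketch of what trips in the traceback above (an assumed repro, not the code of any particular commit): F.embedding() ends up calling index_select on the weight tensor, and that kernel does not exist for Float8_e4m3fn on CUDA, so an fp8 relative-attention-bias table has to be upcast before the lookup.

import torch
import torch.nn.functional as F
# assumed repro; needs a CUDA device and PyTorch >= 2.1
weight = torch.randn(32, 64, device="cuda").to(torch.float8_e4m3fn)  # fp8 embedding table
idx = torch.tensor([0, 3, 7], device="cuda")
# F.embedding(idx, weight)  # RuntimeError: "index_select_cuda" not implemented for 'Float8_e4m3fn'
out = F.embedding(idx, weight.to(torch.float16))  # upcasting the table first works
print(out.dtype, out.shape)  # torch.float16 torch.Size([3, 64])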
I made a small change that allows loading the Kohya loras, but I'm not sure it's always correct. The converter is much more convoluted, and I don't know much about the internals that do the actual heavy lifting; I mostly hacked on the UI and other utility parts.
diff --git a/extensions-builtin/Lora/networks.py b/extensions-builtin/Lora/networks.py
index dfdf3c7e..77b7f6c0 100644
--- a/extensions-builtin/Lora/networks.py
+++ b/extensions-builtin/Lora/networks.py
@@ -182,7 +182,11 @@ def load_network(name, network_on_disk):
     for key_network, weight in sd.items():
-        if diffusers_weight_map:
+        if (
+            diffusers_weight_map
+            and not key_network.startswith("lora_unet_double_blocks")
+            and not key_network.startswith("lora_unet_single_blocks")
+        ):
             key_network_without_network_parts, network_name, network_weight = key_network.rsplit(".", 2)
             network_part = network_name + '.' + network_weight
         else:
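To show which branch each key style takes after this change, here is a small sketch that just mirrors the patched condition, assuming diffusers_weight_map is set; the key names are made-up examples, not taken from a specific lora file.

keys = [
    "transformer.single_transformer_blocks.0.attn.to_q.lora_A.weight",  # diffusers-style
    "lora_unet_double_blocks_0_img_attn_qkv.lora_up.weight",            # kohya-style flux
]
for key_network in keys:
    if not key_network.startswith(("lora_unet_double_blocks", "lora_unet_single_blocks")):
        # diffusers-style: split off the trailing "<lora_A|lora_B>.weight" pair, as in the patch
        module, network_name, network_weight = key_network.rsplit(".", 2)
        print("diffusers map branch:", module, "->", network_name + "." + network_weight)
    else:
        # kohya-style flux keys skip the diffusers weight map and take the original code path
        print("original branch:", key_network)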
Interestingly enough, it's now the only lora type that works. The ai-toolkit loras don't apply; I enabled the DEBUG log level and it says Network /stablediff-web/models/Lora/uploads/flux/boreal-flux-dev-lora-v04_1000_steps.safetensors didn't match keys: {'transformer.single_transformer_blocks.0.norm.linear.lora_A.weight': 'transformer.single_transformer_blocks.0.norm.linear', 'transformer.single_transformer_blocks.0.norm.linear.lora_B.weight' etc. After that it also reports (multiple times for different layers)
DEBUG [root] Network boreal-flux-lora-v0.4 layer diffusion_model_single_blocks_0_linear1: The size of tensor a (21504) must match the size of tensor b (12288) at non-singleton dimension 0
Another lora also has a different error/warning:
DEBUG [root] Network boreal-train-v219-small-dataset-simple-caption layer diffusion_model_double_blocks_0_img_attn_qkv: Promotion for Float8 Types is not supported, attempted to promote Float8_e4m3fn and Half
Not sure if it's something wrong with my particular instance or some other change... It happens without the small patch I posted above; I also have other unrelated patches, but they shouldn't really matter.
Are AI Toolkit loras supported at all? There is some effect, as the generated images are different with and without them, but not the expected effect, and there are always errors reported in the generation info. For example, with the middle finger lora (https://civitai.com/models/667769?modelVersionId=772066) I can't get anything close to the example images. Maybe it's the lora's fault, of course; we don't have a baseline to compare against. The kohya/black forest loras seem to work fine.
Thank you for your report! There was a typo in the ai-toolkit lora map (fixed now).
I think there might still be some bugs that prevent Flux LoRAs from working as expected.
Any comments would be helpful for debugging! I appreciate your patience, and thanks again.
First of all, many thanks for developing this; it's absolutely a game changer! I'm more than happy to test, report, and help debug. The first error about not-found keys is gone, but the size errors are still there. Another lora to check: https://civitai.com/models/646288?modelVersionId=723012. It's a bit weird: I dumped the keys using your script, and there are only attention weights in it. The errors are like this:
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_0_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_1_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_2_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_3_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_4_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_5_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_6_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_7_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_8_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_9_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
2024-09-19 23:37:49 DEBUG [root] Network megumin_v2_18-- layer diffusion_model_single_blocks_10_linear1: The size of tensor a (21504) must match the size of tensor b (9216) at non-singleton dimension 0
The corresponding layers in the lora are of shape [16, 3072] and [3072, 16] (A and B); I suppose that's rank 16, and after multiplication the size would be [3072, 3072], so joining Q, K and V together would give [9216, 3072]. But the model's single blocks are [21504, 3072], which is 7 such matrices concatenated, not 3.
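A small sketch of that arithmetic (the split of 21504 into qkv plus mlp rows is how the flux single blocks fuse linear1; the zero-padding at the end is only one conceivable way to make the delta broadcastable, not necessarily what this PR does):

import torch
hidden = 3072
qkv_rows = 3 * hidden               # 9216  = q, k, v stacked
mlp_rows = 4 * hidden               # 12288 = mlp input projection fused into the same layer
linear1_rows = qkv_rows + mlp_rows  # 21504 = 7 * 3072, the shape the model reports
rank = 16
lora_A = torch.randn(rank, hidden)    # [16, 3072]
lora_B = torch.randn(qkv_rows, rank)  # [9216, 16] after stacking the per-q/k/v deltas
delta = lora_B @ lora_A               # [9216, 3072] -> too small for linear1
padded = torch.zeros(linear1_rows, hidden)  # zero-pad the mlp portion
padded[:qkv_rows] = delta
print(padded.shape)  # torch.Size([21504, 3072])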
Thank you, the fix worked and all loras now work fine!
Found a lora that breaks rendering consistently with this patch and works fine without it. https://civitai.com/models/16910/kairunoburogu-style
Generation parameters (it works sometimes with different seeds, sometimes results in NaNs in VAE):
score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up, source_anime BREAK 1girl <lora:kairunoborugu_PDXL:1>
Steps: 30, Sampler: DPM++ 3M SDE, Schedule type: Karras, CFG scale: 7, Seed: 901463836, Size: 832x1280, Model hash: 67ab2fd8ec, Model: ponyDiffusionV6XL_v6StartWithThisOne, RNG: CPU, Lora hashes: "kairunoborugu_PDXL: 39c88d8aa206", Version: v1.10.1-local-1-gc24c5309
With this PR:
Without:
Another reproduction method is to render with this lora, if it works fine, switch to Flux and render something with it, then back to PDXL and render with the lora again. At this point all generations with PDXL break, both with and without the lora.
And this lora seems to always produce NaN's in Unet when the patch is applied but works fine without it.
It works just fine for me, with no issue. ~~what is your base checkpoint?~~
I've tested with raemoraXL_v20.safetensors [0c78c81ffb] https://civitai.com/models/413979?modelVersionId=690258
Found a lora that breaks rendering consistently with this patch and works fine without it. https://civitai.com/models/16910/kairunoburogu-style
Generation parameters (it works sometimes with different seeds, sometimes results in NaNs in VAE):
score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up, source_anime BREAK 1girl <lora:kairunoborugu_PDXL:1> Steps: 30, Sampler: DPM++ 3M SDE, Schedule type: Karras, CFG scale: 7, Seed: 901463836, Size: 832x1280, Model hash: 67ab2fd8ec, Model: ponyDiffusionV6XL_v6StartWithThisOne, RNG: CPU, Lora hashes: "kairunoborugu_PDXL: 39c88d8aa206", Version: v1.10.1-local-1-gc24c5309[...]
Another reproduction method is to render with this lora, if it works fine, switch to Flux and render something with it, then back to PDXL and render with the lora again. At this point all generations with PDXL break, both with and without the lora.
Same here, it works just fine for me.
Interesting. I tried with the base PDXL (ponyDiffusionV6XL_v6StartWithThisOne, 67ab2fd8ec) and AutismMix. Here's my config.json; maybe some of the options there affect it: https://gist.github.com/rkfg/ded6f62527c80ede0f4841c3b1d825f4
The launch arguments: --xformers --no-half-vae --api --freeze-settings --listen --medvram-mdit --hide-ui-dir-config --skip-install
I think it's the new setting LoRA without backup weights that affects it. If it's off I can generate without problems; if it's on, some loras break like that. After enabling it I restart the program, and then the effect is observed.
Right. I can reproduce the same error with ponyDiffusion v6 and the lora without backup weights option.
The last commit has some issue; I will try to fix it soon.
To use lora_without_backup_weights correctly, we can't use torch.autocast() at all,
so devices.autocast() has been changed to replace torch.autocast() in this case.
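A rough sketch of that direction (a plain flag stands in for the actual webui option; this is an illustration, not the code in this PR):

import contextlib
import torch
def autocast(device_type="cuda", dtype=torch.float16, lora_without_backup_weights=False):
    # when LoRA weights are merged in place with no backup copy, skip torch.autocast()
    # entirely and rely on explicit .to(dtype) casts instead of implicit re-casting
    if lora_without_backup_weights:
        return contextlib.nullcontext()
    return torch.autocast(device_type=device_type, dtype=dtype)
with autocast(lora_without_backup_weights=True):
    pass  # run the model with explicit dtype handling here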
Very nice, thank you! Some early tests show that the issues were fixed; we'll keep trying different models and loras, both for Flux and SDXL (and maybe 1.5 too).