sd-scripts
Given groups=1, weight of size [1536, 16, 2, 2], expected input[4, 4, 128, 96] to have 16 channels, but got 4 channels instead
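This error comes from a Conv2d patch-embedding layer whose weight shape [1536, 16, 2, 2] expects 16 input channels (the latent depth of the SD3 VAE), but the cached latents fed to it have only 4 channels (the latent depth of SD1.x/SDXL VAEs). A minimal sketch of that shape check, assuming latents are stored channels-first; the helper name `latent_is_compatible` is illustrative, not part of sd-scripts:

```python
import numpy as np

SD3_LATENT_CHANNELS = 16   # SD3 VAE produces 16-channel latents
SD1X_LATENT_CHANNELS = 4   # SD1.x / SDXL VAEs produce 4-channel latents

def latent_is_compatible(latent: np.ndarray,
                         expected: int = SD3_LATENT_CHANNELS) -> bool:
    """Return True if a channels-first latent matches the model's expected depth."""
    return latent.shape[0] == expected

# A cache entry written by a 4-channel VAE triggers the mismatch above:
stale = np.zeros((SD1X_LATENT_CHANNELS, 128, 96), dtype=np.float32)
fresh = np.zeros((SD3_LATENT_CHANNELS, 128, 96), dtype=np.float32)
```

The spatial sizes in the message (128×96) are consistent with latents for a 1024×768-ish bucket, so the data pipeline is fine; only the channel depth is wrong, which points at latents cached by a different (4-channel) VAE.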
Loading settings from /content/fine_tune/config/config_file.toml...
/content/fine_tune/config/config_file
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Training with captions.
loading existing metadata: /content/fine_tune/meta_lat.json
using bucket info in metadata
[Dataset 0]
batch_size: 4
resolution: (1024, 1024)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: None
max_bucket_reso: None
bucket_reso_steps: None
bucket_no_upscale: None
[Subset 0 of Dataset 0]
  image_dir: "/content/fine_tune/train_data"
  image_count: 30
  num_repeats: 20
  shuffle_caption: False
  keep_tokens: 0
  keep_tokens_separator:
  caption_separator: ,
  secondary_separator: None
  enable_wildcard: False
  caption_dropout_rate: 0.0
  caption_dropout_every_n_epoches: 0
  caption_tag_dropout_rate: 0.0
  caption_prefix: None
  caption_suffix: None
  color_aug: False
  flip_aug: False
  face_crop_aug_range: None
  random_crop: False
  token_warmup_min: 1
  token_warmup_step: 0
  alpha_mask: False
  metadata_file: /content/fine_tune/meta_lat.json
[Dataset 0]
loading image sizes.
100% 30/30 [00:00<00:00, 691368.79it/s]
make buckets
number of images per bucket (including repeats)
bucket 0: resolution (512, 1024), count: 60
bucket 1: resolution (576, 1024), count: 280
bucket 2: resolution (704, 1024), count: 40
bucket 3: resolution (768, 1024), count: 100
bucket 4: resolution (832, 1024), count: 40
bucket 5: resolution (1024, 704), count: 20
bucket 6: resolution (1024, 768), count: 20
bucket 7: resolution (1024, 1024), count: 40
mean ar error (without repeats): 0.0
prepare accelerator
accelerator device: cuda
Loading SD3 models from /content/pretrained_model/sd3_medium.safetensors
loading model for process 0/1
Building VAE
Loading state dict...
Loaded VAE: <All keys matched successfully>
[Dataset 0]
caching latents.
checking cache validity...
100% 30/30 [00:00<00:00, 554313.30it/s]
caching latents...
0it [00:00, ?it/s]
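The "0it" on the caching pass means every image already had a valid-looking cache file, so no latents were re-encoded with the SD3 VAE; 4-channel latents left over from an earlier SD1.x/SDXL run would be reused as-is. A sketch of clearing such a cache so it is rebuilt (the `.npz` filename pattern matches how sd-scripts stores per-image latents, but verify it against your own directory before deleting anything; `purge_latent_cache` is a hypothetical helper, not part of the trainer):

```python
import pathlib
import tempfile

def purge_latent_cache(image_dir: str, pattern: str = "*.npz") -> int:
    """Delete cached latent files under image_dir; return how many were removed."""
    removed = 0
    for cache_file in pathlib.Path(image_dir).glob(pattern):
        cache_file.unlink()
        removed += 1
    return removed

# demo on a throwaway directory instead of the real train_data folder
demo_dir = tempfile.mkdtemp()
(pathlib.Path(demo_dir) / "img_0001.npz").touch()
(pathlib.Path(demo_dir) / "img_0002.npz").touch()
count = purge_latent_cache(demo_dir)
```

After purging (or passing the trainer's option to re-cache latents, if available in your version), the next run should re-encode every image with the 16-channel SD3 VAE.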
loading model for process 0/1
Loading clip_l from /content/pretrained_model/clip_l.safetensors...
Building ClipL
Loading state dict...
Loaded ClipL: <All keys matched successfully>
loading model for process 0/1
Loading clip_g from /content/pretrained_model/clip_g.safetensors...
Building ClipG
Loading state dict...
Loaded ClipG: <All keys matched successfully>
loading model for process 0/1
Loading t5xxl from /content/pretrained_model/t5xxl_fp16.safetensors...
Building T5XXL
Loading state dict...
Loaded T5XXL: <All keys matched successfully>
[Dataset 0]
caching text encoder outputs.
checking cache existence...
100% 30/30 [00:00<00:00, 134146.18it/s]
caching text encoder outputs...
0it [00:00, ?it/s]
loading model for process 0/1
Building MMDiT
Loading state dict...
Loaded MMDiT: <All keys matched successfully>
train mmdit: True
number of models: 1
number of trainable parameters: 2028328000
prepare optimizer, data loader etc.
use Adafactor optimizer | {'scale_parameter': False, 'relative_step': False, 'warmup_init': False}
constant_with_warmup may be a good scheduler choice
running training
num examples: 600
num batches per epoch: 150
num epochs: 53
batch size per device: 4
gradient accumulation steps: 4
total optimization steps: 2014
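The counts above are internally consistent, which confirms the dataset itself was read correctly before the crash. A quick arithmetic check, assuming (to match the printed numbers) that partial gradient-accumulation groups at the end of an epoch round up to one optimizer step:

```python
import math

images, repeats = 30, 20          # from the subset config
batch_size, grad_accum = 4, 4     # from the run settings
epochs = 53

num_examples = images * repeats                        # 30 * 20 = 600
batches_per_epoch = num_examples // batch_size         # 600 / 4 = 150
steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)  # ceil(150/4) = 38
total_steps = steps_per_epoch * epochs                 # 38 * 53 = 2014
```

So the failure is not a data-count problem; it occurs only once the first batch of cached latents reaches the model.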
steps: 0% 0/2014 [00:00<?, ?it/s]
epoch 1/53
epoch is incremented. current_epoch: 0, epoch: 1 (message repeated 8 times)
Traceback (most recent call last):
File "/content/kohya-trainer/sd3_train.py", line 974, in