sd-scripts

support HunYuan DiT

Open KohakuBlueleaf opened this issue 8 months ago • 19 comments

[WIP] This is a draft PR so that contributors can check the progress and review the code.

This PR starts with my simple implementation for minimal inference, plus some modifications:

  1. modify the initialization method of HunYuanDiT so that it no longer requires argparse
  2. replace flash_attn with the PyTorch SDP (scaled dot-product attention) and xformers implementations.
  3. implement the gradient checkpointing mechanism to save TONS OF VRAM.
  4. support the "CLIP concat" trick for long prompts.
    • Needs review by the HunYuan team. Should behave the same as the original with max_length_clip=77
  5. a test script for a quick inference check.
    • I didn't follow the style of xxx_minimal_inference, so I called it hunyuan_test.py, but it can be seen as a minimal inference script
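To make the "CLIP concat" idea in item 4 concrete, here is a minimal sketch of the usual approach: split the token ids into windows that fit the text encoder, wrap each window with start/end tokens, and pad it to max_length_clip. This is an illustration, not the code in this PR. The helper name chunk_token_ids and the bos_id/eos_id/pad_id values are assumptions; take the real ids from the tokenizer you load.

```python
from typing import List


def chunk_token_ids(
    token_ids: List[int],
    bos_id: int,
    eos_id: int,
    pad_id: int,
    max_length_clip: int = 77,
) -> List[List[int]]:
    """Split a long prompt into encoder-sized windows (hypothetical helper)."""
    body = max_length_clip - 2  # room left after the start and end tokens
    if body <= 0:
        raise ValueError("max_length_clip must be greater than 2")
    chunks = []
    # Always emit at least one window, even for an empty prompt.
    for start in range(0, max(len(token_ids), 1), body):
        window = token_ids[start:start + body]
        padded = [bos_id] + window + [eos_id]
        padded += [pad_id] * (max_length_clip - len(padded))
        chunks.append(padded)
    return chunks


# Placeholder special-token ids, for illustration only.
chunks = chunk_token_ids(list(range(1000, 1100)), bos_id=1, eos_id=2, pad_id=0)
print(len(chunks), [len(c) for c in chunks])
```

In this sketch each window would be encoded separately and the resulting hidden states concatenated along the sequence dimension. A prompt that fits in a single window yields exactly one chunk, which is why the result is expected to match the original behaviour when max_length_clip=77.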

Notes about loading model

The directory structure I used is:

model/
  clip/
  denoiser/
  mt5/
  vae/

Basically, download the files from the t2i folder of HunYuanDiT, then put the contents of clip_text_encoder and tokenizer into clip, mt5 into mt5, model into denoiser, and sdxl-vae-fp16-fix into vae.

This spec can be changed if needed.
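As a quick sanity check before running the test script, the layout above can be verified with a few lines of Python. This is only a sketch: missing_subdirs is a hypothetical helper that is not part of this PR, and it checks only that the four sub-folders exist, not what they contain.

```python
from pathlib import Path
from typing import List

# Sub-folders named in the directory structure above.
EXPECTED_SUBDIRS = ("clip", "denoiser", "mt5", "vae")


def missing_subdirs(model_root: str) -> List[str]:
    """Return the expected sub-folders that do not exist under model_root."""
    root = Path(model_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subdirs("model")
    if missing:
        print("missing sub-folders:", ", ".join(missing))
    else:
        print("model layout looks complete")
```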

TODO List

  • [ ] Examine the current implementation of the modified parts
  • [ ] Bundle format support (if possible)
  • [x] training utils (if needed)
    • [x] *Tokenizer/TE related for Dataset
  • [ ] training script (modify from sdxl_train.py)
  • [ ] *lora/lycoris training script (modify from sdxl_train_network.py)
    • [x] Initial Support.
    • [ ] Unique training arg region.
    • [ ] Implementation check.
  • [ ] *lora module support
    • [ ] kohya lora
    • [x] LyCORIS

Low Priority TODO List

  • [ ] cache TE embeddings.

Notification to contributors

  • You can assume the create_network method from the imported network module will work correctly.
    • Kohya and I will ensure that.
  • Check sdxl_train.py, sdxl_train_network.py, and the dataset code carefully before starting development. It is very likely that we only need a few modifications to make things work. Try to avoid any full rework.
  • If you want to contribute to this PR, open another PR into this branch: https://github.com/KohakuBlueleaf/sd-scripts/tree/HunYuanDiT
    • I will check all related PRs and issues frequently this week.

KohakuBlueleaf, Jun 21 '24 10:06