Training support for HiDream-I1?

Open nitinh12 opened this issue 8 months ago • 17 comments

Can this be added for 48GB VRAM?

nitinh12 avatar Apr 10 '25 10:04 nitinh12

@kohya-ss @rockerBOO SimpleTuner has already started implementing this. Can you please add it too? I'm more used to sd-scripts.

nitinh12 avatar Apr 12 '25 10:04 nitinh12

+1 All other trainers have it as well

EClipXAi avatar Apr 18 '25 00:04 EClipXAi

@rockerBOO @kohya-ss AI Toolkit has also implemented this.

nitinh12 avatar Apr 18 '25 09:04 nitinh12

Thank you for your suggestion. HiDream-I1 is a very interesting model. It's good that other trainers have already implemented it, so we can refer to them.

However, we would also like to support FramePack in Musubi Tuner. We will consider the priority.

kohya-ss avatar Apr 18 '25 13:04 kohya-ss

I would prioritize HiDream.

nitinh12 avatar Apr 18 '25 13:04 nitinh12

@kohya-ss this got 6 thumbs up; will you prioritize it now?

nitinh12 avatar Apr 22 '25 08:04 nitinh12

I was excited for FramePack, but it turned out to be a lot of hype, to be honest. It's promising for the future once they train a WAN version, but since it's a HunyuanVideo-based model, the i2v consistency is as poor as we're used to, and it takes much longer to generate even a 5-second video.

The consistency claims were apparently overblown too. It can only maintain about 52 frames (2 seconds) of consistency because it doesn't see past that. So if you just want someone to dance on the spot for longer than 10 seconds, it manages that much better than cutting and stitching end frames together, but it's not going to generate one-minute short films or anything with consistent characters across scenes. What it's genuinely good at is avoiding accumulation errors over time.

It's also already been superseded by MAGI-1, which at least theoretically does what FramePack promised (real-world tests on their site didn't bear any resemblance to the demos, but maybe the free tier only offers the 4.5B version). MAGI-1 is way out of reach for training anyway, though: it's open source, but even inference requires 8x H100s.

Tophness avatar Apr 23 '25 03:04 Tophness

Also, I think T5 training isn't worth it for HiDream, and probably CLIP training isn't either; it seems LLaMA does the heavy lifting.

So it would be nice if we could train the text encoders for HiDream, maybe just the CLIP and LLaMA ones.

https://github.com/tdrussell/diffusion-pipe

diffusion-pipe also seems to have training working with 24GB VRAM.

Looking forward to when you'll work on HiDream :)

EClipXAi avatar Apr 27 '25 01:04 EClipXAi

I've almost finished the work related to FramePack, so I'd like to start working on sd-scripts issues and PRs, as well as HiDream-I1.

I can't promise when that will be, though. Thank you for your understanding.

kohya-ss avatar Apr 27 '25 02:04 kohya-ss

Any update? Have you started working on this? I am looking forward to this.

nitinh12 avatar May 06 '25 12:05 nitinh12

I'm very curious about how your HiDream training script is going, and I can't wait to try it out.

fengchunlvdragonplus avatar May 09 '25 01:05 fengchunlvdragonplus

any news?? I'm very excited to test it on kohya_ss!!!😎

dsienra avatar May 17 '25 04:05 dsienra

@kohya-ss Please add this. I see you're very active with Musubi Tuner, but please don't forget us.

nitinh12 avatar May 20 '25 11:05 nitinh12

I'm sorry for the delay. I'll try to find some time to work on Lumina and HiDream.

kohya-ss avatar May 20 '25 11:05 kohya-ss

Can you do Chroma too, please?

nanaj96 avatar May 21 '25 18:05 nanaj96

@kohya-ss SimpleTuner caches the text encoder and VAE outputs first and then unloads those models from the GPU, which saves a lot of VRAM and lets us train without quantizing. I'd love to see a similar implementation in kohya's scripts.

nitinh12 avatar May 30 '25 13:05 nitinh12
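For context, the "cache then unload" idea mentioned above can be sketched roughly as follows. This is a minimal illustration of the pattern, not SimpleTuner's actual code: every caption/image is run through the encoder once, only the outputs are kept, and the encoder is then dropped so its weights stop occupying memory. The class and function names here are hypothetical, and the placeholder `encode()` stands in for a real text encoder or VAE forward pass.

```python
import gc

class DummyEncoder:
    """Stands in for a text encoder / VAE; encode() is the expensive part."""
    def encode(self, item):
        # Placeholder for real embeddings (e.g. a tensor from a forward pass).
        return [ord(c) for c in item]

def precompute_caches(dataset, encoder):
    """Run every item through the encoder once; keep only the outputs."""
    return {item: encoder.encode(item) for item in dataset}

dataset = ["a photo of a cat", "a photo of a dog"]
encoder = DummyEncoder()
cache = precompute_caches(dataset, encoder)

# Unload: drop the encoder so its weights no longer occupy (GPU) memory.
# With PyTorch this step would be `encoder.to("cpu")` or `del encoder`,
# followed by `torch.cuda.empty_cache()`.
del encoder
gc.collect()

# The training loop then reads embeddings from the cache instead of
# re-encoding, so the text encoder / VAE never need to stay resident.
embedding = cache["a photo of a cat"]
```

The trade-off is disk/RAM for the cached outputs versus VRAM for the live models; for fixed captions and images the outputs never change, so caching once up front is safe.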

I'm gently and politely wondering if we can expect a HiDream addition to the kohya family?

foggyghost0 avatar Jun 28 '25 13:06 foggyghost0