
[FEATURE]: Any plan to support train_dreambooth_colossalai with train_text_encoder?

Open vonchenplus opened this issue 1 year ago • 15 comments

Describe the feature

I found that the train_dreambooth_colossalai script you provide no longer supports the train_text_encoder parameter, yet training the text encoder gives much better results, especially on faces. Do you have plans to support this parameter again?

Thanks again!

vonchenplus avatar Jan 04 '23 06:01 vonchenplus

+1

blx0102 avatar Jan 05 '23 10:01 blx0102

Have you solved this problem?

mingqizhang avatar Jan 09 '23 08:01 mingqizhang

No, ColossalAI's optimizer cannot handle parameters from multiple networks.
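(For context, the limitation being described: in plain PyTorch a single optimizer can own parameters from several models, but per this thread Gemini's optimizer is bound to one Gemini-managed module. Below is a hypothetical illustration of the plain-PyTorch pattern that does not carry over; the Linear modules are placeholder stand-ins.)

```python
# Hypothetical illustration: plain PyTorch lets one optimizer own parameters
# from several models. Per this thread, ColossalAI's Gemini optimizer cannot,
# since it is tied to a single Gemini-managed module.
import itertools

import torch
import torch.nn as nn

# Placeholder stand-ins for the real UNet and text encoder.
unet = nn.Linear(4, 4)
text_encoder = nn.Linear(4, 4)

# Works with torch.optim, but not with Gemini's optimizer.
params = itertools.chain(unet.parameters(), text_encoder.parameters())
optimizer = torch.optim.AdamW(params, lr=5e-6)
```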

vonchenplus avatar Jan 09 '23 08:01 vonchenplus

You are right. GeminiDDP can only deal with parameters from a single model, not parameters from multiple models. However, you can support the train_text_encoder parameter with some code modifications.

feifeibear avatar Jan 09 '23 08:01 feifeibear

We are considering making Gemini more flexible, but there is no such plan for the next two weeks.

feifeibear avatar Jan 09 '23 08:01 feifeibear

+1. It would be much more flexible if we could train multiple modules, which is common when finetuning Stable Diffusion.

haofanwang avatar Jan 17 '23 13:01 haofanwang

@feifeibear any update for this?

haofanwang avatar Feb 03 '23 08:02 haofanwang

hi @feifeibear , thanks for your great work! Would you explain a little about which part I should change to support the text_encoder parameters? My thinking: put all the models in a single model class, make the unet and text encoder trainable together, and freeze the other models. That single model would then be the input to the GeminiDDP class (see the sketch below). Is that correct? Any help would be appreciated! Thank you so much!
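For illustration, a minimal sketch of that single-wrapper idea. The class name UNetWithTextEncoder and the forward signature are hypothetical, and the GeminiDDP call shown in the trailing comment is an assumption based on this thread, not a confirmed ColossalAI API:

```python
# Minimal sketch of the single-wrapper idea (hypothetical; names and the
# exact GeminiDDP call are assumptions, not a confirmed ColossalAI API).
import torch
import torch.nn as nn

class UNetWithTextEncoder(nn.Module):
    """Holds both trainable sub-models so Gemini sees one parameter set."""

    def __init__(self, unet, text_encoder):
        super().__init__()
        self.unet = unet                   # trainable
        self.text_encoder = text_encoder   # trainable

    def forward(self, noisy_latents, timesteps, input_ids):
        # Encode the prompt with the (now trainable) text encoder.
        encoder_hidden_states = self.text_encoder(input_ids)[0]
        # Predict the noise residual with the UNet.
        return self.unet(noisy_latents, timesteps, encoder_hidden_states).sample

# The frozen models (e.g. the VAE) stay outside the wrapper with
# requires_grad_(False); only the wrapper would be handed to Gemini, e.g.:
#   model = GeminiDDP(UNetWithTextEncoder(unet, text_encoder), ...)
```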

hdjsjyl avatar Feb 26 '23 11:02 hdjsjyl

We have had a very intensive discussion on this. Could you please report the latest conclusion, @1SAA , @ver217 ?

feifeibear avatar Feb 26 '23 14:02 feifeibear

Hi @feifeibear , Thanks for your reply.

Hi @1SAA and @ver217 , would you share your latest conclusion here? I am looking forward to your reply.

Thank you so much!

shileims avatar Feb 27 '23 01:02 shileims

Hi @feifeibear , @1SAA , and @ver217 , any advice would be very much appreciated, since I have been stuck here for a long time. Really looking forward to your reply. Please feel free to post your solution here; I can validate the idea. Thank you so much!

shileims avatar Feb 28 '23 05:02 shileims

Hi @feifeibear , @1SAA , @ver217 , any chance you could share your discussion here? It would be very helpful. Looking forward to your reply. Thank you so much!

shileims avatar Mar 02 '23 01:03 shileims

Hi @feifeibear , @1SAA , @ver217 , could you spend a little time describing your discussion here? It would benefit many users and many use cases. Thank you so much!

shileims avatar Mar 02 '23 22:03 shileims

Hi @feifeibear , @1SAA , @ver217 , authors of ColossalAI, would you provide some suggestions for training the text encoder and unet together? Thanks

shileims avatar Mar 06 '23 21:03 shileims

Hi @feifeibear , @1SAA , @ver217 , any updates on this question? I am still waiting for your answers. Any answer would be appreciated! Thanks

shileims avatar Mar 15 '23 21:03 shileims

Hi @feifeibear , @1SAA , and @ver217 , would you share your brainstorming here? It would be very helpful. Thanks

shileims avatar Mar 21 '23 16:03 shileims

@Fazziekey Could you answer this question?

ver217 avatar Mar 22 '23 11:03 ver217

#3146 looks like a related issue; we should solve these together.

JThh avatar Mar 22 '23 12:03 JThh