Jiarui Fang(方佳瑞)
Can you post more information about PyTorch's clip_grad_value? How is it used? A code snippet would be helpful.
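For reference, a minimal sketch of how PyTorch's gradient value clipping is typically used via `torch.nn.utils.clip_grad_value_` (the small model here is just a placeholder):

```python
import torch
import torch.nn as nn

# Placeholder model; any nn.Module with gradients works the same way.
model = nn.Linear(4, 2)
loss = model(torch.randn(8, 4)).sum()
loss.backward()

# Clamp every gradient element into [-1.0, 1.0], in place.
torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=1.0)

max_abs_grad = max(p.grad.abs().max().item() for p in model.parameters())
print(max_abs_grad <= 1.0)
```

Note that `clip_grad_value_` clips each element independently, unlike `clip_grad_norm_`, which rescales the whole gradient vector by its norm.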
@kurisusnowdeng @haofanwang Can we close this PR? It looks like the problem has been solved.
I guess the input should be torch.half? The ColossalAI strategy converts model parameters to torch.half automatically.
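A minimal sketch of the dtype mismatch, assuming the strategy has cast the parameters to half precision: the input tensor should be cast to the same dtype before the forward pass (the `Linear` layer here is just a stand-in for the real model):

```python
import torch

# Stand-in model whose parameters have been cast to half precision,
# as the ColossalAI strategy would do automatically.
model = torch.nn.Linear(4, 4).half()

x = torch.randn(2, 4)
x = x.half()  # cast the input to match the parameter dtype

param_dtype = next(model.parameters()).dtype
print(x.dtype == param_dtype)
```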
@yhcc Thanks for your help. We have not yet tested hybrid parallelism with low-level zero and tensor parallelism. Could you please run some tests and submit a pull request...
`->` /data/scratch/diffuser/laion_part0
Thanks for your attention to our project. Integration with diffuser is on our schedule. We plan to add ColossalAI to dreambooth first, and then we will consider working on the other...
@qq110146 @flymin Hi all, sorry for the bug. I believe @1SAA has fixed it.
You are right. GeminiDDP can only handle parameters from a single model, not parameters from multiple models. However, you can support the train_text_encoder option with some code modifications.
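One possible modification, sketched under the assumption that the wrapper only accepts a single model: combine the text encoder and the main model into one `nn.Module`, so all parameters are visible through a single module. The `CombinedModel` class and the `Linear` stand-ins below are hypothetical, not part of ColossalAI:

```python
import torch.nn as nn

class CombinedModel(nn.Module):
    """Hypothetical wrapper exposing both sub-models' parameters as one model."""
    def __init__(self, text_encoder, unet):
        super().__init__()
        self.text_encoder = text_encoder
        self.unet = unet

# Small Linear layers as stand-ins for the real text encoder and UNet.
text_encoder = nn.Linear(8, 8)
unet = nn.Linear(8, 8)

combined = CombinedModel(text_encoder, unet)
# The wrapper now sees the union of both parameter sets.
n_params = sum(p.numel() for p in combined.parameters())
```

A single-model wrapper applied to `combined` would then manage the text encoder's parameters alongside the main model's.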
We are considering making Gemini more flexible, but we have no such plan in the next two weeks.
We have had a very intensive discussion on this. Could you please report the latest conclusion, @1SAA @ver217?