Jiarui Fang(方佳瑞)

Results 220 comments of Jiarui Fang(方佳瑞)

Thanks for your feedback, I will have a look.

Did you try this prebuilt image? `docker pull thufeifeibear/turbo_transformers_gpu:latest`

I suppose so. I remember testing it. If you run into a problem, I can rebuild one.

@marsggbo Thanks for your feedback. Can you use GeminiDDP instead of `ShardedModelV2` in your code? See the latest ZeRO implementation here: https://github.com/hpcaitech/ColossalAI/blob/main/examples/language/gpt/train_gpt_demo.py#L161

@Sakura-gh Hello, I reproduced your run script and it works. I used my own dataset.json. Can you check whether your dataset is set correctly? A simple way is to test...

ZeRO is used in the context of Adam or second-order optimizers. Generally, a DNN trained with SGD does not have memory shortage issues. We can throw an error if the...

I see. We will check it later.

I recommend using an init context to solve the problem rather than changing the `colossal.nn` functionality. The ZeRO init context provides a `target_device` argument to designate the device to init...

Can you provide more information? A code snippet would be especially helpful.