xuanhua

Results: 4 issues by xuanhua

Hi guys, I found an assertion failure when training Hugging Face's LoRA-based model in pipeline style. Here are the steps I used to create my model: 1)...

**Describe the bug** I have two Ubuntu machines connected by a 10 Gb/s Ethernet cable, and I want to use DeepSpeed with these two machines to run a model training...

bug
training

When I installed icetk using `pip install icetk`, I saw that icetk's version is 0.0.5. But when I went back to this code repo, I could not find...

Hi guys, I have an M1 Ultra Mac Studio and a Linux box, and I want to use both of them for distributed model training. But I found that Gloo does...