MAxx8371

Results 4 comments of MAxx8371

What causes this error? Is it because the bin_size of a categorical feature is larger than max_bin, or because there is not enough memory? And the model...

Full-parameter fine-tuning with ZeRO-3 and output_router_logits=True. Training suddenly hangs partway through, and GPU utilization jumps to 100%. ![image](https://github.com/QwenLM/Qwen1.5/assets/96909430/096c34cf-fb9c-4e1e-b694-47a5a104d6b9)

> Fixed on master

I installed PyTorch by running "pip install torch", and you said "Fixed on master" on GitHub. Would you please explain how to update it...

> I have a similar problem. My cluster has a relatively slow shared storage system, so I want to copy the dataset to the compute node's temporary storage. However, I found...