JFDuan
Results: 2 issues of JFDuan
This PR accelerates LLaMA model weight loading with safetensors. I find that the time cost of the current weight-loading implementation doubles as the tensor-model parallelism increases (refer to the following...
I followed the instructions in your CNN benchmark for training resnet50 with sync data. After I ran `train.sh`, it failed with the following information. Can you offer some help? ``` ------------------------------------------------------------------...