Hiusam
Same problem here. I am using Railway.
> I was able to train using deepspeed on 8 V100 GPUs. Here is the training script and deepspeed config file.
>
> torchrun --nproc_per_node=8 --master_port=9776 train.py --model_name_or_path hf_model/llama-7b --data_path...
Same question. What version of torch?
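For anyone comparing setups, here is a minimal sketch for reporting installed versions. The helper name `report_versions` and the default package list are my own assumptions for illustration, not something taken from the project.

```python
import platform
from importlib import metadata


def report_versions(packages=("torch", "deepspeed", "transformers")):
    """Map each package name to its installed version, or None if it is missing.

    The default package list is an assumption; pass whatever you want to check.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

It only reads package metadata, so it does not import torch itself and runs even when a package is missing.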
I trained on 8 A100 80GB GPUs with a per-device batch size of 2 for 3 epochs, which took 50 minutes. The final training loss is: ``` { "epoch": 3.0, "step": 1218, "total_flos": 1.2039681644848742e+17,...
> Nope, it works fine, GPU usage was 100%

Hi, are you using PyTorch 1.13?
Me too, on an M1 Pro with Python 3.9 and Open3D 0.15.1.
> It was docker image built from the Dockerfile in project.

I have the same problem as you. I am using Docker. Have you fixed the problem since?
Thanks for your reply. Why don't you train an object classifier with all the object point clouds in the training split and use that one for both datasets (NR3D and...