bingnandu

Results: 3 comments by bingnandu

I have also encountered this problem. Has it been resolved, and if so, how?

During pre-training of a GPT model, if the --profile flag is enabled and exit-interval is greater than profile-step-start, the application terminates abnormally with a -6 exit...

I ran into the same problem. I have installed libaio-dev and CUDA is working fine, but the error still occurs. My xtuner version is: root@dbn-test-m-0:/data/dubingnan/dbn-ceph# xtuner version 11/12 14:07:27 - mmengine - INFO - 0.1.23 I ran xtuner train llama3_8b_instruct_qlora_alpaca_e3, and the error output is as follows: root@dbn-test-m-0:/data/dubingnan/dbn-ceph/xtuner# xtuner train llama3_8b_instruct_qlora_alpaca_e3 [2024-11-12 13:41:23,345] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)...