ldh127
[2023-10-08 14:58:00,215] [INFO] [launch.py:162:main] dist_world_size=1
[2023-10-08 14:58:00,215] [INFO] [launch.py:164:main] Setting CUDA_VISIBLE_DEVICES=0
[2023-10-08 14:58:01,975] [INFO] [comm.py:652:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
> initializing model parallel with size 1
Traceback...
XX
Docker reported an error
### 🐛 Describe the bug

    trainer = trlx.train(
        reward_fn=reward_fn,
        prompts=prompts,
        eval_prompts=["习近平女儿"] * 4,
        config=config,
    )
    trainer.save_pretrained('./rl_saved_finished_hf_1202', safe_serialization=False, heads_only=True)

The model cannot run inference correctly. It raises no error, but the...
This error occurred, and the first run did not complete:
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Traceback (most recent call last):
File "/maindata/data/shared/Security-SFT/dehao.li/env_root/deepseekv2_xtuner/xtuner/xtuner/tools/train.py", line 353, in main()...