Daniel-1997

Results 3 issues of Daniel-1997

While reading the parameters to set for finetune.py, I got a little confused, since there are several parameters about evaluation during training: -- validation_file: I did not find...

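GPU setup: 2× V100 32G (four cards in total; two are currently in use by someone else, and all four V100s will be available once they are free). With the default accelerate configuration I get the error: cuda out of memory. I noticed that in the default configuration both offload_optimizer_device and offload_param_device are set to none, so, following the accelerate tutorial, I changed both parameters to cpu, which produces this error: ![image](https://github.com/OpenLMLab/MOSS/assets/59271872/7d446a9f-b69c-40ad-8cb7-946a58376a00) The accelerate configuration is as follows: command_file: null commands: null compute_environment: LOCAL_MACHINE deepspeed_config: gradient_accumulation_steps: 1 gradient_clipping: 1.0 offload_optimizer_device: cpu...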

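![image](https://github.com/mymusise/ChatGLM-Tuning/assets/59271872/e0191855-2e55-4cc1-804d-72d6f2eb0628) As shown above, I am running the inference code provided in this project directly, with both the model and the data loaded onto GPU 0. However, GPUs 2, 3, and 4 also show memory usage: GPU 0 uses the most (13G+), while each of the others uses roughly 4G+. What is the reason for this? ![image](https://github.com/mymusise/ChatGLM-Tuning/assets/59271872/53888ff9-ec4c-4c73-820b-0fa3d7394eef)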
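
The issue does not say what causes the extra usage. One way to check whether it comes from this process is to hide every GPU except one before any CUDA library is initialised. The sketch below is only an illustration: `restrict_visible_gpus` is a hypothetical helper, not part of ChatGLM-Tuning, and it assumes the standard `CUDA_VISIBLE_DEVICES` environment variable is respected by the framework in use.

```python
import os


def restrict_visible_gpus(device_ids):
    """Expose only the given physical GPUs to CUDA for this process.

    Hypothetical helper for illustration. It must run before the first
    CUDA initialisation (in practice, before importing torch), otherwise
    the setting may have no effect.
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Assumption: only physical GPU 0 should be used for inference.
restrict_visible_gpus([0])

# The inference code would be imported and run after this point; inside
# the process the single visible GPU is then addressed as device 0.
```

If the other cards still show memory usage from this process after such a restriction, the cause probably lies elsewhere, for example in another process running on the same machine.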