Daniel-1997
When reading through the parameters that can be set for finetune.py, I got a little confused, since there are several parameters related to evaluation during training: --validation_file: I did not find...
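For context, this is how --validation_file is usually wired up in HuggingFace-style fine-tuning scripts. It is only a sketch under that assumption; the actual flag names and loading logic in this repo's finetune.py may differ.

```python
# Minimal sketch (assumption: finetune.py follows the common HuggingFace
# pattern where --validation_file points at a JSON/JSONL dataset that is
# used only for periodic evaluation, not for training).
import argparse
from datasets import load_dataset

parser = argparse.ArgumentParser()
parser.add_argument("--train_file", type=str, required=True)
parser.add_argument("--validation_file", type=str, default=None,
                    help="Optional eval split; if omitted, no evaluation runs during training.")
args = parser.parse_args()

data_files = {"train": args.train_file}
if args.validation_file is not None:
    data_files["validation"] = args.validation_file

# Both splits are loaded from JSON files; only "validation" feeds evaluation.
raw_datasets = load_dataset("json", data_files=data_files)
```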
GPU setup: 2x V100 32G (four in total, but two are currently occupied by other users; once they free up, all 4 V100s can be used). With the default accelerate config I get a CUDA out of memory error. I noticed that in the default config both offload_optimizer_device and offload_param_device are set to none, so, following the accelerate tutorial, I changed both to cpu and then got an error. The accelerate config is as follows:
command_file: null
commands: null
compute_environment: LOCAL_MACHINE
deepspeed_config:
  gradient_accumulation_steps: 1
  gradient_clipping: 1.0
  offload_optimizer_device: cpu
...
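In case it helps with debugging, below is a minimal sketch of the same offload settings expressed through accelerate's DeepSpeedPlugin rather than the YAML file. The zero_stage value is an assumption (the config above is truncated); note that offload_param_device only takes effect under ZeRO stage 3, so with stage 2 only the optimizer states are actually offloaded.

```python
# Sketch of the offload settings above, built programmatically.
# Assumptions: the training script constructs its own Accelerator,
# and ZeRO stage 2 is in use (the posted YAML is truncated).
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

deepspeed_plugin = DeepSpeedPlugin(
    zero_stage=2,                    # assumption: stage from the truncated config
    gradient_accumulation_steps=1,
    gradient_clipping=1.0,
    offload_optimizer_device="cpu",  # move optimizer states to CPU RAM
    offload_param_device="cpu",      # parameter offload; only effective with ZeRO stage 3
)
accelerator = Accelerator(deepspeed_plugin=deepspeed_plugin)
```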
As above: running the inference code provided in this project directly, the model and data are loaded onto GPU 0, but I noticed that GPUs 2, 3, and 4 also show memory usage. GPU 0 uses the most (13G+), while the other GPUs each use about 4G+. What could be causing this?
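One common cause is that the process initializes a CUDA context on every visible GPU, or that the model is loaded with an automatic device map that spreads weights across cards. A minimal sketch for checking this by pinning the process to a single card is below; the model path and the AutoModel/trust_remote_code usage are assumptions, not the project's actual inference code.

```python
# Sketch: restrict the process to GPU 0 so no memory appears on other cards.
# CUDA_VISIBLE_DEVICES must be set before torch initializes CUDA.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
from transformers import AutoModel, AutoTokenizer

MODEL_PATH = "path/to/model"  # placeholder, not the project's actual path

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).half().cuda()
model.eval()
```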