zhangvia

54 comments by zhangvia

> The device map where you can specify which device should get what ratios for splitting.
>
> There is no one single answer to the other question, as it...
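
For context, a minimal sketch of the kind of device map being described, assuming a diffusers pipeline; the model id and per-device memory budgets are placeholders, and it is the `max_memory` entries that act as the "ratios" controlling how the components are split.

```python
import torch
from diffusers import DiffusionPipeline

# Hypothetical model id and memory budgets; the budgets decide how much of
# the pipeline each GPU receives when the components are split across devices.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    device_map="balanced",
    max_memory={0: "6GiB", 1: "12GiB"},
)
```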

> Can this be trained on a V100 with 32 GB?

With DeepSpeed, ZeRO stage 1 brings it down to around 33 GB on two cards, and using more cards brings it down a bit further. You could consider using ColossalAI to offload most of the parameters and optimizer states to the CPU, but I never managed to get that working.

> > > Can this be trained on a V100 with 32 GB?
> >
> > With DeepSpeed, ZeRO stage 1 brings it down to around 33 GB on two cards, and using more cards brings it down a bit further. You could consider using ColossalAI to offload most of the parameters and optimizer states to the CPU, but I never managed to get that working.
>
> Hello. Could you share the DeepSpeed code that brings stage 1 down to 33 GB on two cards? Thank you very much.

You can refer to the Moore Threads training code and launch DeepSpeed through Accelerate.
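
This is not the Moore Threads script itself, just a minimal sketch of launching DeepSpeed ZeRO stage 1 through Accelerate; the model, optimizer, data, and hyperparameters are placeholders.

```python
import torch
from torch.utils.data import DataLoader
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# Tiny placeholder model/data so the sketch is self-contained; swap in the
# real model, optimizer, and dataloader from the training script.
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
dataloader = DataLoader([(torch.randn(16), torch.randn(1)) for _ in range(8)], batch_size=2)

# ZeRO stage 1 shards the optimizer states across the participating GPUs.
plugin = DeepSpeedPlugin(zero_stage=1, gradient_accumulation_steps=1)
accelerator = Accelerator(deepspeed_plugin=plugin)
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for x, y in dataloader:
    loss = torch.nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)
    optimizer.step()
    optimizer.zero_grad()
```

Launched with something like `accelerate launch --num_processes 2 train.py` on the two cards.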

> Hello, I'm using the Moore Threads training code, and DeepSpeed training hangs after 4 steps. Did you get their code working with DeepSpeed? Could you explain how you used it?

It's probably a seed issue; the seed has to be the one used in the Moore Threads code.
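
A minimal sketch of that, assuming the script seeds through Accelerate; the value 42 stands in for whatever seed the reference code actually uses. One plausible reason a mismatched seed can hang distributed training is that ranks end up on different random code paths and then wait on each other in collective ops.

```python
from accelerate.utils import set_seed

# Hypothetical value; use the exact seed from the reference training script,
# and call this on every rank before building the model and dataloaders.
set_seed(42)
```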

> you can use deepspeed to reduce gpu memory

I did, but the process gets stuck at backward on the 5th step. I was confused.

Alibaba is the one embarrassing Chinese people. If you don't want to open-source it, then don't; nobody would criticize you for that. But announcing it as open source and then never releasing the code, that is what makes you a fraud.

> Is this still a problem?

Given that device_map="auto" can only place the model across different GPUs according to the model size, I add an AlignDevicesHook to every model manually....
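
Roughly what that manual placement can look like, as a sketch rather than the exact code from this thread; the pipeline, component names, and device assignment are assumptions.

```python
import torch
from accelerate.hooks import AlignDevicesHook, add_hook_to_module
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Hypothetical split: pin each component to a chosen GPU, then attach a hook
# that moves its inputs to that device before every forward call and returns
# outputs on the device the inputs came from.
assignment = {"unet": "cuda:0", "vae": "cuda:1", "text_encoder": "cuda:1"}
for name, device in assignment.items():
    module = getattr(pipe, name).to(device)
    add_hook_to_module(
        module, AlignDevicesHook(execution_device=device, io_same_device=True)
    )
```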

```python
if is_model_cpu_offload:
    _pipeline.enable_model_cpu_offload()
elif is_sequential_cpu_offload:
    _pipeline.enable_sequential_cpu_offload()
```

It just calls enable_model_cpu_offload() or enable_sequential_cpu_offload(). Actually, I never call enable_model_cpu_offload() or enable_sequential_cpu_offload(); my AlignDevicesHooks are added to the models manually...

> Thing is we are not supposed to call any offloading related utilities manually when any component underlying a pipeline was initialized with "balanced" device_map.

I agree with that. But...
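
For reference, a sketch of the situation under discussion, with a placeholder model id: once a pipeline is loaded with a "balanced" device_map, accelerate has already attached its own device hooks to the components, which is why stacking the offloading helpers on top of it is not supported.

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder model id
    torch_dtype=torch.float16,
    device_map="balanced",
)

# The components already carry device hooks from the balanced placement,
# so the offloading helpers are not meant to be called on top of this:
# pipe.enable_model_cpu_offload()       # not supported here
# pipe.enable_sequential_cpu_offload()  # not supported here
```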