dumpmemory

Results: 51 comments by dumpmemory

@markusdr you might use the following:

```python
optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
```
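For context, a minimal runnable sketch of how that optimizer fits into a training step (the toy model, data, and learning rate are placeholders, not from the thread):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # stand-in model; substitute your own
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

x, y = torch.randn(16, 10), torch.randn(16, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()        # accumulate gradients
optimizer.step()       # apply the AdamW update
optimizer.zero_grad()  # clear gradients for the next step
```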

> Did you face a GPU memory increase with the ZeRO-3 setting?

I found the reason: we can disable the zero-init option to fix the GPU memory increase with...
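For readers hitting the same issue, a minimal sketch of what disabling zero-init might look like, assuming HF Accelerate's `DeepSpeedPlugin` API (the `zero3_init_flag` name comes from Accelerate, not from this thread):

```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# zero3_init_flag=False skips deepspeed.zero.Init() during model loading,
# i.e. the "zero init option" being disabled here.
ds_plugin = DeepSpeedPlugin(zero_stage=3, zero3_init_flag=False)
accelerator = Accelerator(deepspeed_plugin=ds_plugin)
```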

I have set `scaled_dot_product_attention` as the default when torch 2.0 is installed. It should be as efficient as the original.
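To illustrate what "default when torch 2.0 is installed" means in practice, a hedged sketch of version-gated dispatch (the `attention` wrapper and naive fallback are illustrative, not the PR's code):

```python
import torch
import torch.nn.functional as F

def attention(q, k, v, dropout_p=0.0, causal=True):
    # torch >= 2.0 exposes the fused kernel; older versions fall through.
    if hasattr(F, "scaled_dot_product_attention"):
        return F.scaled_dot_product_attention(
            q, k, v, dropout_p=dropout_p, is_causal=causal
        )
    # Naive attention fallback for torch < 2.0.
    scores = (q @ k.transpose(-2, -1)) / q.size(-1) ** 0.5
    if causal:
        mask = torch.ones(
            q.size(-2), k.size(-2), dtype=torch.bool, device=q.device
        ).tril()
        scores = scores.masked_fill(~mask, float("-inf"))
    return F.dropout(scores.softmax(dim=-1), p=dropout_p) @ v
```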

> I tested this PR with Torch 2.0 on my 4x40GB A100, but found that it is 2x slower than the original flash attention implementation. I haven't dug into the...
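Not from the thread, but one way to diagnose such a slowdown is to pin SDPA to the flash backend; in torch 2.0 this is done with the `torch.backends.cuda.sdp_kernel` context manager (a sketch; it needs a CUDA device and half-precision inputs):

```python
import torch
import torch.nn.functional as F
from torch.backends.cuda import sdp_kernel

q = k = v = torch.randn(4, 16, 1024, 64, device="cuda", dtype=torch.bfloat16)

# Disallow the math and mem-efficient fallbacks; if this raises, SDPA was
# silently choosing a slower backend for these shapes/dtypes.
with sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```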

> @haotian-liu Hi, my deepspeed config is just the accelerate config with the ZeRO-2 setting, CPU offload, and bf16 enabled. I will upload it later.
>
> deepspeed.json
>
> ```json
> {
>   "train_batch_size": "auto",
>   "train_micro_batch_size_per_gpu": ...
> ```
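Since the file above is truncated, here is a hedged sketch of what a typical ZeRO-2 + CPU-offload + bf16 DeepSpeed config contains (illustrative values, not the author's actual file), written as the Python dict DeepSpeed also accepts:

```python
ds_config = {
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "bf16": {"enabled": True},  # bf16 mixed precision
    "zero_optimization": {
        "stage": 2,  # ZeRO stage 2
        "offload_optimizer": {"device": "cpu", "pin_memory": True},  # CPU offload
    },
}
```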

> Hello @dumpmemory, great work getting this issue solved from DeepSpeed and raising the fix here. Could you apply the fix to all places in lora and adalora wherein `F.linear`...

@pacman100 please help me check it. I have replaced all the `F.linear` calls.
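To make the intent of such a change concrete, a small self-contained sketch (not the actual PR diff): frameworks like DeepSpeed attach pre-forward hooks to modules, and a bare functional call can bypass them, while routing through the module runs the same math with the hooks in the path.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

layer = nn.Linear(8, 4)
x = torch.randn(2, 8)

# F.linear(x, W, b) computes x @ W.T + b directly ...
functional = F.linear(x, layer.weight, layer.bias)
# ... while calling the module produces the same result but also fires any
# hooks (e.g. DeepSpeed's parameter-gathering hooks) registered on it.
modular = layer(x)

assert torch.allclose(functional, modular)
```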

> Thank you @dumpmemory for iterating, LGTM! 🤗
>
> Could you run `make style` and `make quality` to fix the quality issues?

Yes, I will. I will also test...