benchmark redundant memory allocation maybe the root cause of OOMs


Open jinsong-mao opened this issue 1 year ago • 1 comments

Hi @xuzhao9 ,

During the investigation of the LLAMA_7b OOM issue, we found many redundant memory allocations that may not be necessary for the test.

1. There is a deepcopy in maybe_cast() and deepcopy_and_maybe_cast(), which duplicates the GPU memory allocated for the model: https://github.com/pytorch/benchmark/blob/main/userbenchmark/dynamo/dynamobench/common.py#L2400 https://github.com/pytorch/benchmark/blob/main/userbenchmark/dynamo/dynamobench/common.py#L2403

It looks like we need to be stricter about where deepcopy is used.

2. There is also a deepcopy in validate_model(): https://github.com/pytorch/benchmark/blob/main/userbenchmark/dynamo/dynamobench/common.py#L1918
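To illustrate the general effect described in the two points above, here is a minimal sketch using only the Python standard library. It is not dynamobench code: `ToyModel` and `traced_bytes_after` are hypothetical names, and a `bytearray` stands in for model weights on the host rather than real tensors on a GPU. It only shows that a deep copy allocates a second full-size buffer, while holding a reference does not.

```python
import copy
import tracemalloc


class ToyModel:
    """Hypothetical stand-in for a model: its 'weights' are a byte buffer."""

    def __init__(self, num_bytes):
        self.weights = bytearray(num_bytes)


def traced_bytes_after(fn):
    """Run fn under tracemalloc; return (result, bytes still allocated)."""
    tracemalloc.start()
    result = fn()  # keep the result alive while we read the counter
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, current


SIZE = 8 * 1024 * 1024  # 8 MiB of fake weights (arbitrary size)
model = ToyModel(SIZE)

# Deep copy: a second buffer of the same size is allocated.
clone, deep_cost = traced_bytes_after(lambda: copy.deepcopy(model))

# Plain reference: no new buffer is allocated.
alias, alias_cost = traced_bytes_after(lambda: model)

print(f"deepcopy allocated about {deep_cost / 2**20:.1f} MiB")
print(f"reference allocated about {alias_cost / 2**20:.1f} MiB")
```

For a multi-gigabyte model the same doubling applies to every copy that is alive at the same time.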

With the unnecessary deepcopy() calls commented out, we can run the LLAMA_7b model (which previously hit an OOM, see https://github.com/pytorch/benchmark/issues/2051 ) on a single A100 40G.
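A rough estimate shows why the number of live copies matters on a 40G card. The sketch below is back-of-the-envelope arithmetic only, under stated assumptions: a nominal 7 billion parameters, 2 or 4 bytes per parameter, and a 40 GiB budget. It counts weights alone (no activations, no allocator overhead) and does not claim to reproduce what the benchmark actually allocates.

```python
GIB = 1024 ** 3

NUM_PARAMS = 7_000_000_000  # assumption: nominal size of a "7b" model
DEVICE_GIB = 40             # assumption: treat the A100 40G as 40 GiB


def weights_gib(num_params, bytes_per_param, copies):
    """Memory needed for `copies` full sets of weights, in GiB."""
    return num_params * bytes_per_param * copies / GIB


for dtype, nbytes in (("fp16/bf16", 2), ("fp32", 4)):
    for copies in (1, 2, 3):
        need = weights_gib(NUM_PARAMS, nbytes, copies)
        verdict = "fits" if need < DEVICE_GIB else "exceeds budget"
        print(f"{dtype:>9} x{copies}: {need:6.1f} GiB ({verdict})")
```

Under these assumptions, one fp32 copy fits but two do not, so a single extra deepcopy is enough to go from running to OOM.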

Hope this information helps with fixing the OOM issues in this repo.

Thanks

Nov 27 '23 07:11 jinsong-mao

dynamobench is owned by the PT2 team. In my understanding, the deepcopy is used for the accuracy check because some models are stateful. cc @desertfire is there a way to turn off deepcopy in dynamobench?
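A small sketch of why a stateful model can make an accuracy check depend on copies. This is an illustration under assumptions, not dynamobench code: `StatefulToyModel` is a hypothetical class whose output depends on how many times it has been called.

```python
import copy


class StatefulToyModel:
    """Hypothetical model whose output changes with every call."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return x + self.calls


# Reusing one instance: the second run sees mutated state,
# so two runs on the same input disagree.
shared = StatefulToyModel()
first = shared(10)
second = shared(10)
print("shared instance:", first, second)

# Running each pass on its own deep copy: both start from the same
# state, so the outputs are comparable and the original is untouched.
pristine = StatefulToyModel()
reference = copy.deepcopy(pristine)(10)
candidate = copy.deepcopy(pristine)(10)
print("separate copies:", reference, candidate)
```

The trade-off is the one raised in this issue: each copy that stays alive costs a full set of weights.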

Nov 27 '23 14:11 xuzhao9
