composer
Improving memory snapshot
What does this PR do?
This PR captures the memory snapshot with _record_memory_history_impl instead of the previously used _record_memory_history_legacy. See https://github.com/pytorch/pytorch/blob/main/torch/cuda/memory.py#L698-L738. With enabled="all", this records all alloc/free events (both C++ and Python) and gives a better memory timeline and more complete stack trace information.
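
For reviewers, here is a minimal sketch of how the newer recording path can be exercised on its own. It is not the code in this PR. It assumes PyTorch 2.1 or newer, where passing a string (rather than a bool) for `enabled` to `torch.cuda.memory._record_memory_history` selects the non-legacy implementation. The helper name `capture_snapshot`, the output path, the `max_entries` value, and the toy workload are all illustrative assumptions.

```python
import importlib.util

# Keyword arguments for the string-based (non-legacy) recording API.
# max_entries is an illustrative value, not one taken from this PR.
HISTORY_KWARGS = {
    "enabled": "all",   # record every alloc/free event
    "context": "all",   # keep stack traces for all recorded events
    "stacks": "all",    # include both Python and C++ frames
    "max_entries": 100_000,
}


def capture_snapshot(path="snapshot.pickle"):
    """Record allocator history around a small workload and dump it to `path`.

    Returns True if a snapshot was written, and False if torch or CUDA
    is not available on this machine.
    """
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    if not torch.cuda.is_available():
        return False

    torch.cuda.memory._record_memory_history(**HISTORY_KWARGS)
    try:
        # Hypothetical workload: a few allocations so the timeline is not empty.
        x = torch.randn(1024, 1024, device="cuda")
        y = x @ x
        del x, y
        torch.cuda.memory._dump_snapshot(path)
    finally:
        # Passing enabled=None stops recording.
        torch.cuda.memory._record_memory_history(enabled=None)
    return True
```

Calling `capture_snapshot("snapshot.pickle")` on a CUDA machine should produce a pickle file that can be loaded into the PyTorch memory visualizer to inspect the timeline and stack traces.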