kokkos-tools

Fix memory HWM to also show how much memory the profiler is taking

Open bathmatt opened this issue 5 years ago • 5 comments

I've been tracking down a memory issue and found that the memory usage tool itself takes most of the memory in a real run. It would be nice to track that overhead and output it, or subtract it from the reported RSS.
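To make the request concrete, here is a rough Python sketch of the idea, not the tool's actual code: read the process peak RSS and report it alongside the profiler's own footprint. The `profiler_bytes` argument is a hypothetical counter that the tool would have to maintain for its internal bookkeeping. Subtracting it from the peak is only an approximation, since the profiler's peak need not coincide with the application's.

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of this process, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


def report_high_water_mark(profiler_bytes):
    """Return (total, profiler, application) byte counts.

    profiler_bytes is a hypothetical value: the memory the profiling
    tool holds for its own bookkeeping, tracked by the tool itself.
    """
    total = peak_rss_bytes()
    # Approximate application share; clamp so it never goes negative.
    application = max(total - profiler_bytes, 0)
    return total, profiler_bytes, application


total, profiler, application = report_high_water_mark(profiler_bytes=0)
print(f"peak RSS: {total} B, profiler: {profiler} B, application: {application} B")
```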

bathmatt avatar Jun 21 '19 12:06 bathmatt

Sounds like you have many small allocations? We can look at that.

nmhamster avatar Jun 21 '19 12:06 nmhamster

Yes, I use Trilinos, and most of them come from that library. There are tons of these in Tpetra, MueLu, and Ifpack:

871.121941   0x2aaaf6aef500              8             Host Allocate   DualView::modified_flags
871.121956   0x2aaaf6aef640              8             Host Allocate   DualView::modified_flags
871.121970   0x2aaaf6aef780              8             Host Allocate   DualView::modified_flags
871.121986   0x2aaaf407dbc0             -8             Host DeAllocate DualView::modified_flags
871.121993   0x2aaaf407e3c0             -8             Host DeAllocate DualView::modified_flags
871.122003   0x2aaaf3b52500             -8             Host DeAllocate DualView::modified_flags
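For anyone who wants to post-process these lines, here is a minimal Python sketch of a parser. The column layout (time, pointer, signed size in bytes, memory space, operation, label) is an assumption inferred from the sample above, and the field names are mine, not the tool's.

```python
import re
from collections import defaultdict

# Assumed layout: time, pointer, signed size, space, operation, label.
EVENT_RE = re.compile(
    r"^\s*(?P<time>\d+\.\d+)\s+(?P<ptr>0x[0-9a-fA-F]+)\s+(?P<size>-?\d+)\s+"
    r"(?P<space>\S+)\s+(?P<op>Allocate|DeAllocate)\s+(?P<label>.*\S)\s*$"
)


def parse_event(line):
    """Parse one event line into a dict, or return None if it does not match."""
    m = EVENT_RE.match(line)
    if m is None:
        return None
    return {
        "time": float(m["time"]),
        "ptr": int(m["ptr"], 16),
        "size": int(m["size"]),  # negative for deallocations
        "space": m["space"],
        "op": m["op"],
        "label": m["label"],
    }


# The six lines quoted above.
sample = """\
871.121941   0x2aaaf6aef500              8             Host Allocate   DualView::modified_flags
871.121956   0x2aaaf6aef640              8             Host Allocate   DualView::modified_flags
871.121970   0x2aaaf6aef780              8             Host Allocate   DualView::modified_flags
871.121986   0x2aaaf407dbc0             -8             Host DeAllocate DualView::modified_flags
871.121993   0x2aaaf407e3c0             -8             Host DeAllocate DualView::modified_flags
871.122003   0x2aaaf3b52500             -8             Host DeAllocate DualView::modified_flags
"""

events = [e for e in map(parse_event, sample.splitlines()) if e]

# Net bytes per label: allocations add, deallocations subtract.
net_bytes = defaultdict(int)
for e in events:
    net_bytes[e["label"]] += e["size"]

print(len(events), dict(net_bytes))
```

Each event here is only 8 bytes, so any fixed per-event bookkeeping cost in a tracking tool can easily exceed the allocation it describes.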

In the last 100k lines of the log there are about 13k deallocations, and about 11k of them are DualView deallocations, mostly modified_flags:

[mbetten@serrano-login3 Bdot]$ tail -100000 ./TestResults.CTS1_MemEvent/BDot.Pressure=0.01.mpi_ranks_per_socket=1.nnodes=8.np=288.refine=0.0.use_np=256/ser7-255931.mem_events |grep DeAll | wc
  13118   78920 1255978
[mbetten@serrano-login3 Bdot]$ tail -100000 ./TestResults.CTS1_MemEvent/BDot.Pressure=0.01.mpi_ranks_per_socket=1.nnodes=8.np=288.refine=0.0.use_np=256/ser7-255931.mem_events |grep DeAll | grep DualView |wc
  11331   67998 1081020

bathmatt avatar Jun 21 '19 13:06 bathmatt

And the bulk of what's left is

Host DeAllocate MV::normImpl lcl

bathmatt avatar Jun 21 '19 14:06 bathmatt

I realized that deallocations are the wrong thing to look at, since the tail of the log is the end of the run. Here are the allocations instead:

[mbetten@serrano-login3 Bdot]$ tail -100000 ./TestResults.CTS1_MemEvent/BDot.Pressure=0.01.mpi_ranks_per_socket=1.nnodes=8.np=288.refine=0.0.use_np=256/ser7-255931.mem_events |grep \ Allo | wc
  10434   62706  990837
[mbetten@serrano-login3 Bdot]$ tail -100000 ./TestResults.CTS1_MemEvent/BDot.Pressure=0.01.mpi_ranks_per_socket=1.nnodes=8.np=288.refine=0.0.use_np=256/ser7-255931.mem_events |grep \ Allo | grep modified_f |wc
   9505   57030  912480
[mbetten@serrano-login3 Bdot]$ tail -100000 ./TestResults.CTS1_MemEvent/BDot.Pressure=0.01.mpi_ranks_per_socket=1.nnodes=8.np=288.refine=0.0.use_np=256/ser7-255931.mem_events |grep \ Allo | grep MV |wc
    826    5074   69856
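The same tally can be done in one pass instead of one `grep` per label. This is a sketch in Python that assumes each event line has six whitespace-separated columns, with the operation in the fifth and the label in the rest. The helper names are hypothetical.

```python
from collections import Counter, deque


def tally_tail(lines, n=100000):
    """Count (operation, label) pairs over the last n lines of an event log."""
    counts = Counter()
    # deque with maxlen keeps only the last n lines, like `tail -n`.
    for line in deque(lines, maxlen=n):
        parts = line.split(None, 5)
        if len(parts) == 6 and parts[4] in ("Allocate", "DeAllocate"):
            counts[(parts[4], parts[5].strip())] += 1
    return counts


def share(counts, op, needle):
    """Fraction of events of type op whose label contains needle."""
    total = sum(c for (o, _), c in counts.items() if o == op)
    hits = sum(c for (o, label), c in counts.items() if o == op and needle in label)
    return hits / total if total else 0.0


# Small in-memory demo using lines quoted earlier in this thread.
sample_lines = [
    "871.121941   0x2aaaf6aef500   8   Host Allocate   DualView::modified_flags\n",
    "871.121956   0x2aaaf6aef640   8   Host Allocate   DualView::modified_flags\n",
    "871.121970   0x2aaaf6aef780   8   Host Allocate   DualView::modified_flags\n",
    "871.121986   0x2aaaf407dbc0  -8   Host DeAllocate DualView::modified_flags\n",
    "871.121993   0x2aaaf407e3c0  -8   Host DeAllocate DualView::modified_flags\n",
    "871.122003   0x2aaaf3b52500  -8   Host DeAllocate DualView::modified_flags\n",
]

counts = tally_tail(sample_lines)
for (op, label), count in counts.most_common():
    print(op, label, count)
```

For a real log, pass an open file object to `tally_tail` instead of the list. Going by the `wc` line counts above, `modified_flags` accounts for 9505 of 10434 allocations (about 91%) and the `MV` labels for 826 (about 8%).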

bathmatt avatar Jun 21 '19 14:06 bathmatt

Possible duplicate of #9.

stanmoore1 avatar Aug 29 '19 18:08 stanmoore1