
irmin-pack: get better insight into memory usage

Open zshipko opened this issue 3 years ago • 3 comments

Currently we are trying to use a combination of Memtrace and maxrss to track memory throughout irmin-pack. However, this does not give us very good insight into memory usage.

  • maxrss always reports the peak resident memory of the process, so the value can go up but never down
  • Memtrace, especially when used in tezos-node, produces output that is hard to decipher because of the large number of unrelated allocations

I think it makes sense to continue working with Memtrace while looking at options other than rusage/maxrss for logging and generating graphs. I will make sure to post any progress in this issue.
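As one possible alternative to maxrss, here is a minimal sketch that reads the current resident set size rather than the peak. It assumes a Linux host, where `/proc/self/status` exposes `VmRSS` (current) and `VmHWM` (peak, comparable to maxrss). The helper names `parse_kb` and `read_status_kb` are made up for this example and are not part of irmin.

```ocaml
(* Parse a line such as "VmRSS:\t    1234 kB" and return the value in kB.
   Returns None when the line does not start with [key] or is malformed. *)
let parse_kb ~key line =
  let prefix = key ^ ":" in
  let plen = String.length prefix in
  if String.length line >= plen && String.sub line 0 plen = prefix then
    let rest = String.sub line plen (String.length line - plen) in
    try Scanf.sscanf rest " %d kB" (fun n -> Some n)
    with Scanf.Scan_failure _ | Failure _ | End_of_file -> None
  else None

(* Scan /proc/self/status (Linux only) for the first line matching [key].
   Returns None on platforms where the file does not exist. *)
let read_status_kb ~key =
  match open_in "/proc/self/status" with
  | exception Sys_error _ -> None
  | ic ->
    let rec loop () =
      match input_line ic with
      | exception End_of_file -> None
      | line ->
        (match parse_kb ~key line with
         | Some _ as v -> v
         | None -> loop ())
    in
    let result = loop () in
    close_in ic;
    result

let () =
  match read_status_kb ~key:"VmRSS", read_status_kb ~key:"VmHWM" with
  | Some rss, Some hwm ->
    Printf.printf "current RSS: %d kB, peak RSS: %d kB\n" rss hwm
  | _ -> print_endline "/proc/self/status not available on this platform"
```

Logging `VmRSS` at intervals would give a curve that can go down as well as up, unlike the peak value.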

zshipko, Dec 09 '20 00:12

Another nice use of Memtrace is to script it to filter allocations by the functions that made them. Maybe plot allocations from index against the overall allocations? (There is an example in their blog post and another here, which I used to compare two traces.)

icristescu, Dec 09 '20 10:12

Using Gc I was able to get some information about allocations over time (compared here to maxrss, shown in orange). I was interested in understanding how the upper index log_size (in Irmin_pack_layers.Repo.unsafe_v_upper) affects memory and performance. I'm still trying to figure out why the maximum allocations lag behind the current allocations at certain points.
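For readers who want to reproduce this kind of measurement, here is a minimal sketch of sampling the heap with the standard `Gc` module. The comment does not say which `Gc` fields were plotted, so using `heap_words` (current) and `top_heap_words` (maximum so far) is an assumption, as are the `sample` type and the CSV layout.

```ocaml
(* One measurement point. Field names are illustrative only. *)
type sample = {
  elapsed_s : float;   (* processor time in seconds, from Sys.time *)
  heap_mb : float;     (* current major heap size *)
  top_heap_mb : float; (* largest major heap size reached so far *)
}

(* Convert a size in machine words to mebibytes. *)
let words_to_mb words =
  float_of_int words *. float_of_int (Sys.word_size / 8) /. (1024. *. 1024.)

(* Gc.quick_stat avoids a full heap traversal, so it is cheap to call often. *)
let take_sample () =
  let s = Gc.quick_stat () in
  { elapsed_s = Sys.time ();
    heap_mb = words_to_mb s.Gc.heap_words;
    top_heap_mb = words_to_mb s.Gc.top_heap_words }

let to_csv_line s =
  Printf.sprintf "%.3f,%.2f,%.2f" s.elapsed_s s.heap_mb s.top_heap_mb

let () =
  print_endline "elapsed_s,heap_mb,top_heap_mb";
  let keep = ref [] in
  for i = 1 to 5 do
    (* Stand-in workload: allocate and retain some memory between samples. *)
    keep := Array.make 100_000 i :: !keep;
    print_endline (to_csv_line (take_sample ()))
  done;
  ignore (Sys.opaque_identity !keep)
```

In a real benchmark the sampling call would sit in the commit loop or a periodic timer, and the CSV could be plotted next to the maxrss curve.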

master branch: 61440 commits completed in 540.92s. [0.009s per commit, 114 commits per second] (graph: memory_master)

master branch with increased upper index log size: 61440 commits completed in 443.92s. [0.007s per commit, 138 commits per second] (graph: memory_master_16x_log_size)

samoht/cancel branch: 61440 commits completed in 550.17s. [0.009s per commit, 112 commits per second] (graph: memory_cancel)

samoht/cancel branch with increased upper index log size: 61440 commits completed in 423.28s. [0.007s per commit, 145 commits per second] (graph: memory_cancel_16x_log_size)


Using this Memtrace filter I was looking at irmin allocations compared to index allocations. It looks like index is allocating significantly more than irmin - is this expected? If not, it's very possible there is an error in the Memtrace filter.

master (graph: irmin_master)

master with increased upper index log size (graph: irmin_master_16x_log_size)

samoht/cancel (graph: irmin_cancel)

samoht/cancel with increased upper index log size (graph: irmin_cancel_16x_log_size)


My thought after looking at these is that we definitely need to provide additional parameters that can be used to tune memory usage in irmin-pack, since there is no way we can find an optimal value for every possible use case.

Are there any other known internal parameters that could be exposed to the user but aren't? Or is there a reason why we don't want to take that approach?

zshipko, Dec 11 '20 03:12

To provide an update here: the Memtrace filter used above wasn't taking into account the calls from Irmin_pack to functions in Index_unix.Raw. The fixed graph for master looks like this: (graph: fixed-irmin-index-comparison)
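To make the attribution problem concrete, here is a small sketch of one way such a rule could work. It does not use the Memtrace API: a backtrace is modelled as a plain list of function names, innermost frame first, and every function name in the example data is hypothetical. The rule itself (skip `Index_unix.Raw` frames and attribute the allocation to the first caller that belongs to either library) is one reading of the fix described above, not the actual filter.

```ocaml
type owner = Irmin | Index | Other

let has_prefix ~prefix s =
  let n = String.length prefix in
  String.length s >= n && String.sub s 0 n = prefix

(* Walk the backtrace from the innermost frame outwards.
   Assumed rule: Index_unix.Raw frames are shared I/O helpers, so they are
   skipped and the allocation is charged to whichever library called them. *)
let rec classify = function
  | [] -> Other
  | frame :: callers ->
    if has_prefix ~prefix:"Index_unix.Raw." frame then classify callers
    else if has_prefix ~prefix:"Irmin" frame then Irmin
    else if has_prefix ~prefix:"Index" frame then Index
    else classify callers

(* Sum allocated bytes per owner; returns (irmin, index, other). *)
let tally allocs =
  List.fold_left
    (fun (irmin, index, other) (bytes, backtrace) ->
      match classify backtrace with
      | Irmin -> (irmin + bytes, index, other)
      | Index -> (irmin, index + bytes, other)
      | Other -> (irmin, index, other + bytes))
    (0, 0, 0) allocs

(* Hypothetical allocations: (size in bytes, backtrace). *)
let example =
  [ (64, [ "Index_unix.Raw.unsafe_write"; "Irmin_pack.IO.append" ]);
    (32, [ "Index_unix.Raw.unsafe_write"; "Index_unix.IO.append" ]);
    (16, [ "Stdlib.Bytes.create"; "Index.Make.replace" ]);
    (8, [ "Stdlib.List.map" ]) ]

let () =
  let irmin, index, other = tally example in
  Printf.printf "irmin=%d index=%d other=%d\n" irmin index other
```

A filter that only looked at the innermost library frame would charge the first allocation to index, which is the kind of skew the corrected graph removes.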

zshipko, Dec 16 '20 00:12