irmin
irmin-pack: get better insight into memory usage
Currently we are trying to use a combination of Memtrace and maxrss to track memory throughout irmin-pack; however, this does not actually give us very good insight into memory usage.
- maxrss always returns the maximum resident memory for the process, so the value can go up but never down
- Memtrace, especially when used in tezos-node, provides output that is hard to decipher due to the large number of unrelated allocations
I think it makes sense to continue working with Memtrace while looking at options other than rusage/maxrss for logging and generating graphs. I will make sure to post any progress in this issue.
Another nice usage of Memtrace is to use it in scripts to filter allocations based on which functions made them: maybe plot allocations from index vs the overall allocations? (There is an example in their blog post and another here, which I used to compare two traces.)
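To make the filtering idea concrete, here is a minimal sketch of attributing allocations to a library by inspecting backtraces. It does not use the Memtrace reader API: the alloc record, the sample data and the function names in the backtraces are all made up for illustration, and a real script would fill such records from a trace file.

```ocaml
(* Hypothetical, simplified view of one sampled allocation: its size in
   words and its backtrace as function names, innermost frame first.
   This is NOT the Memtrace reader API, only a stand-in for the logic. *)
type alloc = { words : int; backtrace : string list }

let has_prefix ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* An allocation is attributed to [prefix] if any frame belongs to it. *)
let made_by ~prefix a = List.exists (has_prefix ~prefix) a.backtrace

let total_words allocs = List.fold_left (fun acc a -> acc + a.words) 0 allocs

let words_from ~prefix allocs =
  total_words (List.filter (made_by ~prefix) allocs)

(* Made-up sample data; the function names are illustrative only. *)
let sample_allocs =
  [
    { words = 16; backtrace = [ "Index.Make.replace"; "Irmin_pack.Pack.append" ] };
    { words = 8; backtrace = [ "Irmin.Tree.add" ] };
    { words = 32; backtrace = [ "Index.Make.find"; "Irmin_pack.Pack.find" ] };
  ]

let () =
  Printf.printf "index: %d words / total: %d words\n"
    (words_from ~prefix:"Index." sample_allocs)
    (total_words sample_allocs)
```

Matching on any frame versus only the innermost frame changes which library an allocation is counted under, so the choice needs to be checked against what the graph is meant to show.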
Using Gc I was able to get some information about allocations over time (compared here to maxrss, shown in orange). I was interested in understanding how the upper index log_size (in Irmin_pack_layers.Repo.unsafe_v_upper) affects memory/performance. I'm still trying to figure out why the maximum allocations lag behind the current allocations at certain points.
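For anyone who wants to reproduce this kind of graph, a minimal sketch of sampling the OCaml heap over time with the standard Gc module could look like the following. Which Gc fields were plotted above is not stated, so using heap_words for the current size and top_heap_words for the maximum is an assumption; the workload and the CSV layout are also just illustrative.

```ocaml
(* One data point: current and peak major-heap size, in words. *)
type sample = { current : int; peak : int }

let take_sample () =
  (* quick_stat avoids the full heap traversal that Gc.stat performs. *)
  let s = Gc.quick_stat () in
  { current = s.Gc.heap_words; peak = s.Gc.top_heap_words }

(* Run [f 0] .. [f (n - 1)] and take one sample after each call. *)
let sample_during ~n f =
  let acc = ref [] in
  for i = 0 to n - 1 do
    f i;
    acc := take_sample () :: !acc
  done;
  List.rev !acc

(* CSV lines that can be fed to a plotting tool. *)
let to_csv samples =
  "step,heap_words,top_heap_words"
  :: List.mapi
       (fun i s -> Printf.sprintf "%d,%d,%d" i s.current s.peak)
       samples

let () =
  (* Illustrative workload: keep some arrays alive so the heap grows. *)
  let keep = ref [] in
  let samples =
    sample_during ~n:5 (fun i -> keep := Array.make 100_000 i :: !keep)
  in
  List.iter print_endline (to_csv samples)
```

Unlike maxrss, the current heap size can go down after a compaction. These counters only cover the OCaml heap, though, so memory allocated outside it (for example bigarrays) does not show up.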
master branch
61440 commits completed in 540.92s.
[0.009s per commit, 114 commits per second]
master branch with increased upper index log size
61440 commits completed in 443.92s.
[0.007s per commit, 138 commits per second]
samoht/cancel branch
61440 commits completed in 550.17s.
[0.009s per commit, 112 commits per second]
samoht/cancel branch with increased upper index log size
61440 commits completed in 423.28s.
[0.007s per commit, 145 commits per second]
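To make the comparison between runs explicit, the throughput and relative speedup can be recomputed from the timings above. This is only arithmetic on the reported numbers; the labels and helper names are my own.

```ocaml
let commits = 61440

(* (label, wall-clock seconds), copied from the benchmark output above. *)
let runs =
  [
    ("master", 540.92);
    ("master, larger log_size", 443.92);
    ("samoht/cancel", 550.17);
    ("samoht/cancel, larger log_size", 423.28);
  ]

let commits_per_second secs = float_of_int commits /. secs

(* A ratio above 1.0 means [after] finished faster than [before]. *)
let speedup ~before ~after = before /. after

let () =
  List.iter
    (fun (label, secs) ->
      Printf.printf "%-32s %.0f commits/s\n" label (commits_per_second secs))
    runs;
  Printf.printf "master speedup: %.2fx\n"
    (speedup ~before:540.92 ~after:443.92);
  Printf.printf "samoht/cancel speedup: %.2fx\n"
    (speedup ~before:550.17 ~after:423.28)
```

On these numbers, increasing the upper index log size gives roughly a 1.22x speedup on master and 1.30x on samoht/cancel.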
Using this Memtrace filter I was looking at irmin allocations compared to index allocations. It looks like index is allocating significantly more than irmin - is this expected? If not, it's very possible there is an error in the Memtrace filter.
master
master with increased upper index log size
samoht/cancel
samoht/cancel with increased upper index log size
My thought after looking at these is that we definitely need to provide additional parameters that can be used to tune memory usage in irmin-pack; there is no way we can find an optimal value for every possible use case.
Are there any other known internal parameters that could be exposed to the user but aren't? Or is there a reason why we don't want to take that approach?
To provide an update here - the Memtrace filter used above wasn't taking into account the calls from Irmin_pack to functions in Index_unix.Raw. The fixed graph for master looks like this: