FlameGraphs.jl icon indicating copy to clipboard operation
FlameGraphs.jl copied to clipboard

Differential flamegraphs

Open timholy opened this issue 5 years ago • 1 comments

Some other tools allow one to compare performance before & after a code change: http://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html. This might be a worthy addition. The "logic" should go in this package, with visualizations scattered to other packages.

Perhaps not so simple as to be deserving of a "good first issue" label, but presumably this would not be terribly complicated.

timholy avatar Jan 23 '20 03:01 timholy

Hello, I am in need of advice. I get to this issue from https://github.com/davidanthoff/ProfileVega.jl/issues/5 and really liked the idea. Unfortunately I was struggling with this issue with no luck at all.

My setup was the following

function f1(n)
    res = 0
    for _ in 1:n
        res += sum(rand(10_000))
    end
    
    for _ in 1:n
        res -= sum(rand(10_000))
    end
    res
end

function f2(n)
    res = 0
    for _ in 1:n
        res += sum(rand(20_000))
    end
    
    for _ in 1:n
        res -= sum(rand(5_000))
    end
    res
end

f1(1)
f2(1)
Profile.clear()
@profile f1(10000)
baseline = Profile.fetch()

Profile.clear()
@profile f2(10000)
target = Profile.fetch()

Initially I thought, that I can just build dictionary from baseline of the kind sf => length(span), but this idea turns out to be bad, because there are multiple entries with the same sf.

Secondly, I've tried to match nodes between target and baseline on the same level using largest common sequence approach. That was almost good, but then I run into the problem, that different runs may generate different root nodes, for example f1(::Int64) at REPL[20]:4 and f1(::Int64) at REPL[21]:4, so this algorithm breaks at the start.

I was thinking about anonymizing this kind of nodes "f1(::Int64) at REPL[20]:4" => ":4", but it has drawbacks

  1. Even slightest changes to the function (like inserting print in the beginning) will destroy comparison base.
  2. Other tools generate different versions of this nodes, for example jupyter do something like f1(::Int64) at In[20]:4, so all versions should be anonymized and it looks complicated.

In order to solve first problem, I had an idea of look forward, i.e. if REPL node is met, than it should be skipped for one turn, all childs of all REPL nodes of the next level should be concatenated together and LCS procedure should be applied. After all nodes is matched, procedure will go one step backward and match REPL nodes between base and target according to the match of their children.

Problem is, it looks like overly complicated solution and I am afraid that I am missing something simple. Thank you in advance!

Arkoniak avatar Jan 30 '20 10:01 Arkoniak