BenchmarkTools.jl
Dict -> OrderedDict in BenchmarkGroup?
I don't know if anyone else feels the same, but it is a little annoying to me that when I display the results of some benchmarks, they are not shown in insertion order. As an example, here it is visually difficult for me to compare the performance of the different graph types in Erdos because of the lack of ordering:
julia> @show res["generators"];
res["generators"] = 16-element BenchmarkTools.BenchmarkGroup:
tags: []
("rrg","Net(500, 750) with [] graph, [] vertex, [] edge properties.") => Trial(632.408 μs)
("rrg","Graph{Int64}(100, 150)") => Trial(121.221 μs)
("rrg","Net(100, 150) with [] graph, [] vertex, [] edge properties.") => Trial(115.033 μs)
("rrg","Graph{Int64}(500, 750)") => Trial(677.647 μs)
("complete","Net(100, 4950) with [] graph, [] vertex, [] edge properties.") => Trial(896.223 μs)
("complete","DiGraph{Int64}(100, 9900)") => Trial(617.122 μs)
("complete","DiNet(20, 380) with [] graph, [] vertex, [] edge properties.") => Trial(42.104 μs)
("erdos","Graph{Int64}(500, 1500)") => Trial(405.240 μs)
("erdos","Net(100, 300) with [] graph, [] vertex, [] edge properties.") => Trial(71.516 μs)
("complete","DiGraph{Int64}(20, 380)") => Trial(23.721 μs)
("complete","Net(20, 190) with [] graph, [] vertex, [] edge properties.") => Trial(20.845 μs)
("complete","Graph{Int64}(100, 4950)") => Trial(159.900 μs)
("complete","DiNet(100, 9900) with [] graph, [] vertex, [] edge properties.") => Trial(1.861 ms)
("erdos","Net(500, 1500) with [] graph, [] vertex, [] edge properties.") => Trial(297.167 μs)
("complete","Graph{Int64}(20, 190)") => Trial(7.340 μs)
("erdos","Graph{Int64}(100, 300)") => Trial(88.091 μs)
Would it be reasonable and not too disruptive to use OrderedDicts instead of Dicts in the BenchmarkGroup type?
Yes, I could write down some more appropriate comparison methods, but asking doesn't hurt :)
Cheers, Carlo
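In the meantime, a sorted view can be produced by hand. A minimal sketch, using a plain Dict as a stand-in for the Dict that backs a BenchmarkGroup (the keys and timings are copied from the output above; the values are strings rather than real Trial objects):

```julia
# Stand-in for the Dict inside a BenchmarkGroup; values are plain strings
# here, not actual BenchmarkTools.Trial objects.
group = Dict(
    ("rrg", "Graph{Int64}(100, 150)")      => "Trial(121.221 μs)",
    ("complete", "Graph{Int64}(20, 190)")  => "Trial(7.340 μs)",
    ("erdos", "Graph{Int64}(100, 300)")    => "Trial(88.091 μs)",
)

# Sorting the collected keys groups entries by generator name, which makes
# the graph types directly comparable even without an OrderedDict.
for k in sort!(collect(keys(group)))
    println(k, " => ", group[k])
end
```

Tuples of strings sort lexicographically in Julia, so all "complete" entries come first, then "erdos", then "rrg".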
@shashi has also requested this at one point, and I agree it would be nice.
The main barrier is the need to ensure BenchmarkTools is as forward-compatible as possible for testing performance changes to Julia Base, which means being very careful about what BenchmarkTools takes on as a dependency. If we did end up making this change, rolling our own OrderedDict might be better than pulling in a larger dependency like DataStructures.
At that point the question becomes: is ensuring that display order matches insertion order worth the development burden of maintaining an OrderedDict implementation (or otherwise adopting a new dependency)? Personally, my answer leans towards "no", but if the wider community keeps expressing a desire for this feature then I might be swayed.
In a lot of cases, including the example you gave, it seems like more explicit organization via subgrouping can help solve these kinds of problems. For example, it looks like "erdos", "complete", and "rrg" could all be their own subgroups.
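Sketched with BenchmarkTools, that subgrouping suggestion might look like the following; the generator names mirror the example above, and the `@benchmarkable` body is just a placeholder, not the real Erdos generator call:

```julia
using BenchmarkTools

suite = BenchmarkGroup()
suite["generators"] = BenchmarkGroup()

# One subgroup per generator, so related graph types display together.
for gen in ("erdos", "complete", "rrg")
    suite["generators"][gen] = BenchmarkGroup()
end

# Placeholder benchmark; a real suite would exercise the actual generators.
suite["generators"]["erdos"]["Graph{Int64}(100, 300)"] = @benchmarkable rand(100)
```

With this layout, displaying `suite["generators"]` shows three subgroups rather than one flat, unordered list of sixteen entries.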
> In a lot of cases, including the example you gave, it seems like more explicit organization via subgrouping can help solve these kinds of problems. For example, it looks like "erdos", "complete", and "rrg" could all be their own subgroups.
I haven't tried again lately, but as far as I can remember having subgroups is not ideal either, because it means you have to go one step down in the hierarchy to see the benchmark times. Maybe something can be done in this regard?
I agree that the solutions could be more problematic than this very minor problem, so we can leave this issue open just to gauge interest in its resolution.
> because having subgroups means you have to go one step down in the hierarchy to have the benchmark times displayed.
The default behavior is to only show the first level, but you can use showall to display the whole thing.
+1 to this. I am making some benchmarks of different system sizes, and it is a bit annoying to make sense of this:
"EV" => 6-element BenchmarkTools.BenchmarkGroup:
tags: []
"L = 8" => Trial(23.448 μs)
"L = 16" => Trial(46.961 μs)
"L = 10" => Trial(29.194 μs)
"L = 14" => Trial(40.954 μs)
"L = 6" => Trial(17.881 μs)
"L = 12" => Trial(35.076 μs)
I'm finding myself back here with the same wish I had back in September. I believe this would be a really useful feature. Now that Julia has reached 1.0, I don't think the maintenance overhead of having an OrderedDict in this package would be a huge burden. But #60 managed to solve this problem without even introducing it as a package dependency (thanks @shashi), so such drastic measures don't seem necessary. What needs to happen to get this merged (aside from resolving conflicts and updating to Pkg3)? Failing that, even a recipe that outputs all benchmarks in lexicographic order would be most welcome.