antlr4 icon indicating copy to clipboard operation
antlr4 copied to clipboard

Avoid usage of expensive FlexibleHashMap

Open dreis2211 opened this issue 2 years ago • 5 comments

Hi 👋

I was profiling checkstyle the other day and noticed that the usage of PredictionMode.getConflictingAltSubsets is responsible for almost ~13% of all CPU frames inside my async-profiler flamegraphs. image

When zooming in this can be largely traced back to the usage of FlexibleHashMap image

From the comments I couldn't find the exact reason for this Map, but it seems to be used in conjunction with ...EqualityComparator to customize hash-codes.

Imho, the same can be achieved by using a normal HashMap and wrapping the key as proposed in this PR. By doing so and building checkstyle locally, I get the following improvements.

Old

       14,41 real        24,28 user         1,27 sys

New

       12,68 real        20,99 user         1,05 sys

And the flamegraphs look also cleaner image

Let me know if you think this is a viable solution.

Cheers, Christoph

dreis2211 avatar Mar 09 '23 08:03 dreis2211

hi! Sorry but we can't possibly change that critical algorithm at this point. I don't have time to analyze your change, but this is part of the secret sauce where things have to hash to the same set in a very special way. Differences from this approach in the other targets have consistently been misunderstandings and bugs. Thank you for the contribution, but I will have to close.

parrt avatar Mar 09 '23 16:03 parrt

@parrt this PR is not changing the algorithm, it's the same hash coding. Rather it's using a different strategy for providing custom hashCode/equals. I think it's worth considering.

ericvergnaud avatar Mar 09 '23 16:03 ericvergnaud

Ah. Reopening. I didn't have time to look in depth... So I will leave for a different day

parrt avatar Mar 09 '23 16:03 parrt