Avoid recreating EquivalenceKeys for usual suspects
Also avoid using Objects.hash() in these critical objects as it is allocating an array. It is especially useless when computing only one element but even if not it's worth avoiding the allocations. FWIW, we got rid of it in HV a long time ago as it proved to be harmful.
@Ladicek this is to start the discussion for now.
One question I had: when creating the index, do we reuse the same instance of DotName for String, Long... And if so, I was wondering if we could reference them somehow as I would expect equals() to be faster when most of the time we have the exact same instance (except if crafted manually and not coming from the index).
Yeah, in the persistent index, all DotNames are componentized and shared, to save as much memory as possible.
Also, this looks good, but as you mentioned it's just to start the discussion, I'm not merging it yet. When you feel like this is mergeable, let me know.
Yeah, in the persistent index, all DotNames are componentized and shared, to save as much memory as possible.
@Ladicek I was wondering if there could be a way to make sure we use the instance from DotName for these particular types.
@Ladicek I pushed an additional commit to cache all the ClassTypeEquivalenceKey. But I'm not sure if we should keep the cache around or if it should be tied to the overlay lifecycle.
Ideas welcome :)
I guess it should be possible to have a cache for equivalence keys in the annotation overlay. I don't think it's a good idea to keep one in the ClassEquivalenceKey itself; it probably isn't a big deal, but feels wrong. Let me think about it.
it probably isn't a big deal, but feels wrong. Let me think about it
Yeah, I agree. I'm not a big fan of static CHM looking forward to a memory leak...
Now, I'm not sure it's worth caching all the equivalence keys. In my flame graph, I have only see the primitive one and the class type one.
I'm pretty sure there's a relatively small number of class types that are used most often and a long tail of class types that are used relatively infrequently. An actual cache with the size of, say, 100 or 1000 elements should cover it well, I think.
On the huge code base I generated with mostly entities, repositories and REST resources, I have for ClassTypeEquivalenceKey only:
- 1,447,429 instances of
ClassTypeEquivalenceKeycreated - 20,996 unique elements
- The 30 most often created ones are representing 2/3 (for instance
336261 CLASS: java.lang.Long) - But even less present classes such as resource classes can be present 40+ times as we create a new one equivalence key for each method
Keep in mind that the code is generated so mostly identical stuff. But I'm pretty sure in a real life code base, you also have a lot of the same classes used as parameters and return types.
Also one thing that is IMHO important: in the case of ClassTypeEquivalenceKey, the key is extremely simple and optimized, it's just the class name.
I'm not sure this optimization would be as beneficial for cache entries for which the key is a lot more complex.
Note that we still have the option of only using the first commit, which might be a good compromise.
I've submitted #540, which is roughly the 1st commit + the 2nd commit limited to java.* types. This feels like a reasonable first step -- we can intern java.* class types safely, because there's a bounded number of them.
I'm not going to close this, because there's more work to do for sure.