jandex Avoid recreating EquivalenceKeys for usual suspects

Also avoid using Objects.hash() in these critical objects as it is allocating an array. It is especially useless when computing only one element but even if not it's worth avoiding the allocations. FWIW, we got rid of it in HV a long time ago as it proved to be harmful.

@Ladicek this is to start the discussion for now.

One question I had: when creating the index, do we reuse the same instance of DotName for String, Long... And if so, I was wondering if we could reference them somehow as I would expect equals() to be faster when most of the time we have the exact same instance (except if crafted manually and not coming from the index).

Jun 03 '25 13:06 gsmet

Yeah, in the persistent index, all DotNames are componentized and shared, to save as much memory as possible.

Jun 03 '25 13:06 Ladicek

Also, this looks good, but as you mentioned it's just to start the discussion, I'm not merging it yet. When you feel like this is mergeable, let me know.

Jun 03 '25 13:06 Ladicek

Yeah, in the persistent index, all DotNames are componentized and shared, to save as much memory as possible.

@Ladicek I was wondering if there could be a way to make sure we use the instance from DotName for these particular types.

Jun 03 '25 14:06 gsmet

@Ladicek I pushed an additional commit to cache all the ClassTypeEquivalenceKey. But I'm not sure if we should keep the cache around or if it should be tied to the overlay lifecycle.

Ideas welcome :)

Jun 03 '25 15:06 gsmet

I guess it should be possible to have a cache for equivalence keys in the annotation overlay. I don't think it's a good idea to keep one in the ClassEquivalenceKey itself; it probably isn't a big deal, but feels wrong. Let me think about it.

Jun 03 '25 15:06 Ladicek

it probably isn't a big deal, but feels wrong. Let me think about it

Yeah, I agree. I'm not a big fan of static CHM looking forward to a memory leak...

Now, I'm not sure it's worth caching all the equivalence keys. In my flame graph, I have only see the primitive one and the class type one.

Jun 03 '25 15:06 gsmet

I'm pretty sure there's a relatively small number of class types that are used most often and a long tail of class types that are used relatively infrequently. An actual cache with the size of, say, 100 or 1000 elements should cover it well, I think.

Jun 03 '25 15:06 Ladicek

On the huge code base I generated with mostly entities, repositories and REST resources, I have for ClassTypeEquivalenceKey only:

1,447,429 instances of ClassTypeEquivalenceKey created
20,996 unique elements
The 30 most often created ones are representing 2/3 (for instance 336261 CLASS: java.lang.Long)
But even less present classes such as resource classes can be present 40+ times as we create a new one equivalence key for each method

Keep in mind that the code is generated so mostly identical stuff. But I'm pretty sure in a real life code base, you also have a lot of the same classes used as parameters and return types.

Jun 03 '25 15:06 gsmet

Also one thing that is IMHO important: in the case of ClassTypeEquivalenceKey, the key is extremely simple and optimized, it's just the class name.

I'm not sure this optimization would be as beneficial for cache entries for which the key is a lot more complex.

Jun 03 '25 16:06 gsmet

Note that we still have the option of only using the first commit, which might be a good compromise.

Jun 03 '25 16:06 gsmet

I've submitted #540, which is roughly the 1st commit + the 2nd commit limited to java.* types. This feels like a reasonable first step -- we can intern java.* class types safely, because there's a bounded number of them.

I'm not going to close this, because there's more work to do for sure.

Jul 03 '25 14:07 Ladicek