jandex icon indicating copy to clipboard operation
jandex copied to clipboard

Avoid recreating EquivalenceKeys for usual suspects

Open gsmet opened this issue 6 months ago • 10 comments

Also avoid using Objects.hash() in these critical objects as it is allocating an array. It is especially useless when computing only one element but even if not it's worth avoiding the allocations. FWIW, we got rid of it in HV a long time ago as it proved to be harmful.

@Ladicek this is to start the discussion for now.

One question I had: when creating the index, do we reuse the same instance of DotName for String, Long... And if so, I was wondering if we could reference them somehow as I would expect equals() to be faster when most of the time we have the exact same instance (except if crafted manually and not coming from the index).

gsmet avatar Jun 03 '25 13:06 gsmet

Yeah, in the persistent index, all DotNames are componentized and shared, to save as much memory as possible.

Ladicek avatar Jun 03 '25 13:06 Ladicek

Also, this looks good, but as you mentioned it's just to start the discussion, I'm not merging it yet. When you feel like this is mergeable, let me know.

Ladicek avatar Jun 03 '25 13:06 Ladicek

Yeah, in the persistent index, all DotNames are componentized and shared, to save as much memory as possible.

@Ladicek I was wondering if there could be a way to make sure we use the instance from DotName for these particular types.

gsmet avatar Jun 03 '25 14:06 gsmet

@Ladicek I pushed an additional commit to cache all the ClassTypeEquivalenceKey. But I'm not sure if we should keep the cache around or if it should be tied to the overlay lifecycle.

Ideas welcome :)

gsmet avatar Jun 03 '25 15:06 gsmet

I guess it should be possible to have a cache for equivalence keys in the annotation overlay. I don't think it's a good idea to keep one in the ClassEquivalenceKey itself; it probably isn't a big deal, but feels wrong. Let me think about it.

Ladicek avatar Jun 03 '25 15:06 Ladicek

it probably isn't a big deal, but feels wrong. Let me think about it

Yeah, I agree. I'm not a big fan of static CHM looking forward to a memory leak...

Now, I'm not sure it's worth caching all the equivalence keys. In my flame graph, I have only see the primitive one and the class type one.

gsmet avatar Jun 03 '25 15:06 gsmet

I'm pretty sure there's a relatively small number of class types that are used most often and a long tail of class types that are used relatively infrequently. An actual cache with the size of, say, 100 or 1000 elements should cover it well, I think.

Ladicek avatar Jun 03 '25 15:06 Ladicek

On the huge code base I generated with mostly entities, repositories and REST resources, I have for ClassTypeEquivalenceKey only:

  • 1,447,429 instances of ClassTypeEquivalenceKey created
  • 20,996 unique elements
  • The 30 most often created ones are representing 2/3 (for instance 336261 CLASS: java.lang.Long)
  • But even less present classes such as resource classes can be present 40+ times as we create a new one equivalence key for each method

Keep in mind that the code is generated so mostly identical stuff. But I'm pretty sure in a real life code base, you also have a lot of the same classes used as parameters and return types.

gsmet avatar Jun 03 '25 15:06 gsmet

Also one thing that is IMHO important: in the case of ClassTypeEquivalenceKey, the key is extremely simple and optimized, it's just the class name.

I'm not sure this optimization would be as beneficial for cache entries for which the key is a lot more complex.

gsmet avatar Jun 03 '25 16:06 gsmet

Note that we still have the option of only using the first commit, which might be a good compromise.

gsmet avatar Jun 03 '25 16:06 gsmet

I've submitted #540, which is roughly the 1st commit + the 2nd commit limited to java.* types. This feels like a reasonable first step -- we can intern java.* class types safely, because there's a bounded number of them.

I'm not going to close this, because there's more work to do for sure.

Ladicek avatar Jul 03 '25 14:07 Ladicek