quarkus icon indicating copy to clipboard operation
quarkus copied to clipboard

Avoid loading the list of ArC removed beans into the heap

Open Sanne opened this issue 3 years ago • 9 comments

Description

When looking at heap dumps of a Quarkus application running in production mode, the retained memory from io.quarkus.arc.impl.ArcContainerImpl#removedBeans stands out as being a little high.

Considering this information is only needed to throw a more helpful error message if there's ever an error caused by a missing bean, it would be nice to load this information lazily.

Implementation ideas

Essentially we think it should be possible to serialize this information during the build, dump it into a resource, and then only load this if ever such an error is triggered.

Sanne avatar Jul 02 '21 12:07 Sanne

/cc @manovotn, @mkouba

quarkus-bot[bot] avatar Jul 02 '21 12:07 quarkus-bot[bot]

P.S. we need to think how this impacts native-image: serialization might be an issue, but also I wouldn't want the classes which represent the "removed types" to still be part of the code which can not be DCE'd because of Serialization needing to being able to load it. Perhaps we need a custom encoding, or just represent them as strings and do some more effort to figure out the matching ones w/o having the actual types loaded.

Sanne avatar Jul 02 '21 12:07 Sanne

For the sake of completeness, it is possible to remove this information via quarkus.arc.detect-unused-false-positives=false.

Perhaps we need a custom encoding, or just represent them as strings and do some more effort to figure out the matching ones w/o having the actual types loaded.

Hm, this might be tough although not impossible. Take for example the parameterized types and the complexity of CDI type-safe resolution rules...

mkouba avatar Jul 02 '21 12:07 mkouba

FTR after we merged the micro optimizations from the https://github.com/quarkusio/quarkus/pull/18567 the resulting set of removed beans will usually take a few tens of kilobytes of heap (8 KB for the hibernate-orm-quickstart) which is probably acceptable in most cases. For others, quarkus.arc.detect-unused-false-positives=false should help.

mkouba avatar Jul 20 '21 08:07 mkouba

Should we close this given ^ ?

geoand avatar Nov 09 '21 08:11 geoand

Well, we can still do better and save few tens of kilobytes of heap ;-). I'd like to keep this issue open but the priority is very low.

mkouba avatar Nov 09 '21 10:11 mkouba

So I managed to load the removed beans lazily but then realized that many extensions unfortunately use the "try-to-get a bean instance programmatically" pattern during initialization (and handling the potential null return value) which effectively means the removed beans are always loaded when the app starts. A typical example is https://github.com/quarkusio/quarkus/blob/main/extensions/resteasy-reactive/quarkus-resteasy-reactive/runtime/src/main/java/io/quarkus/resteasy/reactive/server/runtime/ResteasyReactiveRecorder.java#L152.

We would have to add something like ArcContainer#isBeanResolvable() and rewrite a lot of stuff to make it useful...

mkouba avatar Sep 20 '22 12:09 mkouba

As far as I am concerned, I'd definitely rewrite stuff I have touched in order to take advantage of this.

geoand avatar Sep 20 '22 13:09 geoand

Hm, it seems that we don't log a warning if at least one bean matches. So maybe it would be more meaningful to only emit a warning for CDI.current().select(Foo.class).get() and Arc.container().instance(Foo.class).get(), i.e. for cases where exactly one bean is expected and obtained, and not for CDI.current().select(Foo.class), Arc.container().listAll() etc.

mkouba avatar Sep 20 '22 16:09 mkouba

@mkouba I see you updated the example you mentioned above. But I assume your comment about us needing to rewrite the rest of the instances still applies?

geoand avatar Sep 23 '22 05:09 geoand

@mkouba I see you updated the example you mentioned above. But I assume your comment about us needing to rewrite the rest of the instances still applies?

Well, any occurence of Instance#get() , ArcContainer#instance() and BeanContainer#instance() will trigger loading of the removed beans iff no bean is found. So basically we would need any extension that attempts to obtain a single bean instance that is potentially not available to use something like this: https://github.com/quarkusio/quarkus/blob/main/extensions/resteasy-reactive/quarkus-resteasy-reactive/runtime/src/main/java/io/quarkus/resteasy/reactive/server/runtime/ResteasyReactiveRecorder.java#L152-L153, i.e. test the availability of the bean before trying to obtain the instance.

mkouba avatar Sep 23 '22 06:09 mkouba

Great, thanks

geoand avatar Sep 23 '22 06:09 geoand

Many thanks for fixing this :)

Regarding the loading getting triggered when our extensions aren't careful about it - would there be a way to verify noone is doing that?

Sanne avatar Sep 23 '22 09:09 Sanne

Regarding the loading getting triggered when our extensions aren't careful about it - would there be a way to verify noone is doing that?

No way I'm aware of. Also keep in mind that it's perfectly legal to call those methods above at any time. So from my point of view it's more like a best effort.

mkouba avatar Sep 23 '22 10:09 mkouba

Right, I understand it's "legal" for end users to trigger this; I'm indeed thinking about a best effort tool that we could use to spot obvious misunderstandings leveraging our extensive testsuite.

For example we could have a system property that makes this legal operation throw an exception and set this in our testsuite to spot extensions needing some polishing; also works to spot regressions in the future so we can sleep well without having to review all PRs for this pattern :)

But I'm not sure if that would be practical; I suppose there might be extensions which we don't want to patch for this at this time, so at least we'd nee a flag set globally but also allow a particular module to opt-out. Perhaps that's too much work, but I'm wondering.

Sanne avatar Sep 23 '22 10:09 Sanne

Perhaps that's too much work, but I'm wondering.

Yes, I think that it's too much. Keep in mind that we have quarkus.arc.detect-unused-false-positives=false and then this lazy loading. That's IMO enough to save a dozen kilobytes of heap.

mkouba avatar Sep 23 '22 11:09 mkouba

Seems like we have a lot of Arc.container().instance(...).get()...

geoand avatar Sep 23 '22 14:09 geoand

Seems like we have a lot of Arc.container().instance(...).get()...

But that's ok if we expect the bean to be available. It's suboptimal if we know that a bean might not be available.

mkouba avatar Sep 23 '22 14:09 mkouba

Ah, thanks for the clarification, it makes a big difference.

geoand avatar Sep 23 '22 14:09 geoand