
Memory leak when using FT Fallback with dependent beans

Open manofthepeace opened this issue 1 year ago • 9 comments

Describe the bug

I'm not exactly sure whether it is tied to dependent beans per se, but we saw this in our application using Quartz jobs. The reproducer uses something similar, also based on Quartz.

When we create a job that uses either a fallback method or a FallbackHandler class, the job remains as an outgoing reference in io.quarkus.arc.impl.EagerInstanceHandle.

Running the reproducer creates 1000 jobs that run immediately (and should not exist afterwards); here is the result:

WITH FALLBACK: (heap dump screenshots)

WITHOUT FALLBACK (comment out line 18 of MyJob.java in the reproducer): (heap dump screenshots)

Expected behavior

Memory should not keep climbing each time a job runs; the GC should be able to clean it up.

Actual behavior

The memory remains retained indefinitely.

How to Reproduce?

Reproducer: https://github.com/manofthepeace/ft-leak

Steps to reproduce:

1. mvn quarkus:dev (wait a couple of seconds until the 1000 jobs run)
2. Take a heap dump via jps + jmap -dump:live,file=/Users/alex/Desktop/dumps/dump.hprof <pid>
3. Analyze EagerInstanceHandle and MyJob.*
4. Comment out line 18 (the Fallback annotation) in MyJob.java
5. Repeat steps 1-3

Output of uname -a or ver

Darwin Kernel Version 21.6.0

Output of java -version

Temurin-21.0.2+13

Quarkus version or git rev

3.7.3

Build tool (ie. output of mvnw --version or gradlew --version)

apache-maven-3.9.5

Additional information

Got around the leak by changing the fallback to a CDI decorator

manofthepeace avatar Feb 16 '24 18:02 manofthepeace
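To illustrate the workaround mentioned in "Additional information", here is a minimal sketch of a CDI decorator that supplies the fallback behaviour itself; MyService and doWork are hypothetical names, not taken from the reproducer. Because the decorator catches the failure, no @Fallback interceptor (and no FallbackHandler lookup) is involved.

```java
import jakarta.annotation.Priority;
import jakarta.decorator.Decorator;
import jakarta.decorator.Delegate;
import jakarta.inject.Inject;

interface MyService {
    String doWork();
}

// Hypothetical replacement for a @Fallback/FallbackHandler: the decorator
// wraps the bean and returns the fallback value when the delegate fails.
@Decorator
@Priority(1)
class MyServiceFallbackDecorator implements MyService {

    @Inject
    @Delegate
    MyService delegate;

    @Override
    public String doWork() {
        try {
            return delegate.doWork();
        } catch (RuntimeException e) {
            // behaviour that previously lived in the FallbackHandler
            return "fallback";
        }
    }
}
```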

Got around the leak by changing the fallback to a CDI decorator

@manofthepeace what do you mean by this? How does a decorator workaround this?

Also, did you look at who holds the reference to the io.quarkus.arc.impl.EagerInstanceHandle? The handle is likely just a means to perform programmatic lookup, which can, under certain circumstances, lead to a leak with dependent beans.

manovotn avatar Feb 19 '24 00:02 manovotn

/cc @Ladicek (fault-tolerance), @machi1990 (quartz), @mkouba (quartz,scheduler)

quarkus-bot[bot] avatar Feb 19 '24 00:02 quarkus-bot[bot]

I'll assign to myself, unless someone beats me to it, I will take a closer look tomorrow.

manovotn avatar Feb 19 '24 00:02 manovotn

@manofthepeace what do you mean by this? How does a decorator workaround this?

In my particular case I could replace the FallbackHandler class with a decorator to achieve the behaviour I wanted for my application. With the decorator I no longer see the leak: I can run my reproducer with a schedule of roughly 1000 invocations every 10 seconds and the memory stays stable, whereas doing the same with the fallback method (or FallbackHandler class) it just keeps climbing.

Also, did you look who holds the reference to the io.quarkus.arc.impl.EagerInstanceHandle?

I did not, but if you want me to check anything I can, though likely not tomorrow.

manofthepeace avatar Feb 19 '24 00:02 manofthepeace

Hm, I don't think this is related to SR FT Fallback but more likely to the Quartz integration in Quarkus, more specifically to the JobFactory and the way a new job is created: https://github.com/quarkusio/quarkus/blob/main/extensions/quartz/runtime/src/main/java/io/quarkus/quartz/runtime/QuartzSchedulerImpl.java#L1235-L1253. This factory is using programmatic lookup and so the @Dependent beans are kept in memory due to how jakarta.enterprise.inject.Instance works.

mkouba avatar Feb 19 '24 08:02 mkouba
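The retention mechanism described above can be modelled in plain Java (a simplification, not actual ArC code): a programmatic-lookup handle must remember every @Dependent instance it creates so that destroy() can be called later, which keeps each one strongly reachable for the lifetime of the lookup object itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Simplified model of why programmatic lookup retains @Dependent beans:
// every get() stores the created instance so it can later be destroyed,
// so instances accumulate for as long as the lookup object is alive.
final class DependentInstanceModel<T> {

    private final Supplier<T> producer;
    private final List<T> created = new ArrayList<>(); // grows on every get()

    DependentInstanceModel(Supplier<T> producer) {
        this.producer = producer;
    }

    T get() {
        T instance = producer.get();
        created.add(instance); // retained until destroy() is called
        return instance;
    }

    void destroy(T instance) {
        created.remove(instance);
    }

    int retainedCount() {
        return created.size();
    }
}
```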

I had to debug the bean discovery process to figure out why MyJob even becomes a bean -- it turns out we treat fault tolerance annotations as bean defining annotations.

@mkouba so I take it you're fixing this? :-)

Ladicek avatar Feb 19 '24 10:02 Ladicek

I had to debug the bean discovery process to figure out why MyJob even becomes a bean -- it turns out we treat fault tolerance annotations as bean defining annotations.

It would be a bean simply because it implements Job, see QuartzProcessor.

manovotn avatar Feb 19 '24 10:02 manovotn

Even better :-)

Ladicek avatar Feb 19 '24 11:02 Ladicek

Actually it isn't simply because it implements Job -- it implements Job and requires container services (in this case, it declares an injection point). I had no idea we had this, nice!

Ladicek avatar Feb 19 '24 11:02 Ladicek

Hm, I don't think this is related to SR FT Fallback but more likely to the Quartz integration in Quarkus, more specifically to the JobFactory and the way a new job is created: https://github.com/quarkusio/quarkus/blob/main/extensions/quartz/runtime/src/main/java/io/quarkus/quartz/runtime/QuartzSchedulerImpl.java#L1235-L1253. This factory is using programmatic lookup and so the @Dependent beans are kept in memory due to how jakarta.enterprise.inject.Instance works.

This is indeed the case. First thing coming to mind is using a CDI-aware wrapper working with Instance.Handle, something like this - https://github.com/quarkusio/quarkus/compare/main...manovotn:quarkus:issue38824?expand=1 But I don't know much about Quartz so @mkouba might have a smarter idea :)

manovotn avatar Feb 19 '24 13:02 manovotn
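The CDI-aware wrapper idea from the linked branch could be sketched along these lines (the class name and constructor shape are assumptions, not the actual branch code): the wrapper keeps the Instance.Handle from the lookup so the @Dependent instance can be destroyed once execution finishes.

```java
import jakarta.enterprise.inject.Instance;
import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;

// Sketch: delegate to the looked-up job, then release its handle. As the
// discussion below notes, an unconditional destroy() is only appropriate
// for one-off jobs and only when the bean is @Dependent.
final class CdiAwareJob implements Job {

    private final Instance.Handle<? extends Job> handle;

    CdiAwareJob(Instance.Handle<? extends Job> handle) {
        this.handle = handle;
    }

    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        try {
            handle.get().execute(context);
        } finally {
            handle.destroy(); // releases the @Dependent instance
        }
    }
}
```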

Hm, I don't think this is related to SR FT Fallback but more likely to the Quartz integration in Quarkus, more specifically to the JobFactory and the way a new job is created: https://github.com/quarkusio/quarkus/blob/main/extensions/quartz/runtime/src/main/java/io/quarkus/quartz/runtime/QuartzSchedulerImpl.java#L1235-L1253. This factory is using programmatic lookup and so the @Dependent beans are kept in memory due to how jakarta.enterprise.inject.Instance works.

This is indeed the case. First thing coming to mind is using a CDI-aware wrapper working with Instance.Handle, something like this - https://github.com/quarkusio/quarkus/compare/main...manovotn:quarkus:issue38824?expand=1 But I don't know much about Quartz so @mkouba might have a smarter idea :)

This would not be a correct solution. A @Dependent bean that serves as a Quartz job should not be automatically destroyed after Job#execute() completes.

The problem in this particular setup is that the user creates a bunch of "one-off" jobs, i.e. jobs that are only executed once, not periodically based on interval/cron (which is IMO a more common use).

In order to fix this correctly, we would need to hook into Quartz and attempt to destroy the job instance once it has been executed and (probably) removed from the scheduler. I don't know if such an API exists, though. Maybe org.quartz.TriggerListener.triggerComplete() together with a check for org.quartz.Trigger.CompletedExecutionInstruction.DELETE_TRIGGER. I need to take a look.

mkouba avatar Feb 19 '24 16:02 mkouba
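The TriggerListener idea above might look roughly like this sketch; destroyDependentInstance() is a hypothetical hook into ArC, not an existing Quarkus API.

```java
import org.quartz.JobExecutionContext;
import org.quartz.Trigger;
import org.quartz.Trigger.CompletedExecutionInstruction;
import org.quartz.listeners.TriggerListenerSupport;

// Sketch: release the job instance only when Quartz reports that the
// trigger fired for the last time and is being removed from the scheduler.
class OneOffJobCleanupListener extends TriggerListenerSupport {

    @Override
    public String getName() {
        return "one-off-job-cleanup";
    }

    @Override
    public void triggerComplete(Trigger trigger, JobExecutionContext context,
            CompletedExecutionInstruction instruction) {
        if (instruction == CompletedExecutionInstruction.DELETE_TRIGGER) {
            destroyDependentInstance(context.getJobInstance());
        }
    }

    private void destroyDependentInstance(Object jobInstance) {
        // hypothetical: look up the Instance.Handle that produced this
        // job instance and call destroy() on it
    }
}
```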

The problem in this particular setup is that the user creates a bunch of "one-off" jobs, i.e. jobs that are only executed once, not periodically based on interval/cron (which is IMO a more common use).

Ah, so for jobs that run periodically, you expect to reuse the @Dependent bean instance? I was assuming you'd want a new one as well (as per nature of this scope).

manovotn avatar Feb 19 '24 16:02 manovotn

Ah, so for jobs that run periodically, you expect to reuse the @Dependent bean instance?

Exactly.

I was assuming you'd want a new one as well (as per nature of this scope).

This is not how org.quartz.spi.JobFactory works. In other words, in Quartz a job is not created for each job execution. @Scheduled methods in Quarkus work a bit differently.

mkouba avatar Feb 19 '24 16:02 mkouba

I was assuming you'd want a new one as well (as per nature of this scope).

This is not how org.quartz.spi.JobFactory works. In other words, in Quartz a job is not created for each job execution. @Scheduled methods in Quarkus work a bit differently.

Ok, I see, I misunderstood how it's supposed to behave. One way to access the Trigger is from the JobExecutionContext, and from there you can call getTrigger().mayFireAgain() - that might be one way to check when to destroy the instance.

That being said, any repeating triggers will still hang around, and the only way to properly destroy their dependent-scoped jobs would be during QuartzSchedulerImpl#destroy.

manovotn avatar Feb 19 '24 17:02 manovotn
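The mayFireAgain() refinement could be combined with the handle-based wrapper along these lines (again a sketch with assumed names, not Quarkus code): destroy the @Dependent job instance only when its trigger is exhausted.

```java
import jakarta.enterprise.inject.Instance;
import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;

// Sketch: conditional cleanup after each execution. Repeating triggers keep
// their instance; one-off triggers release it once they cannot fire again.
final class TriggerAwareJob implements Job {

    private final Instance.Handle<? extends Job> handle;

    TriggerAwareJob(Instance.Handle<? extends Job> handle) {
        this.handle = handle;
    }

    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        try {
            handle.get().execute(context);
        } finally {
            if (!context.getTrigger().mayFireAgain()) {
                handle.destroy(); // trigger exhausted: release the instance
            }
        }
    }
}
```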

I was assuming you'd want a new one as well (as per nature of this scope).

This is not how org.quartz.spi.JobFactory works. In other words, in Quartz a job is not created for each job execution. @Scheduled methods in Quarkus work a bit differently.

Ok, I see, I misunderstood how it's supposed to behave.

@manovotn Actually, it seems that it was me who misunderstood how it's supposed to work, and the Job instance is created for a single execution only ;-). Which means that your wrapper should work, except that we should not call destroy() unless the bean is @Dependent.

mkouba avatar Feb 19 '24 19:02 mkouba