spring-framework
spring-framework copied to clipboard
Memory leak with CglibAopProxy$ProxyCallbackFilter
It seems that there is a memory leak with Spring Boot 2.4.1 on my Debian 10.5 with Java 11.0.1 when using @PreAuthorize("hasRole('ROLE')")
annotation on controller's endpoint. As you can see in the first image, there is constant grow in old generation memory of G1 garbage collector when running tests. The grow stops at max heap size which is 512 MB. I prepared a sample project where you can reproduce the problem.
Memory usage 1
Memory usage 2
After importing the project just run this command:
./gradlew clean test
Tests will fail because they assume that HTTP response will be 200, but it is 401, but it shouldn't matter.
If you remove @PreAuthorize
from DummyController or prePostEnabled = true
from SecurityConfig then memory usage seems to be ok.
Memory usage 3
Memory usage 4
The project uses dummy controller with one endpoint which has @PreAuthorize("hasRole('ROLE')")
annotation. It has also basic SecurityConfig
to enable @PreAuthorize
annotation. Controller uses some dummy services to make sure that for each test Spring will recreate the context. I also set spring.test.context.cache.maxSize=1
in build.gradle to make sure that the grow of memory does not come from cache.
If you have any more questions about my configuration or anything else, let me know.
Thanks for the sample. I've reproduced the problem. As far as I can tell, it's caused by your configuration of the test context cache size. Limiting it to one means that many different application contexts are created, and for each context a new CGLib proxy of DummyController
is created. This proxy creation creates a new DummyController
subclass that's then pinned in memory along with all its static fields by the app class loader. If I remove the configuration of the spring.test.context.cache.maxSize
system property, the build completes in under 20 seconds with no sign of a memory leak.
You said above that configuring the cache size was to "make sure that the grow of memory does not come from cache". It sounds like you'd identified a leak before making this configuration change but, as far as I can tell, this configuration is the cause of the leak.
Perhaps you can share a sample that reproduces the problem without tuning the test framework's context cache, or perhaps even reproduces it when running your application's main method rather than its tests?
Docs says that spring.test.context.cache.maxSize
has a default value to 32 and it is also the max value for this property. The main project I'm working on is quite large and I guess that I shouldn't public the code from it. I have got there more than 1k tests that uses different context configuration, more than 32, so I suppose that Spring, after running some tests, needs to create new context.
For sample project I set spring.test.context.cache.maxSize
to 1 to easily reproduce the problem. If I set it to 10 or higher then memory usage is fine, but when I set it to 9 or lower then memory problems show up. So I'm guessing that if I prepare code sample that uses more than 32 contexts in tests (the current one uses 10 different contexts) then the problem will come up again.
When I run tests in sample project with spring.test.context.cache.maxSize
as 1 and I remove @PreAuthorize("hasRole('LEAK')")
from DummyControler
then tests are slow, it's ok, Spring needs to recreate context, but memory usage is constant, it does not grow up after time, but when @PreAuthorize("hasRole('LEAK')")
is not removed then memory grows up. When spring.test.context.cache.maxSize
is not set, then tests run quite fast so it's hard to analyze if there is something wrong with memory usage, but again, tests will probably slow down when there will be more than 32 context configurations for tests.
I didn't try to reproduce the problem with application's main method because the problem rather is connected somehow with Spring that fails to clean up old context when tests are running. It's only my guess so I'm not sure what causes it. Maybe it is somehow connected with @PreAuthorize
.
I didn't try to reproduce the problem with application's main method because the problem rather is connected somehow with Spring that fails to clean up old context when tests are running
This would have been useful information in the original description.
Maybe it is somehow connected with
@PreAuthorize
Yes, I believe this is the case. The use of @PreAuthorize
requires DummyController
to be proxied and it's this proxy creation that is causing the memory leak. Each new context creates a new DummyController
sub-class for the CGLib proxy. There's nothing we can do about this in Spring Boot as @PreAuthorize
is a Spring Security feature and CGLib proxy creation is part of Spring Framework.
I think we should get someone from the Framework team to look at this. It may be that something isn't behaving as intended or that could be improved in the proxy creation that would enable some reuse.
This sounds a bit like #14113 which was, once upon a time, solved (accidentally) with #21674.
Any updates here? Some cglib generated proxy classes keep reference to BeanFactory, so BeanFactory can't be garbage collected, and memory footprint grows and grows. It is the most sensitive in SpringBootTests, where a lot of contexts could be created. For example, spring security proxies keep reference to BeanFactory in MethodSecurityMetadataSourceAdvisor.
If the cause of the problem is due to the fact that Spring Security's MethodSecurityMetadataSourceAdvisor
is retaining a reference to the BeanFactory
(which seems plausible based on the current source code), then the solution might be to refactor MethodSecurityMetadataSourceAdvisor
so that it releases that reference when the enclosing ApplicationContext
is closed.
@rwinch & @jhoeller, thoughts?
Thanks for bringing this to my attention @sbrannen.
It does appear that the class generated by CGLIB is (indirectly) holding a reference to MethodSecurityMetadataSourceAdvisor
which in turn holds a reference to the BeanFactory
.
For reference here is a list of objects retained by ProxyCallbackFilter
Here is the shortest GC Root of ProxyCalbackFilter
I don't think Spring Security is the correct place to solve this. Any cleanup that the Advisor
s perform would just be masking the actual leak which is in Spring Framework + CGLIB. Even if the Advisor
cleans up all of its resources, the Advisor
itself is leaked and so are lower level objects (i.e. AdvisedSupport
).
I think it would be ideal if Spring Framework could remove the reference to the ProxyCallbackFilter
on CALLBACK_FILTER
of the generated class once we know that the CALLBACK_FILTER
is no longer needed. It might also mark the generated class as closed
so that if it is used again it reports an error.
Removing the refererence to ProxyCallbackFilter
would ensure that references to AdvisedSupport
are cleaned up along with its methodCache
, all references to its advisors
, and its remaining members. It's important to note that cleaning up references to advisors
will clean up the reference to MethodSecurityMetadataSourceAdvisor
and thus the BeanFactory
it refers to. It will also cleanup references to other Advisor
s and any large objects they might refer to. For example, AbstractBeanFactoryPointcutAdvisor
has it's own BeanFactory
reference.
Obviously this still means the generated classes are still unnecessarily around, but I'm not sure we can drop the references to those given they are held by the jdk.
I wouldn't say that problem is only in MethodSecurityMetadataSourceAdvisor
.
In our application I also see reference from RetryConfiguration
:
And such chain(related to RefreshScope
) keeps reference to GenericWebApplicationContext
.
Obviously this still means the generated classes are still unnecessarily around, but I'm not sure we can drop the references to those given they are held by the jdk.
Would it be possible to generate cglib classes using a classloader that is tied to the ApplicationContext? So when the context gets dropped, the loader is no longer referenced from the GC root and the classes are unloaded.
If I'm not mistaken, the proxy factory already uses the bean factory's classloader, so it would only be a matter of creating a new classloader inside AbstractContextLoader#prepareContext(ConfigurableApplicationContext, MergedContextConfiguration) with ClassUtils.getDefaultClassLoader()
as its parent.
@grubeninspekteur Switching the ClassLoader
used to create the cglib classes will cause ClassCastException
s to be thrown unless the entire application uses the new ClassLoader
which will likely lead to other problems.
Related issues:
- #27375
- #27395
Closing this one based on the assumption that #27375 covers all occurrences of this leak (as a regression in the 5.3.x line). Please give the 5.3.10 release a try, it's available in snapshots already and will be officially released tomorrow!
@jhoeller Unfortunately, it does not appear to have fixed this specific issue. Updating the original sample to use Spring 5.3.10 does not resolve the issue. I've put together a branch named spring-5.3.10 that demonstrates it is not fixed. If you run ./gradlew check
you will see lots of failed tests (expected) and then eventually you will start getting out of memory errors. I haven't looked into if anything has changed from the previous assessment, but the take away at this point is the tests still produce out of memory errors.
Can we please reopen the issue?
Thanks for trying out the sample against 5.3.10 and providing that branch.
We'll look into it.
It seems I confirm leak in branch named spring-5.3.10
MacBook-Pro-Nikita:~ nkonev$ uname -a
Darwin MacBook-Pro-Nikita.local 20.6.0 Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:31 PDT 2021; root:xnu-7195.141.2~5/RELEASE_X86_64 x86_64
I started App in IntelliJ IDEA (only changed repositories in build.gradle to mavenCentral()), then
while(true) do curl -i 'http://localhost:8080'; done
about after 20 minutes I saw
I also took heapdump and uploaded onto my hosting.
I think this can be useful.
Thoughts:
- Are there Tomcat sessions, which can be created per each unauthorized request ?
Update 1:
I opened heapdump
Let's shrink session timeout: After setting
server.servlet.session.timeout=5s
in application.properties the picture after 20 minutes is better
Update 2:
Seems I tested the different case, sorry, hiding this.
The original was about tests ./gradlew check
but I tested like the 'production'.
For the interest - I can't reproduce it against 5.3.10 neither my Mac with ./gradlew check
nor Linux
Thanks for providing additional information @nkonev. I'm curious if you are able to reproduce the issue with the original application that was submitted. Perhaps you have GRADLE_OPTS environment variable specified which is providing a larger amount of memory to the application?
./gradlew check
runs for me but slows down and eventually hangs at around 300 tests ran due to running out of memory.
I haven't looked into why it doesn't put a heap dump after executing, but it is reporting it runs out of memory in the terminal.
For reference my environment is
gw --version
------------------------------------------------------------
Gradle 7.2
------------------------------------------------------------
Build time: 2021-08-17 09:59:03 UTC
Revision: a773786b58bb28710e3dc96c4d1a7063628952ad
Kotlin: 1.5.21
Groovy: 3.0.8
Ant: Apache Ant(TM) version 1.10.9 compiled on September 27 2020
JVM: 11.0.11 (AdoptOpenJDK 11.0.11+9)
OS: Linux 5.4.0-88-generic amd64
I don't have either GRADLE_OPTS, or gradle.properties
MacBook-Pro-Nikita:spring-framework-gh-26266 nkonev$ ls -lah ~/.gradle/
total 0
drwxr-xr-x 11 nkonev staff 352B 8 сен 01:35 .
drwxr-xr-x+ 78 nkonev staff 2,4K 9 окт 05:34 ..
drwxr-xr-x 7 nkonev staff 224B 7 окт 02:45 .tmp
drwxr-xr-x 18 nkonev staff 576B 6 окт 12:41 caches
drwxr-xr-x 16 nkonev staff 512B 6 окт 12:41 daemon
drwxr-xr-x 2 nkonev staff 64B 4 мар 2021 jdks
drwxr-xr-x 2 nkonev staff 64B 9 окт 05:08 kotlin-profile
drwxr-xr-x 13 nkonev staff 416B 6 окт 12:32 native
drwxr-xr-x 9 nkonev staff 288B 8 сен 01:36 notifications
drwxr-xr-x 3 nkonev staff 96B 11 авг 13:29 workers
drwxr-xr-x 3 nkonev staff 96B 18 фев 2021 wrapper
MacBook-Pro-Nikita:spring-framework-gh-26266 nkonev$ echo $GRADLE_OPTS
MacBook-Pro-Nikita:spring-framework-gh-26266 nkonev$
MacBook-Pro-Nikita:spring-framework-gh-26266 nkonev$ echo $JAVA_HOME
MacBook-Pro-Nikita:spring-framework-gh-26266 nkonev$ java -version
openjdk version "11.0.8" 2020-07-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.8+10)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.8+10, mixed mode)
MacBook-Pro-Nikita:spring-framework-gh-26266 nkonev$ ./gradlew --version
------------------------------------------------------------
Gradle 7.2
------------------------------------------------------------
Build time: 2021-08-17 09:59:03 UTC
Revision: a773786b58bb28710e3dc96c4d1a7063628952ad
Kotlin: 1.5.21
Groovy: 3.0.8
Ant: Apache Ant(TM) version 1.10.9 compiled on September 27 2020
JVM: 11.0.8 (AdoptOpenJDK 11.0.8+10)
OS: Mac OS X 10.16 x86_64
Yes, I reproduced leak in the original example.
Also on Linux
data:image/s3,"s3://crabby-images/0b7cc/0b7cc41d289d91451a6e0f874b374102018b22c2" alt="Снимок экрана 2021-10-11 в 20 25 39"
But your https://github.com/rwinch/spring-framework-gh-26266/tree/spring-5.3.10 works without leak.
Consider to force-kill gradle daemons, I did it before the test, something like
jps | grep -i gradle | awk '{print $1}' | xargs kill -9
or just
killall -9 java
@nkonev Hmm..Not sure why you are getting a difference in behavior but good to know. Thanks!
I reproduced it by just setting ext['spring-framework.version']='5.3.10'
into build.gradle in the original example.
But previously I didn't reproduce it in your.
It seems there is a dependency on test order, which can be different. I know that it is unspecified by default, but it seems that gradle eventually orders tests by hash of directory content or modification time.
So here is heap dumps here that launch after a minute
May be related with https://github.com/mockito/mockito/issues/2503
with org.springframework:spring-core:5.3.14
a little class like this:
@RestController
@PreAuthorize("hasAuthority('GLOBAL_ADMIN') or hasAuthority('MEMBER')")
public class AccessControlTestController {
@GetMapping("/v1/access_control_test")
String test() {
return "OK";
}
}
seems to cause a nasty leak like this:
data:image/s3,"s3://crabby-images/883a0/883a0f6aea429af025f11b77e296b7e6764e0612" alt="Screenshot 2022-02-18 at 11 32 55 am"
when running integration tests, that leads to OutOfMemoryError
We are having the same issue. We start and close some contexts during the lifetime of our app. If that happens a lot, after a while we get OOM.
After manually nulling all CALLBACK_FILTER references in all proxy classes, the bean factory is finally garbage collected.
I think it would be ideal if Spring Framework could remove the reference to the ProxyCallbackFilter on CALLBACK_FILTERof the generated class once we know that the CALLBACK_FILTER is no longer needed. It might also mark the generated class as closed so that if it is used again it reports an error.
This would be great. I tried to find a way to do it myself, but didn't find any hook that would allow me that. One dirty way was listing all classes from ClassLoader, filtering those containing BySpring
, making the field accessible and setting it to null by reflection.
This however won't work if there is more than one context and I don't know which proxy classes belong to context that I'm closing.
I also tried to collect proxy classes in a custom class loader which I set before the context startup but that didn't really work either. The classes were coming from different class loaders in the hierarchy (some are loaded from parent, cglib proxy classes are magically set to the child loader by reflection.. then they don't match or are not assignable from one to another, and he'll is lose, so this is not a way to go I guess).
I tried to write a class loader to collect created proxy classes and clean the static field after the context is closed. However, our setup of contexts is not trivial and I was running into many class-to-class problems (classes not assignable to each other, etc.).
So I went for the BeanPostProcessor instead, which might not work at all times maybe, but works for us so far.
public class MemoryLeakFixBeanPostProcessor implements BeanPostProcessor, DisposableBean {
/**
* See {@link org.springframework.cglib.proxy.Enhancer}
*/
private static final String CGLIB_PROXY_STATIC_FIELD_WITH_MEMORY_LEAK = "CGLIB$CALLBACK_FILTER";
/**
* See {@link org.springframework.cglib.core.SpringNamingPolicy}
*/
private static final String SPRING_CGLIB_PROXY_CLASS_NAME_TAG = "BySpringCGLIB";
private final Set<Class<?>> classesToCleanUp = new HashSet<>();
@Override
public Object postProcessAfterInitialization(Object bean, @Nonnull String beanName) throws BeansException {
if (bean.getClass().getName().contains(SPRING_CGLIB_PROXY_CLASS_NAME_TAG)) {
classesToCleanUp.add(cls);
}
return bean;
}
@Override
public void destroy() throws Exception {
classesToCleanUp.forEach(cls -> {
try {
var field = cls.getDeclaredField(CGLIB_PROXY_STATIC_FIELD_WITH_MEMORY_LEAK);
field.setAccessible(true);
field.set(null, null);
} catch (Exception e) {
// we can't do anything..
}
});
classesToCleanUp.clear();
}
}
It collects the bean classes if they are Spring proxy classes and clears the static field when the bean post processor is disposed (context is closed).
Thanks for sharing the workaround @redhead!
I have made a pull request for this.
Will this issue also affect the new Spring Releases, e.g. Spring Boot 3.0.x and the then used Spring Framework 6.0.x ?
Same here. with spring-boot 3.0.4 and spring-framework 6.0.6
Running a few integration test with @DirtiesContext and then I get an OOM after a few tests.
In my case it seems to be the @Validated annotation
@Controller
@Validated
@RequiredArgsConstructor
public class WettquoteLigamappingAnlegenServiceGQL {
// ... injected services fields ...
@Transactional
@MutationMapping
public @NotNull WettquoteLigamappingDTO wettquoteLigamappingAnlegen(
@Argument @NotNull @Valid WettquoteLigamappingForm wettquoteLigamappingForm) {
// ....
}
}
After running three simple tests with @DirtiesContext I have3 applicationContexts on the heap. Setting a breakpoint in one test, doing forced GC and getting a heap with YourKit Profiler looks like this:
After adding this workaround to my test context, everything works again: ((thanks to @redhead, code slightly modified)
package de.kicktipp.testutils;
import lombok.extern.slf4j.Slf4j;
import org.jetbrains.annotations.NotNull;
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.DisposableBean;
import org.springframework.beans.factory.config.BeanPostProcessor;
import org.springframework.stereotype.Component;
import java.util.HashSet;
import java.util.Set;
//https://github.com/spring-projects/spring-framework/issues/26266
@Component
@Slf4j
public class MemoryLeakFix implements BeanPostProcessor, DisposableBean {
private static final String CGLIB_PROXY_STATIC_FIELD_WITH_MEMORY_LEAK = "CGLIB$CALLBACK_FILTER";
private static final String SPRING_CGLIB_PROXY_CLASS_NAME_TAG = "SpringCGLIB";
private final Set<Class<?>> classesToCleanUp = new HashSet<>();
@Override
public Object postProcessAfterInitialization(Object bean, @NotNull String beanName) throws BeansException {
if (bean.getClass().getName().contains(SPRING_CGLIB_PROXY_CLASS_NAME_TAG)) {
classesToCleanUp.add(bean.getClass());
}
return bean;
}
@Override
public void destroy() {
classesToCleanUp.forEach(cls -> {
try {
var field = cls.getDeclaredField(CGLIB_PROXY_STATIC_FIELD_WITH_MEMORY_LEAK);
field.setAccessible(true);
field.set(null, null);
} catch (NoSuchFieldException e) {
log.info("No Field", e);
} catch (IllegalAccessException e) {
log.error(e.getMessage(), e);
}
});
classesToCleanUp.clear();
}
}
I've got a promising solution in the works where we reduce the ProxyCallbackFilter
instance to a plain key role afters its initial use for class generation, stubbing out the Advisors with key stubs (containing advice class and pointcuts just for comparison purposes) once the class has been generated.
While the CGLIB$CALLBACK_FILTER
field remains non-null, its content will not retain any leaky references anymore, independent from the concrete Advisor
implementations in use there. This should hopefully address the common scenarios for long-lived ClassLoaders as outlined above.
Can someone please let me know if the issue that is being discussed here is similar to the one I am observing here?