spring-framework
Parallel bean initialization during startup [SPR-8767]
Tomasz Nurkiewicz opened SPR-8767 and commented
Spring should provide a way (possibly a BeanFactory with a different ConfigurableListableBeanFactory#preInstantiateSingletons implementation) to initialize singleton non-lazy beans on startup in parallel using a thread pool. This could significantly reduce startup (and maybe shutdown) time by creating and initializing independent beans concurrently.
The algorithm is pretty simple in principle. Whereas the normal bean factory creates beans in a single thread in rather random order, this implementation should:
1. Find all bean definitions that don't have any unresolved dependencies.
2. Schedule creation of each bean found in 1. in a separate concurrent task to allow parallel creation.
3. When any of the tasks scheduled in 2. is completed, go to 1.
The algorithm stops when all beans are created.
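
The loop described above could be sketched roughly as follows. This is a minimal, framework-free illustration and not Spring API: `ParallelInitSketch`, `initialize` and the dependency map are hypothetical names, bean "creation" is a placeholder task that only returns the bean name, and dependencies are assumed to be fully known up front (which later comments in this thread point out is often not the case in a real container).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not Spring API: bean "creation" is a placeholder task.
final class ParallelInitSketch {

    /** Returns bean names in completion order; deps maps each bean to the beans it needs. */
    static List<String> initialize(Map<String, Set<String>> deps, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CompletionService<String> completion = new ExecutorCompletionService<>(pool);
        Set<String> scheduled = new HashSet<>();
        Set<String> created = new HashSet<>();
        List<String> order = new ArrayList<>();
        try {
            while (created.size() < deps.size()) {
                // Step 1: find beans with no unresolved dependencies.
                for (Map.Entry<String, Set<String>> entry : deps.entrySet()) {
                    String name = entry.getKey();
                    if (!scheduled.contains(name) && created.containsAll(entry.getValue())) {
                        scheduled.add(name);
                        // Step 2: schedule creation as a separate concurrent task.
                        completion.submit(() -> name);
                    }
                }
                if (scheduled.size() == created.size()) {
                    // Nothing is running and nothing can start: cycle or unknown dependency.
                    throw new IllegalStateException(
                            "Unresolvable dependencies; created so far: " + order);
                }
                // Step 3: wait for any task to complete, then go back to step 1.
                String done = completion.take().get();
                created.add(done);
                order.add(done);
            }
            return order;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during startup", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Bean creation failed", e.getCause());
        } finally {
            // The pool is only needed during startup, so shut it down afterwards.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = Map.of(
                "dataSource", Set.of(),
                "cache", Set.of(),
                "repository", Set.of("dataSource"),
                "service", Set.of("repository", "cache"));
        System.out.println(initialize(deps, 4));
    }
}
```

All bookkeeping stays on the calling thread and only the creation tasks run in the pool, so the sketch needs no extra locking. Circular dependencies are simply rejected here; a real implementation would need the "extra care" mentioned in the implementation notes.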
Implementation notes:
- circular dependencies might require extra care
- bean factory should create a thread pool with configurable number of threads and shut it down after all beans are created
Affects: 3.1 RC1
Reference URL: http://forum.springsource.org/showthread.php?105896-Initialize-spring-beans-in-parallel-at-startup
Issue Links:
- #14520 Parallelize Component Scanning to Improve Spring Startup Time ("is duplicated by")
- #10033 Threads bottlenecking in DefaultSingletonBeanRegistry when using Wicket's @SpringBean annotation for injection
- #14520 Parallelize Component Scanning to Improve Spring Startup Time
- #20904 Initialization blocked by multi-threaded event publishing
- #19487 Asynchronous initialization of beans during startup
- #18305 Background initialization option for JPA EntityManagerFactory / Hibernate SessionFactory
- #19398 Add a functional way to register a bean
80 votes, 77 watchers
Chris Beams commented
Hi Tomasz,
Reading through the linked forum thread, Marten's suggestion hits the nail on the head:
The problem here, imho, is that you are mixing bean construction and bean initialization.
The latter is something you (judging from your post) want to do in parallel. The easiest approach, I guess, is to create an ApplicationListener which listens to ContextRefreshedEvents. (This is fired when the context is up) and which starts initializing the caches, you could plugin a TaskExecutor for this which utilizes the servers threadpool, which in turn should utilize the underlying hardware to its fullest...
Best of both worlds imho...
(http://forum.springsource.org/showthread.php?105896-Initialize-spring-beans-in-parallel-at-startup&p=352702#post352702)
While it is conceptually straightforward to imagine the container introducing concurrent bean initialization, in practice it would be anything but. We would need to see quite a bit of feedback and demand that Spring container initialization is fundamentally too slow to seriously consider this kind of change. Again, given the specific scenario in the forum post, it appears that the user can make changes that would both solve the startup time problem and probably improve the design of his program in the process.
Feel free to comment further if you think there is a compelling case that's been missed here, and we can leave this open for further comments and votes to that effect. Otherwise I'll close as won't fix for now.
Niklas Schlimm commented
Hi Chris, we're currently working on a migration of over 50 web applications to Spring IoC. Startup time is a big issue because we need quick startup during development and (more importantly) in the production environment. Since Spring does not support concurrent singleton bean instantiation, our complete production environment has a startup time of ~2h 30 minutes, which is not acceptable to our operations teams. We just migrated from a proprietary container solution that we developed. That proprietary container supported parallel instantiation. Now, we have to argue why we migrated to Spring ... to us, this is not a minor issue and we would highly appreciate any progress on this.
With regard to the solution, it should be something like a "managed task executor", because ideally the concurrent threads have the container-managed thread context (JNDI, JPA resources must be accessible etc.). Therefore the solution is not so straightforward imho.
Best regards, Niklas
Chris Beams commented
Niklas,
2h 30m is a very long time indeed, but again, I would encourage a more pragmatic approach.
For the majority of Spring applications, container startup time is not an issue. We can be fairly certain of this simply because it is not a complaint we often encounter here in JIRA, in our forums, at conferences or with paying customers. When it does come up as a concern, we usually advise profiling the application to determine exactly which beans are causing the slowdown, and taking specific action to reduce the impact. Marten's suggestion that I quoted above would be perfectly adequate in many cases.
The upside of parallelizing bean initialization in the Spring container could be significant for a minority of applications using Spring, while the downsides - the inevitable bugs, added complexity and unintended side effects - would affect potentially every application using Spring. Not an attractive outlook, I'm afraid.
I'm resolving this issue as Won't Fix because indeed it is very unlikely that we would introduce a change of this magnitude into the core framework at this point without a very strong rationale.
Users are free to reopen this issue and add new comments, and continue to add votes if there are arguments that have not yet been heard.
Niklas Schlimm commented
Hi Chris,
understand your view point. Thanks for the comprehensive reply.
Cheers, Niklas
Adib Saikali commented
I think parallel startup of Spring is very important. In my application, Spring is taking up 50% of my startup time; reducing that would be very helpful during development, a major time saver.
Ragnar Rova commented
Please see https://github.com/gredler/spriths for an experimental example implementation. The spriths implementation seems to go a long way toward making it work. The author of spriths mentions that #10033 needs to be solved in order to make parallel startup thread-safe.
It would be nice if the framework at least did not stand in the way of users wanting parallel startup.
Jonatan Jönsson commented
I think a ConfigurableListableBeanFactory#setConcurrentInstantiationOfSingletons(boolean) would be great. Default to non-concurrent but make it super easy to get a parallel version.
Scott Murphy commented
For a single-instance web application, the speed of Spring initialization is fine. However, when you have an application that uses 20+ instances, the slowness of Spring initialization begins to have a detrimental impact on the ability to scale dynamically. We are currently feeling major pain using Spring on App Engine. If Spring supported a multi-threaded startup, we would see a significant improvement in our ability to scale as well as benefit from enormous cost savings.
Scott Murphy commented
"Users are free to reopen this issue"
Maybe keep this issue open so it can be voted upon?
Adib Saikali commented
I think this problem has two distinct parts: parallel discovery of beans and parallel initialization of beans. Both can be implemented separately to improve performance, unless component scanning for a large application cannot be made any faster.
Julien Dubois commented
Start-up performance is a very important issue to me, and I think a lot of beans could be initialized in parallel. So yes, this issue should stay open; it should even have a higher criticality level as far as I'm concerned.
Attila Király commented
Our applications have really big Spring contexts with a lot of Spring beans. Our cache initializations are already running in parallel, but the applications still take a long time to start up simply because some of our beans require time in creation/initialization before we can say they are "done" (for example, because they load reference data from various sources, which we already need for creating other beans).
So we would definitely benefit from built-in bean init parallelization.
Deryl Spielman commented
We are struggling with this as well. Lazy loading beans helped a bit, but I believe @Controller beans end up lazy loading their autowired fields anyway, which doesn't improve the speed that much. Also, the entity manager and other data source beans in the config must be carefully set to be loaded eagerly, which is cumbersome. With all of this talk of microservices and Spring Boot, it is obvious that separating out the modules into separate projects may improve the speed, but this increases the overhead of managing independently deployed and managed services, and it is a project of its own to figure out how to refactor the Spring bean autowiring dependencies.
I also attempted to define all beans in XML dynamically at startup using a custom component scanning method found in some of these links. I found that it did improve the speed, but not dramatically. Ensuring that each bean was dynamically created correctly was also cumbersome to maintain, and it did not work completely because of the complexities of @Scope and @Qualifier.
In summary, we need monolithic applications to start faster, and this ticket should be evaluated! Thanks.
caiwei commented
I'm thinking of enhancing the project below: https://github.com/gredler/spriths The basic idea of this project is: analyze bean dependencies -> form a DAG -> initialize beans in parallel.
To analyze the dependencies, I should handle dependencies that are set via setter, constructor and annotations, including:
- @Resource: TYPE, FIELD, METHOD
- @Inject: METHOD, CONSTRUCTOR, FIELD
- @Autowired: METHOD, CONSTRUCTOR, FIELD
Dependencies introduced by ApplicationContextAware, lookup method injection, Class.forName, reflection... I may be unable to handle.
User is allowed to specify the parallel bean init in a context.xml as below: <context:parallel-bean-init enabled="true" alwaysGenerate="true" dag="dag.xml"/>
A DAG is quite straightforward:
    <dag>
      <bean name="A"> <dependOn bean="C"/> </bean>
      <bean name="B"> <dependOn bean="C"/> <dependOn bean="D"/> </bean>
      <bean name="C"> <dependOn bean="E"/> </bean>
      <bean name="D"> <dependOn bean="E"/> </bean>
      <bean name="E"> <dependOn bean="G"/> </bean>
      <bean name="F"> <dependOn bean="G"/> </bean>
      <bean name="G"/>
    </dag>
Users can always let Spring generate a DAG file or specify a DAG file themselves (by setting alwaysGenerate="false"). We know it is almost impossible for users to create the DAG from scratch: they would have to know the bean dependencies and find all beans defined via annotations. The typical use case is that users first generate the DAG, then adjust the sequences in the DAG if there are any server startup failures caused by the concurrent bean initialization.
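
To make the DAG idea concrete, here is a small sketch that groups the beans of the example DAG into "waves" that could each be initialized in parallel. It is only an illustration under assumptions: `DagWaves` and its methods are hypothetical names, the dependency map is typed in by hand rather than parsed from a dag.xml, and the level-by-level grouping is one simple scheduling choice, not what spriths or the proposed `<context:parallel-bean-init>` element actually does.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: groups beans into waves whose members do not depend on each other.
final class DagWaves {

    static List<List<String>> waves(Map<String, List<String>> dependsOn) {
        // TreeMap keeps bean names sorted so the output is stable.
        Map<String, List<String>> remaining = new TreeMap<>(dependsOn);
        Set<String> done = new HashSet<>();
        List<List<String>> waves = new ArrayList<>();
        while (!remaining.isEmpty()) {
            List<String> wave = new ArrayList<>();
            for (Map.Entry<String, List<String>> entry : remaining.entrySet()) {
                // A bean is ready once everything it depends on is in an earlier wave.
                if (done.containsAll(entry.getValue())) {
                    wave.add(entry.getKey());
                }
            }
            if (wave.isEmpty()) {
                throw new IllegalStateException("Cycle or missing bean among " + remaining.keySet());
            }
            done.addAll(wave);
            remaining.keySet().removeAll(wave);
            waves.add(wave);
        }
        return waves;
    }

    /** The dependencies from the example DAG, entered by hand. */
    static Map<String, List<String>> exampleDag() {
        return Map.of(
                "A", List.of("C"),
                "B", List.of("C", "D"),
                "C", List.of("E"),
                "D", List.of("E"),
                "E", List.of("G"),
                "F", List.of("G"),
                "G", List.of());
    }

    public static void main(String[] args) {
        // prints [[G], [E, F], [C, D], [A, B]]
        System.out.println(waves(exampleDag()));
    }
}
```

For this DAG only two beans can ever be initialized at the same time, which shows how much the achievable speedup depends on the shape of the dependency graph.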
Just my 2 cents. Please share your expertise and let me know your concerns.
Juergen Hoeller commented
Note that we have now put #18305 onto the 4.3 roadmap, with a dedicated background initialization option in LocalContainerEntityManagerFactoryBean / LocalSessionFactoryBean that allows JPA / Hibernate initialization to run in parallel to all other beans in the context. We do not intend to let the container initialize beans in parallel there; we just allow those two FactoryBean implementations to internally delegate to a separate thread and lazily access the initialized result through a Future handle. Such specific background initialization options seem like a sweet spot, with configuration validation and dependency resolution happening as usual and just a well-known expensive bootstrap step delegated to a separate thread internally.
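
As a rough illustration of the "delegate to a separate thread and lazily access the result through a Future handle" idea, the following plain-Java sketch may help. It does not show the actual FactoryBean implementations: `BackgroundInitHolder` is a hypothetical class, and the sleeping task merely stands in for an expensive bootstrap step.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: starts an expensive bootstrap step in the background
// and only blocks when the result is actually needed.
final class BackgroundInitHolder<T> {

    private final Future<T> future;

    BackgroundInitHolder(ExecutorService executor, Callable<T> expensiveBootstrap) {
        // Configuration validation would happen before this point, on the calling thread.
        this.future = executor.submit(expensiveBootstrap);
    }

    /** Returns the initialized object, waiting for the background thread if necessary. */
    T get() {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for initialization", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Background initialization failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            BackgroundInitHolder<String> holder = new BackgroundInitHolder<String>(executor, () -> {
                Thread.sleep(200); // stands in for an expensive bootstrap step
                return "sessionFactory";
            });
            System.out.println("other beans are created meanwhile");
            System.out.println(holder.get()); // blocks only if the bootstrap is still running
        } finally {
            executor.shutdown();
        }
    }
}
```

The rest of the startup sequence stays single-threaded and ordered as before; only the one expensive call runs elsewhere.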
A generalized solution based on a DAG analysis of a container's bean interdependencies is unfortunately absolutely non-trivial. We have plenty of components which dynamically resolve dependencies at runtime or make auto-configuration decisions based on the presence / non-presence of beans at runtime, etc. It seems like asking for a lot of trouble when trying to generalize such parallel bootstrapping as container-level guesswork, with hard-to-predict benefits in comparison to allowing well-known expensive components to use background initialization internally. And based on our experience, a large chunk of the startup time is occupied by very few components even in large Spring applications; tackling those components specifically and seeing how far we get with that seems like a very worthwhile effort.
In any case, the most important part: We are revisiting this topic in 4.3, and we may take it a few steps further in 5.0. If you have specific hotspot insight into the startup time of your applications, please let us know... in particular if it deviates from our assumptions above.
Juergen
Julien Dubois commented
If we could just have a specific annotation, like @Background, to load a bean in the background, it would be nice: a bit like the @Lazy annotation in fact. Anyway each project is different, and usually people know which beans should run in the background.
There should also be a specific thread pool for those, so we can manage how those background beans run -> like you have 4 CPU cores, launch 6 beans in "background", and run them on 2 threads...
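
A dedicated pool of that kind might look like the sketch below, shown with plain `java.util.concurrent` rather than any Spring configuration. `BackgroundPoolSketch` and `runInBackground` are hypothetical names, and the "beans" are placeholder tasks that only record which worker thread ran them.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a dedicated, bounded pool for beans marked as "background".
final class BackgroundPoolSketch {

    /** Runs beanCount placeholder tasks on a pool of the given size; returns the worker thread names used. */
    static Set<String> runInBackground(int beanCount, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < beanCount; i++) {
            // Placeholder for initializing one background bean.
            pool.execute(() -> workerNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // accept no new work, but finish what was queued
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Background initialization timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted during background initialization", e);
        }
        return workerNames;
    }

    public static void main(String[] args) {
        // Six background beans share two threads, leaving the other cores to the main startup.
        System.out.println(runInBackground(6, 2).size() + " worker thread(s) used");
    }
}
```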
Juergen Hoeller commented
For specifically demarcated beans, it's a much simpler problem for sure. You may still end up with deadlocks in case of dependency cycles etc but hey, why not let badly designed applications hang on startup right away ;-) Seriously, opting in for specific beans is certainly the way to go - in one way or the other.
Depending on the kind of bean, internal delegation to a separate thread can be quite beneficial, with dependency resolution and configuration validation happening first and then e.g. just the buildSessionFactory() call actually executing in a separate thread. This seems like a pretty sweet spot, as long as nobody tries to call the resulting proxy early.
Whereas for other kinds of beans, a more generic @Background marker could wrap the entire createBean step in a separate thread. The question is just what it would return to the immediate caller then, since the bean factory itself is not in the business of creating proxies. We had the same problem with @Lazy-triggered proxies for injection points, though, and solved it through an SPI call that the context package implements on a proxy basis; I guess we could do something similar here.
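
One way to picture what could be handed to the immediate caller is a proxy that returns at once and blocks on first use. The sketch below uses a JDK dynamic proxy purely as an illustration; it is not the SPI mentioned above, and `BackgroundProxySketch`, `backgroundProxy` and the `Greeter` interface are hypothetical names. It also assumes the bean is accessed through an interface.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: the caller gets a proxy immediately while the real
// object is still being created on another thread.
final class BackgroundProxySketch {

    interface Greeter {
        String greet();
    }

    @SuppressWarnings("unchecked")
    static <T> T backgroundProxy(Class<T> iface, ExecutorService executor, Callable<T> creator) {
        Future<T> target = executor.submit(creator); // creation starts right away
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> {
                    try {
                        // The first call blocks here until the real object is ready.
                        return method.invoke(target.get(), args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow what the real method threw
                    } catch (ExecutionException e) {
                        throw new IllegalStateException("Background creation failed", e.getCause());
                    }
                });
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Greeter greeter = backgroundProxy(Greeter.class, executor, () -> {
                Thread.sleep(200); // stands in for a slow createBean step
                return () -> "hello from a background-created bean";
            });
            System.out.println("proxy returned immediately");
            System.out.println(greeter.greet()); // waits for creation to finish
        } finally {
            executor.shutdown();
        }
    }
}
```

Calling the proxy early simply turns the parallelism back into waiting, and a dependency cycle between two such beans could still deadlock, as noted above.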
In any case, both options sound worth exploring. Let's keep this JIRA ticket open for an @Background-style model (probably rather 5.0) and #18305 for the specific Hibernate/JPA factory case (certainly 4.3).
Juergen
Rohit Gupta commented
This functionality is really awesome if it comes with the framework as in our case, our architect kept on cursing Spring for delayed startup.
I just need to highlight one thing about parallel loading of beans: please figure out what happens if someone tries to use a bean early. In our case, we have our own encryption mechanism, which gets initialized early with the ROOT application context through a component that, once initialized, initializes an AWS component that connects to S3, downloads properties, decrypts them and provides them to the application. They are resolved by PropertyPlaceholderConfigurer and later overridden at the instance level.
My fear is that parallel initialization could provide wrong properties to @Value-resolved variables. Also, if two beans are interdependent, the container should take care of that, for example when any two services depend on each other during lazy initialization.
The DAG approach won't work: if you ask people to generate the DAG, in most cases they won't be able to do that perfectly and will later curse the framework. If the framework does that out of the box, it will be great.
Vladislav Kaverin commented
This issue is one of the most-voted unresolved issues in the Spring Framework JIRA project and has been desired by various Spring users for 5 years already (counting from the ticket creation date). Doesn't it deserve a priority higher than Minor? ;)
Juergen Hoeller commented
Alright, bumping this one to "Major", keeping it in the 5.0 backlog.
Please note that we've been shipping 4.3 with #18305 included already. From our perspective, this addresses the most common case - expensive persistence provider bootstrapping - in a reasonably straightforward way.
Vladislav Kaverin commented
@juergen.hoeller, please add performance and startup labels to the ticket; it would help with tracking such issues.
Abhijit Sarkar commented
With 61 votes, and 6.5 years after it was opened, doesn't this ticket deserve more attention than it is getting (which is zero; the last comment was a year ago)?
Filip Panovski commented
Has there been any progress on this issue? Is this planned for any release's roadmap as of right now?
Too much expectation!
Hi, has this problem been solved? @spring-issuemaster thanks~ It takes 2 minutes for our database resources to load; the startup time is too long. It takes 5 ~ 6 minutes to start the entire application.
Is there any new progress? Thank you!
Hi, are there any updates...? It has been 9 years :( Spring is sooooo powerful and widely used, so IMHO this feature can make the lives of thousands (if not millions) of developers and ops people better!
please do this issue, it's really important!!
The parallel bean initialization issue has been open for 10 years!!! Hoping it can move out of the General Backlog and be released as quickly as possible!!! 👍 💯