SOLR-17220 - Make the SolrZkClient thread as a daemon thread
https://issues.apache.org/jira/browse/SOLR-17220
Description
I am submitting a SparkSQL job through spark-submit (Spark 3.3.1, Kyuubi 1.8.0), using the open-source Spark engine with the Kyuubi Authz module running on top of the Spark Driver in client mode. The Spark job succeeds, but the Spark Driver never stops: the logs show the PolicyRefresher still polling policies from Ranger even after the Spark Context has stopped, so the driver keeps running and I eventually have to kill the job manually. The root cause is that the SolrZkClient used for audit logging to Solr starts a non-daemon thread in the Spark Driver; when the Spark Context stops, that non-daemon thread prevents the driver JVM from exiting.
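For context, here is a minimal standalone sketch (not Solr code; all names are illustrative) of why a non-daemon thread keeps a JVM alive after `main()` returns:

```java
public class NonDaemonDemo {
    public static void main(String[] args) {
        Thread poller = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(30_000); // stands in for periodic policy polling
                } catch (InterruptedException e) {
                    return;
                }
            }
        }, "policy-refresher");
        // poller.setDaemon(true); // uncommenting this lets the JVM exit normally
        poller.start();
        System.out.println("main() has returned, but the JVM stays up for the non-daemon thread");
    }
}
```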
Solution
The fix is to make the SolrZkClient thread a daemon thread.
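The diff itself is not reproduced here; per the review comment below, the patch changes threads created via SolrNamedThreadFactory. In spirit it amounts to something like this standalone sketch (class and field names are illustrative, not the actual patch):

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hedged sketch of the idea, not the actual diff: a named thread factory
// that marks its threads as daemon so they cannot block JVM exit.
public class DaemonNamedThreadFactory implements ThreadFactory {
    private final AtomicInteger threadNumber = new AtomicInteger(1);
    private final String prefix;

    public DaemonNamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + threadNumber.getAndIncrement());
        t.setDaemon(true); // the essence of the proposed change
        return t;
    }
}
```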
Tests
Patched the Solr jar with this change and verified that Spark is able to exit gracefully.
Checklist
Please review the following and check all that apply:
- [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request title.
- [x] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
- [ ] I have developed this patch against the `main` branch.
- [x] I have run `./gradlew check`.
- [x] I have added tests for my changes.
- [ ] I have added documentation for the Reference Guide
@mkhludnev @epugh @HoustonPutman Can you please review this change?
This seems like a straightforward change, but honestly, Java and threads terrify me, so maybe someone else could chime in?
This changes ALL thread creations that use SolrNamedThreadFactory. I don't think that's the intention.
Furthermore, SolrZkClient has an isClosed flag set from the close() method. I didn't look in detail but I suspect we want to do an orderly shutdown and not just brutally kill the thread.
I'm afraid the change to solve your issue would have to be implemented differently.
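A minimal sketch of the orderly shutdown suggested above, assuming the caller owns the client. SolrZkClient implements Closeable, so try-with-resources works; the constructor arguments are illustrative and newer Solr versions use a builder instead:

```java
import org.apache.solr.common.cloud.SolrZkClient;

// Sketch: close the client rather than daemonizing its thread. Per the
// comment above, close() sets the isClosed flag so shutdown is orderly.
try (SolrZkClient zkClient = new SolrZkClient("zk1:2181,zk2:2181/solr", 30_000)) {
    // ... use the client ...
}
```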
+1 to murblanc -- `SolrZkClient.close()` must be called
Sure @murblanc @dsmiley, will check this.
@murblanc @dsmiley So you mean that when the Spark driver JVM ends, Spark should actually call SolrZkClient.close(), right? Should this be implemented using a shutdown hook?
Also, if I call plugin.getAuditProviderFactory().shutdown() in a shutdown hook registered to run as soon as the Spark driver ends, will it eventually call SolrZkClient.close()?
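A hedged sketch of the shutdown-hook idea from the question above, reusing the names it mentions (`plugin`, `getAuditProviderFactory()`, `shutdown()`); whether `shutdown()` eventually reaches `SolrZkClient.close()` is exactly the open question here:

```java
// Register a JVM shutdown hook so cleanup runs when the driver exits.
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    try {
        plugin.getAuditProviderFactory().shutdown();
    } catch (Exception e) {
        // best-effort cleanup; the JVM is already shutting down
    }
}, "ranger-audit-shutdown"));
```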
> Spark should actually call SolrZkClient.close(), right?
Yes.
We aren't Spark experts here, but we are Solr experts :-). If you don't know about lifecycle matters in Spark then maybe find a support resource for that (or ask ChatGPT).
This PR had no visible activity in the past 60 days, labeling it as stale. Any new activity will remove the stale label. To attract more reviewers, please tag someone or notify the [email protected] mailing list. Thank you for your contribution!
This PR is now closed due to 60 days of inactivity after being marked as stale. Re-opening this PR is still possible, in which case it will be marked as active again.