solr icon indicating copy to clipboard operation
solr copied to clipboard

SOLR-17220 - Make the SolrZkClient thread as a daemon thread

Open amanraj2520 opened this issue 1 year ago • 8 comments

https://issues.apache.org/jira/browse/SOLR-17220

Description

I am submitting a SparkSql job through spark-submit. Spark version 3.3.1 and Kyuubi version 1.8.0. I am using Open Source Spark Engine with Kyuubi Authz module running on top of the Spark Driver in client mode. The Spark job is successful, but the Spark Driver does not stop and keeps on running and I see the PolicyRefresher keeps on polling policies from Ranger. If you see the logs, PolicyRefresher is still running even after Spark Context has stopped. This is leading to the Spark Driver not ending and therefore after sometime I have to manually kill the job. The issue is that the SolrZkClient used for logging to Solr currently opens up a non-daemon thread on top of Spark Driver. So when Spark Context stops, SolrZkClient does not let Spark Driver to exit since currently it is of type non-daemon.

Solution

The fix for this is to make SolrZkClient as a daemon thread.

Tests

Patched the solr jar with this change and verified Spark is able to gracefully exit.

Checklist

Please review the following and check all that apply:

  • [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • [x] I have created a Jira issue and added the issue ID to my pull request title.
  • [x] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • [ ] I have developed this patch against the main branch.
  • [x] I have run ./gradlew check.
  • [x] I have added tests for my changes.
  • [ ] I have added documentation for the Reference Guide

amanraj2520 avatar Mar 28 '24 16:03 amanraj2520

@mkhludnev @epugh @HoustonPutman Can you please review this change.

amanraj2520 avatar Mar 28 '24 16:03 amanraj2520

This seems like a straightforward change, but honestly, Java and threads terrify me, so maybe someone else could chime in?

epugh avatar Mar 28 '24 19:03 epugh

This changes ALL thread creations that use SolrNamedThreadFactory. I don't think that's the intention.

Furthermore, SolrZkClient has an isClosed flag set from the close() method. I didn't look in detail but I suspect we want to do an orderly shutdown and not just brutally kill the thread.

I'm afraid the change to solve your issue would have to be implemented differently.

murblanc avatar Mar 29 '24 09:03 murblanc

+1 to murblanc -- SolrClient.close must be called

dsmiley avatar Mar 30 '24 18:03 dsmiley

Sure @murblanc @dsmiley Will check this

amanraj2520 avatar Apr 01 '24 02:04 amanraj2520

@murblanc @dsmiley So you mean that when the Spark driver JVM ends, Spark should actually call the SolrZkClient.close right? Should it be implemented using Shutdown Hooks?

Also, if I call plugin.getAuditProviderFactory().shutdown() in the shutdownHook registered (as soon as spark driver ends), will it call SolrZkClient.close eventually?

amanraj2520 avatar Apr 01 '24 04:04 amanraj2520

Spark should actually call the SolrZkClient.close right?

Yes.

We aren't Spark experts here, but we are Solr experts :-). If you don't know about lifecycle matters in Spark then maybe find a support resource for that (or ask ChatGPT).

dsmiley avatar Apr 01 '24 16:04 dsmiley

This PR had no visible activity in the past 60 days, labeling it as stale. Any new activity will remove the stale label. To attract more reviewers, please tag someone or notify the [email protected] mailing list. Thank you for your contribution!

github-actions[bot] avatar Jun 05 '24 00:06 github-actions[bot]

This PR is now closed due to 60 days of inactivity after being marked as stale. Re-opening this PR is still possible, in which case it will be marked as active again.

github-actions[bot] avatar Oct 07 '24 00:10 github-actions[bot]