SOLR-16046: Thread leak in TestLeaderElectionZkExpiry
https://issues.apache.org/jira/browse/SOLR-16046
- Move
Thread killerto an executor just to make sure it gets cleaned up - Ensure
SolrZkClient zc = new SolrZkClient(server.getZkAddress(), LeaderElectionTest.TIMEOUT))is closed with try w/ resources - reorganize some of the try/finally to ensure things are closed in the reverse order of when they were opened
LGTM. I was looking at SOLR-16046 as well and opened #842, which I think should complement this PR nicely.
One thing I found myself wondering, wrt the killer thread as it exists on main at the moment (and as it has existed since its inception): why is it a separate thread at all?
Thread t = new Thread(() -> doStuff());
t.start();
t.join();
... why not just do?:
doStuff();
It should execute in the same order, unless there's something "magical" about this being done from a different thread?
I'm glad you were looking at this. I had started some work on it locally on top of Houston's prior work (#690), that I just pushed up to https://github.com/madrob/solr/commits/zk-expiry-test so that we can compare notes. My tests were still failing though, IIRC with some new and exciting thread leaks and also some of the original ones that I couldn't reliably reproduce but would happen occasionally. So what you have is probably better. I'll pull your code down and try to run tests on it a few [hundred] times.
Hey @risdenk should we still try and get this in?
It should execute in the same order, unless there's something "magical" about this being done from a different thread?
@magibney I think this is because it is trying to expire the ZK session in the background?
should we still try and get this in?
@HoustonPutman we can. I don't think this makes things worse? At least based on the discussion in SOLR-16046 I'm pretty sure it won't fix everything.