Add more reliable resource leak detector for tests
See docstrings in aleph.resource-leak-detector for details.
This patch also includes the following changes:
- Rename the
:leak-level-paranoidLeiningen profile to:leak-detectionsince it now also sets everything up to use the new custom detector. - Introduce new test selector
:leakfor tests which intentionally leak (not run by default since they require the:leak-detectionprofile to be active, too). - Introduce intentionally leaking
aleph.http-test/test-leak-in-raw-stream-handleras an example and to document a case where users are responsible for releasing buffers. - Eliminate all
(netty/leak-detector-level! :paranoid)calls from tests. Use new leak detection API instead (see below). - Place
(aleph.resource-leak-detector/instrument-tests!)at the end of all relevant test namespaces to detect leaks (only effective when running with:leak-detectionprofile).
Ideally we'd activate it in the CI test job but we first need to fix the leak in aleph.http-test/test-idle-timeout for that (see #707).
Based on the research results from #615.
Hmm aleph.http-timeout-test/test-shutdown-timeout-1 fails consistently in CI for me while it works fine locally. Looks like there is a race condition there where the client has not yet established a connection before the server shuts down after the 50ms wait? I can't think of a way to make this more reliable, though... @pyr Since you are the author, do you have an idea?
FWIW I also pushed a branch with only a bogus change to trigger the job and this one also fails so it's not related to any change in this here patch.
@DerGuteMoritz Nothing jumps at me, works locally here as well
Do you want to reevaluate after merging #709 ?
Well, that fixed it but now the DNS failure tests started timing out, as well (also on master). See https://github.com/clj-commons/aleph/pull/710 for an improvement.
OK all green now :sweat_drops: Please give it a try!
I'm not sure it will fly long-term (the strategy around calling the GC - since it's just a hint. And manually running the finalizations).
@arnaudgeiser Now that you mention it: running finalizers should actually not be needed, I only threw runFinalization in there for good measure. Netty does use a handful of finalizers but the leak detector uses weak references and a reference queue instead. I think leaving it in there for now doesn't hurt for now but having to remove it shouldn't be a problem either - if it ever gets removed completely, Netty will have to adjust as well, anyway. And by then Netty 5 migh thave rolled around which AFAIK gets rid of reference counting anyway :shrug:
But it improves the current situation and it's just for testing purpose. Let's see what will be the future.
Right, improving the current testing situation is the main goal here :+1:
Thanks for the approach!
You're welcome and thanks for your input on the original ticket :pray:
https://openjdk.org/jeps/421
Ah, good to know, wasn't aware of this!