[BUG] Integration tests often fail with an `Address already in use` message
What is the bug? Our integration tests quite often fail with a message like this:
org.opensearch.security.PrivilegesEvaluationTest > resolveTestHidden FAILED
java.lang.RuntimeException: Could not start all nodes BindHttpException[Failed to bind to [::1]:9017]; nested: BindException[Address already in use];
at __randomizedtesting.SeedInfo.seed([2A8A334058CF3506:372891963547AE92]:0)
at org.opensearch.security.test.helper.cluster.ClusterHelper.startCluster(ClusterHelper.java:273)
at org.opensearch.security.test.helper.cluster.ClusterHelper.startCluster(ClusterHelper.java:124)
at org.opensearch.security.test.SingleClusterTest.setup(SingleClusterTest.java:107)
at org.opensearch.security.test.SingleClusterTest.setup(SingleClusterTest.java:83)
at org.opensearch.security.test.SingleClusterTest.setup(SingleClusterTest.java:69)
at org.opensearch.security.PrivilegesEvaluationTest.resolveTestHidden(PrivilegesEvaluationTest.java:30)
This is a bug. It needs further investigation, but I suspect the cause is that LocalCluster tests run on both the new and the old test framework, and the port allocation logic is the same for both frameworks.
[Triage] Hi @willyborankin, thanks for filing this issue. Looks like some of the tests are not cleaning up properly. Someone will need to take a look to help stop this from becoming a frequent issue.
I believe we could mitigate this by refactoring this code so that it can be used in all the test cases:
https://github.com/opensearch-project/security/blob/main/src/integrationTest/java/org/opensearch/test/framework/cluster/PortAllocator.java
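To illustrate the idea, here is a minimal sketch of a single JVM-wide allocator that every test could go through. It is not the code in the linked `PortAllocator.java`; the class name `SharedPortAllocator`, its methods, and the port range in the usage note are all hypothetical. It only shows the general technique: remember which ports have been handed out, and probe each candidate by briefly binding to it.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper for illustration; not the project's PortAllocator. */
final class SharedPortAllocator {

    // Ports handed out within this JVM and not yet released.
    private static final Set<Integer> RESERVED = ConcurrentHashMap.newKeySet();

    private SharedPortAllocator() {}

    /** Reserves the first port in [from, to] that is unreserved and currently bindable. */
    static int reserve(int from, int to) {
        for (int port = from; port <= to; port++) {
            // add() returns false if another caller in this JVM already holds the port.
            if (RESERVED.add(port)) {
                if (isBindable(port)) {
                    return port;
                }
                // Some other process owns the port, so give up the reservation.
                RESERVED.remove(port);
            }
        }
        throw new IllegalStateException("No free port in range " + from + "-" + to);
    }

    /** Returns a port to the pool once the test cluster has shut down. */
    static void release(int port) {
        RESERVED.remove(port);
    }

    // Probes the port by binding a loopback socket and closing it immediately.
    private static boolean isBindable(int port) {
        try (ServerSocket ignored = new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A test would call something like `SharedPortAllocator.reserve(20000, 20100)` before starting a node and `release(port)` in its teardown. The probe socket is closed before the node binds, so another process can still grab the port in between, and the check covers only the loopback address returned by `InetAddress.getLoopbackAddress()`. This reduces collisions between tests in the same JVM but does not rule them out entirely.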
I investigated some of the failures. So far, it looks like the old framework is the one that has these problems.