Broken or Flaky test: ZookeeperRestartIT

Open ctubbsii opened this issue 3 years ago • 5 comments

Test name(s)

  • org.apache.accumulo.test.functional.ZookeeperRestartIT.test

Describe the failure observed

[ERROR] org.apache.accumulo.test.functional.ZookeeperRestartIT.test  Time elapsed: 30.898 s  <<< ERROR!
java.lang.RuntimeException: Unable to read instance id from zookeeper.
	at org.apache.accumulo.miniclusterImpl.MiniAccumuloClusterImpl.verifyUp(MiniAccumuloClusterImpl.java:674)
	at org.apache.accumulo.miniclusterImpl.MiniAccumuloClusterImpl.start(MiniAccumuloClusterImpl.java:617)
	at org.apache.accumulo.test.functional.ZookeeperRestartIT.test(ZookeeperRestartIT.java:81)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/instances
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
	at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2589)
	at org.apache.accumulo.miniclusterImpl.MiniAccumuloClusterImpl.verifyUp(MiniAccumuloClusterImpl.java:665)
	... 16 more

Testing Environment:

  • Version of this project: 2.1.0-SNAPSHOT
  • First commit known to fail (or current commit): 4ce9ab25d061a7cb1f79b49746637b91c22c40ec
Executing Maven:  -B -f /var/lib/jenkins/workspace/2.1/pom.xml -V verify -Dfailsafe.rerunFailingTestsCount=5
Apache Maven 3.8.1 (05c21c65bdfed0f71a2f2ada8b84da59348c4c5d)
Maven home: /var/lib/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.8.1
Java version: 11.0.14, vendor: Red Hat, Inc., runtime: /usr/lib/jvm/java-11-openjdk-11.0.14.0.9-2.fc35.x86_64
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "5.16.8-200.fc35.x86_64", arch: "amd64", family: "unix"

What have you tried already?

Jenkins reran the test and it passed the second time.

Additional context

I don't think I've ever seen this test fail before. It could have been a fluke.
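
For illustration only: the `ConnectionLossException` in the trace above is a transient error of the kind a ZooKeeper client can see while the server is restarting, so one common way to tolerate it is a bounded retry around the read. The sketch below is not taken from the Accumulo code base; `RetrySketch`, `retry`, and all parameter names are hypothetical, and it deliberately has no ZooKeeper dependency so that it compiles on its own.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper, not part of Accumulo or ZooKeeper.
final class RetrySketch {

  private RetrySketch() {}

  /**
   * Calls op up to maxAttempts times, pausing pauseMillis between attempts.
   * Only exceptions accepted by isTransient are retried; any other exception
   * is rethrown immediately. If every attempt fails with a transient
   * exception, the last one is rethrown.
   */
  static <T> T retry(Callable<T> op, Predicate<Exception> isTransient,
      int maxAttempts, long pauseMillis) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return op.call();
      } catch (Exception e) {
        if (!isTransient.test(e)) {
          throw e; // not a transient failure: do not retry
        }
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(pauseMillis); // give the server time to come back
        }
      }
    }
    throw last; // all attempts failed with transient errors
  }
}
```

With a ZooKeeper client on the classpath, the predicate could be something like `e -> e instanceof KeeperException.ConnectionLossException`, and `op` would wrap the `getChildren` call on `/accumulo/instances`. Whether a retry belongs in `verifyUp` at all is a separate design question that this sketch does not answer.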

ctubbsii, Feb 16 '22 18:02

Just for reference, I have been periodically running this IT for the past week or so and I have yet to see it fail.

Manno15, Feb 26 '22 20:02

I'm thinking we might be able to close this as OBE.

dlmarion, Jun 30 '22 15:06

Haven't seen it in a while. Will close.

ctubbsii, Jul 25 '22 23:07

I am not sure whether it is this problem or a similar issue with getting the instance id from HDFS, but when running the full tests with GitHub Actions, the tests sometimes have trouble getting the instance id.

One stack trace I could find (run on Jun 10):

[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.464 s <<< FAILURE! - in org.apache.accumulo.test.functional.CombinerIT
[ERROR] org.apache.accumulo.test.functional.CombinerIT.aggregationTest  Time elapsed: 26.849 s  <<< ERROR!
java.lang.IllegalStateException: Unable to find instance id from zookeeper.
	at org.apache.accumulo.miniclusterImpl.MiniAccumuloClusterImpl.verifyUp(MiniAccumuloClusterImpl.java:670)
	at org.apache.accumulo.miniclusterImpl.MiniAccumuloClusterImpl.start(MiniAccumuloClusterImpl.java:614)
	at org.apache.accumulo.harness.AccumuloClusterHarness.setupCluster(AccumuloClusterHarness.java:167)

EdColeman, Jul 25 '22 23:07

Hmm, that does look related.

ctubbsii, Jul 26 '22 02:07

Closing, as I haven't seen this in a while.

ctubbsii, Nov 29 '23 17:11