PulsarIOTest.testReadFromSimpleTopic is very flaky

Open damccorm opened this issue 2 years ago • 13 comments

PulsarIOTest uses a shared client across all tests [1], which creates a race condition when multiple tests run in parallel. This manifests as frequent test failures.

[1] https://github.com/apache/beam/blob/master/sdks/java/io/pulsar/src/test/java/org/apache/beam/sdk/io/pulsar/PulsarIOTest.java#L62

Imported from Jira BEAM-14269. Original Jira may contain additional context. Reported by: SteveNiemitz.
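
To illustrate the shared-client problem described above, here is a minimal, self-contained sketch. It does not use the real PulsarIOTest fixture or the Pulsar client; `FakeClient`, `SharedClientSketch`, and `MESSAGES_PER_TEST` are hypothetical names made up for this example. It only shows how two tests running in parallel see each other's messages when they share one client, and how per-test clients keep them isolated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a mock client; NOT the fixture used in PulsarIOTest.
class FakeClient {
    private final List<String> received =
        Collections.synchronizedList(new ArrayList<>());

    void send(String message) {
        received.add(message);
    }

    int receivedCount() {
        return received.size();
    }
}

class SharedClientSketch {
    static final int MESSAGES_PER_TEST = 100;

    // One simulated test: it sends its own messages through the given client.
    static void runTest(FakeClient client, String testName) {
        for (int i = 0; i < MESSAGES_PER_TEST; i++) {
            client.send(testName + "-" + i);
        }
    }

    // Runs two simulated tests in parallel and returns how many messages
    // each test's client holds once both have finished.
    static int[] runTwoTests(boolean shareClient) {
        FakeClient shared = new FakeClient();
        FakeClient a = shareClient ? shared : new FakeClient();
        FakeClient b = shareClient ? shared : new FakeClient();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> first = pool.submit(() -> runTest(a, "testA"));
            Future<?> second = pool.submit(() -> runTest(b, "testB"));
            first.get();
            second.get();
            return new int[] {a.receivedCount(), b.receivedCount()};
        } catch (Exception e) {
            throw new IllegalStateException("simulated test run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] shared = runTwoTests(true);
        int[] isolated = runTwoTests(false);
        // Shared client: each test ends up with both tests' messages (200 each).
        System.out.println("shared:   " + shared[0] + ", " + shared[1]);
        // Per-test clients: each test holds only its own 100 messages.
        System.out.println("isolated: " + isolated[0] + ", " + isolated[1]);
    }
}
```

In the shared case, what a test observes partway through its run depends on thread timing, which is the kind of nondeterminism that shows up as a flaky test. Creating the client per test is one way to remove it; whether that is the right fix for PulsarIOTest is not established here.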

damccorm avatar Jun 05 '22 01:06 damccorm

Unable to assign user @MarcoRob. If able, self-assign, otherwise tag @damccorm so that he can assign you. Because of GitHub's spam prevention system, your activity is required to enable assignment in this repo.

damccorm avatar Jun 05 '22 01:06 damccorm

Hi @damccorm I am unable to self-assign, can you help me assign this issue to me? Thanks!

MarcoRob avatar Jun 08 '22 16:06 MarcoRob

Sure! As of https://github.com/apache/beam/pull/21719 you should be able to self assign with the command .take-issue though, so you should be able to do that going forward! Let me know if you run into any issues with that

damccorm avatar Jun 08 '22 16:06 damccorm

@MarcoRob are you working on this?

pabloem avatar Oct 18 '22 17:10 pabloem

@pabloem There was a draft PR, #17473, but it seems not to have had updates for a while.

Abacn avatar Oct 18 '22 17:10 Abacn

hm from #17473 it seems more complex, huh? we'll see what I can do...

pabloem avatar Oct 18 '22 17:10 pabloem

> hm from #17473 it seems more complex, huh? we'll see what I can do...

That seems right. It points to issues in the IO connector itself.

Abacn avatar Oct 18 '22 17:10 Abacn

> @MarcoRob are you working on this?

Hi @pabloem, yes, I am working on it, but right now I am working on the GitHub Actions (GA) migration.

I consulted the Pulsar community, and it seems the issue comes from seek(timestamp): it sometimes fails to retrieve the message with the correct timestamp, so they advised me to use seek(MessageId) instead. I am working on changing the function, but I have paused this until the GA migration is finished.

You can see the input I got on that issue in PR #17473.
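
As a rough illustration of why seeking by ID can be more precise than seeking by timestamp, here is a toy in-memory model. It is not the Pulsar client API, and the thread does not say why seek(timestamp) misses; the sketch simply assumes, for illustration, that publish timestamps are not unique. `ToyTopic`, `seekByTimestamp`, and `seekById` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory topic used only for illustration; NOT the Pulsar client API.
class ToyTopic {
    private final List<Long> publishTimes = new ArrayList<>();
    private final List<String> payloads = new ArrayList<>();

    // Appends a message and returns its position, used here as a
    // hypothetical message ID.
    int publish(long publishTimeMillis, String payload) {
        publishTimes.add(publishTimeMillis);
        payloads.add(payload);
        return payloads.size() - 1;
    }

    // Returns the first message published at or after the given timestamp,
    // or null if there is none. Messages sharing a timestamp are ambiguous.
    String seekByTimestamp(long timestampMillis) {
        for (int i = 0; i < publishTimes.size(); i++) {
            if (publishTimes.get(i) >= timestampMillis) {
                return payloads.get(i);
            }
        }
        return null;
    }

    // Returns exactly the message with the given ID.
    String seekById(int id) {
        return payloads.get(id);
    }
}

class SeekSketch {
    public static void main(String[] args) {
        ToyTopic topic = new ToyTopic();
        topic.publish(1000L, "first");
        int secondId = topic.publish(1000L, "second"); // same millisecond
        topic.publish(1001L, "third");

        // The timestamp cannot tell "first" and "second" apart.
        System.out.println(topic.seekByTimestamp(1000L)); // prints "first"
        // The ID identifies one message unambiguously.
        System.out.println(topic.seekById(secondId));     // prints "second"
    }
}
```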

MarcoRob avatar Oct 18 '22 18:10 MarcoRob

gotcha. thanks @MarcoRob !

pabloem avatar Oct 18 '22 18:10 pabloem

Here is another occurrence: https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/24746/

kennknowles avatar Nov 12 '22 00:11 kennknowles

Still happening: https://ci-beam.apache.org/job/beam_PreCommit_Java_Phrase/5782/consoleFull

14:56:31 > Task :sdks:java:io:pulsar:test
14:56:31 
14:56:31 org.apache.beam.sdk.io.pulsar.PulsarIOTest > testReadFromSimpleTopic FAILED
14:56:31     org.apache.beam.sdk.Pipeline$PipelineExecutionException at PulsarIOTest.java:178
14:56:31         Caused by: java.lang.IllegalArgumentException at Preconditions.java:440

apilloud avatar Nov 28 '22 23:11 apilloud

@MarcoRob are you no longer working on this?

kennknowles avatar Feb 13 '23 21:02 kennknowles

I'm going to unassign this for now, then. Feel free to comment or grab it again.

kennknowles avatar Apr 27 '23 17:04 kennknowles