
[Bug]: Go SDK Dataflow jobs fail on DataSampling disabled

Open lostluck opened this issue 2 years ago • 7 comments

What happened?

Dataflow is making DataSampling FnAPI requests even when DataSampling is disabled. But since the feature wasn't enabled, the Go SDK isn't initializing the datasampler, leading to a nil pointer panic.

2023-12-08 16:28:11.597 PST
panic({0x1264220?, 0x2786ff0?})
	runtime/panic.go:914 +0x21f
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*DataSampler).getAllSamples(0x0)
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec/datasampler.go:79 +0x4b
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*DataSampler).GetSamples(0x1312c00?, {0x0?, 0xc000234480?, 0xc00053b508?})
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec/datasampler.go:62 +0x1d
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness.(*control).handleInstruction(0xc000318000, {0x180beb0?, 0xc000091c80?}, 0xc000147ef0)
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/harness.go:668 +0x116f
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness.MainWithOptions.func4({0x180beb0?, 0xc000091c80?}, 0xc000091c80?)
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/harness.go:202 +0x74
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness.MainWithOptions({0x180beb0, 0xc000091c20}, {0x7fffd16304e4, 0xf}, {0x7fffd1630507, 0xf}, {{0xc000225e70, 0x1, 0x1}, {0xc000046010, ...}})
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/harness.go:222 +0x1022
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/init.hook()
	github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness/init/init.go:144 +0x50c

Easy enough fix, not caught sooner because we didn't run the Dataflow Go Postcommits.
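
For illustration only, and not the actual patch: the trace shows `getAllSamples` being called on a nil `*DataSampler` (receiver `0x0`). Go allows calling a pointer-receiver method on a nil pointer, so the panic only happens once the method touches a field. A minimal sketch of a nil guard, assuming a simplified, hypothetical `DataSampler` type (the field names and the `GetSamples` signature below are stand-ins, not the SDK's real ones):

```go
package main

import (
	"fmt"
	"sync"
)

// DataSampler is a hypothetical stand-in for the SDK's exec.DataSampler.
// Only the shape needed to illustrate the nil guard is modelled here.
type DataSampler struct {
	mu      sync.Mutex
	samples map[string][][]byte // keyed by PCollection ID (assumed layout)
}

// GetSamples returns samples for the requested PCollection IDs, or every
// stored sample when no IDs are given. The nil check lets a disabled
// (never initialized) sampler answer with an empty result instead of
// panicking on the first field access.
func (d *DataSampler) GetSamples(pids []string) map[string][][]byte {
	if d == nil {
		return map[string][][]byte{}
	}
	d.mu.Lock()
	defer d.mu.Unlock()

	out := make(map[string][][]byte)
	if len(pids) == 0 {
		for pid, s := range d.samples {
			out[pid] = s
		}
		return out
	}
	for _, pid := range pids {
		if s, ok := d.samples[pid]; ok {
			out[pid] = s
		}
	}
	return out
}

func main() {
	// Sampling disabled: the sampler pointer was never set.
	var disabled *DataSampler
	fmt.Println(len(disabled.GetSamples(nil)))

	// Sampling enabled: only known PCollection IDs are returned.
	enabled := &DataSampler{samples: map[string][][]byte{"pc1": {[]byte("a")}}}
	fmt.Println(len(enabled.GetSamples([]string{"pc1", "missing"})))
}
```

An equivalent guard could instead live at the call site in the harness, checking for a nil sampler before handling the sampling request; which place is better depends on whether other callers can also see a nil sampler.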

First: fix the issue and cherry-pick it into 2.53.0.
Second: while a failure like this is very unlikely, it would have been caught by a simple Dataflow Go Wordcount test run as a pre-commit. I'll add that.

Issue Priority

Priority: 1 (data loss / total loss of function)

Issue Components

  • [ ] Component: Python SDK
  • [ ] Component: Java SDK
  • [X] Component: Go SDK
  • [ ] Component: Typescript SDK
  • [ ] Component: IO connector
  • [ ] Component: Beam YAML
  • [ ] Component: Beam examples
  • [ ] Component: Beam playground
  • [ ] Component: Beam katas
  • [ ] Component: Website
  • [ ] Component: Spark Runner
  • [ ] Component: Flink Runner
  • [ ] Component: Samza Runner
  • [ ] Component: Twister2 Runner
  • [ ] Component: Hazelcast Jet Runner
  • [X] Component: Google Cloud Dataflow Runner

lostluck, Dec 13 '23 21:12

cc: @rohdesamuel @zechenj18

lostluck, Dec 13 '23 21:12

Thanks for opening this!

rohdesamuel, Dec 13 '23 21:12

Waiting for https://github.com/apache/beam/actions/runs/7201639105 to complete, and once that's verified, I'll make a cherry pick PR for it for the release.

lostluck, Dec 13 '23 22:12

Last outstanding thing here is the Dataflow Smoke test precommit.

lostluck, Dec 14 '23 18:12

This is non-blocking, but I've punted it over to the next release.

lostluck, Jan 17 '24 21:01

Can we drop this to P2?

damccorm, Feb 06 '24 15:02

Agreed.

lostluck, Feb 06 '24 18:02

It seems this issue was resolved? The workflow run linked in https://github.com/apache/beam/issues/29760#issuecomment-1854793020 was successful.

Abacn, Mar 06 '24 17:03