[Bug]: Flink fails with "cannot handle nodes with more than 64 outputs"
What happened?
I am using TensorFlow Transform with Flink as the Beam runner (Beam 2.48.0, Flink 1.16.2). After adding too many features, the Flink job can no longer be submitted; it fails with the following exception:
```
org.apache.flink.optimizer.CompilerException: Cannot currently handle nodes with more than 64 outputs
```
The exception is thrown from this line in the Flink optimizer: https://github.com/apache/flink/blob/fb4d4ef57439cd7fde4fb5a4972733d13e69a226/flink-optimizer/src/main/java/org/apache/flink/optimizer/dag/OptimizerNode.java#L356
I am not entirely sure whether it is Beam's or TensorFlow Transform's responsibility to fix this, but at a glance it looks like something that should be handled within Beam: the Flink runner could avoid giving any single node more than 64 outgoing connections and translate such fan-outs in some other way. A minimal sketch of the pipeline shape involved is shown below.
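For reference, here is a minimal sketch of the kind of pipeline shape that seems to trigger this. The transform names, branch count, and the final Flatten are illustrative assumptions, not taken from the actual TensorFlow Transform graph; the point is only that a single PCollection feeding more than 64 sibling transforms gives the producing node more than 64 outgoing connections.

```python
# Illustrative reproducer sketch: fan one PCollection out to >64 branches.
# All names here are hypothetical; only the graph shape matters.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Run against a Flink cluster, e.g. with:
#   --runner=FlinkRunner --flink_master=localhost:8081
options = PipelineOptions()

with beam.Pipeline(options=options) as p:
    source = p | "Create" >> beam.Create(range(10))

    # Each branch adds one outgoing connection to the node producing
    # `source`; with 65 branches the Flink batch optimizer should reject
    # the plan with the CompilerException above.
    branches = [
        source | f"Feature{i}" >> beam.Map(lambda x, i=i: (i, x))
        for i in range(65)
    ]

    merged = tuple(branches) | "Flatten" >> beam.Flatten()
```

This matches the report above in that the fan-out comes from the number of features rather than from anything the user branches explicitly.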
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components
- [ ] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Spark Runner
- [X] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner
I am also facing the same issue. Is there any fix for this?