Difficulty reprocessing failed records
What happened?
https://github.com/vaishnavipandey-vp/Apache-Beam-YAML/blob/c467c59f5f13a00514e90be1cfd505982577524b/Apache-Beam-YAML/Pipelines/exceptionHandlingMultiple.yaml#L19C1-L28C35
I am not able to process the failed records obtained from the previous transform, because they are generated in a JSON format. We need to extract the element values of the failed records from that JSON in order to process them further.
Issue Failure
Failure: Test is continually failing
Issue Priority
Priority: 1 (unhealthy code / failing or flaky postcommit so we cannot be sure the product is healthy)
Issue Components
- [ ] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [X] Component: Beam YAML
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Infrastructure
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner
I created https://github.com/apache/beam/pull/32874 to make the documentation a bit clearer.
I started a discussion at https://lists.apache.org/thread/jhgsp1rw7l1xr6knpjl9mhvnjwjf3zwf about how this could be made easier.
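This is now resolved with the StripErrorMetadata transform, which removes the error metadata from the failed records so that the original elements can be processed further. See also https://beam.apache.org/documentation/sdks/yaml-errors/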
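
For anyone landing here later, a minimal sketch of how this might look in a pipeline, following the pattern on the linked documentation page. The transform names, file paths, column names, and the `my_error_output` label are placeholders chosen for illustration and are not taken from the pipeline in this issue. Check the documentation for the exact behavior in your Beam version.

```yaml
pipeline:
  transforms:
    - type: ReadFromCsv
      name: ReadInputFile          # placeholder name
      config:
        path: /path/to/input*.csv  # placeholder path

    - type: MapToFields
      name: ComputeRatio           # placeholder name
      input: ReadInputFile
      config:
        language: python
        fields:
          col1: col1
          # May fail (e.g. division by zero); failing rows go to the error output.
          ratio: col2 / col3
        error_handling:
          output: my_error_output  # placeholder label for the error output

    # Drops the error metadata so only the original failed element remains.
    - type: StripErrorMetadata
      name: FailedRecordsWithoutMetadata
      input: ComputeRatio.my_error_output

    # The stripped records can now be fed to any further transform;
    # writing them out is just one option.
    - type: WriteToJson
      name: WriteFailedRecords
      input: FailedRecordsWithoutMetadata
      config:
        path: /path/to/failed_records.json  # placeholder path
```

The stripped output should have the same shape as the input of the failing transform, so it can be routed into a repair or retry step instead of being written to a file.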