
[Failing Test]: Utilization of 'Source' and 'Sink'

Open ujjwalrajanand opened this issue 1 year ago • 1 comments

What happened?

https://github.com/vaishnavipandey-vp/Apache-Beam-YAML/blob/c467c59f5f13a00514e90be1cfd505982577524b/Apache-Beam-YAML/Pipelines/transformUsingSourceSink.yaml#L1C1-L24C17

I am not able to use the 'source' and 'sink' transforms in the pipeline linked above.

Issue Failure

Failure: Test is continually failing

Issue Priority

Priority: 1 (unhealthy code / failing or flaky postcommit so we cannot be sure the product is healthy)

Issue Components

  • [ ] Component: Python SDK
  • [ ] Component: Java SDK
  • [ ] Component: Go SDK
  • [ ] Component: Typescript SDK
  • [ ] Component: IO connector
  • [X] Component: Beam YAML
  • [ ] Component: Beam examples
  • [ ] Component: Beam playground
  • [ ] Component: Beam katas
  • [ ] Component: Website
  • [ ] Component: Infrastructure
  • [ ] Component: Spark Runner
  • [ ] Component: Flink Runner
  • [ ] Component: Samza Runner
  • [ ] Component: Twister2 Runner
  • [ ] Component: Hazelcast Jet Runner
  • [ ] Component: Google Cloud Dataflow Runner

ujjwalrajanand avatar Oct 18 '24 18:10 ujjwalrajanand

Your source and sink parameters should each contain a single transform, not a list of them (even if this is a list of size 1). We could look into providing a better error here.
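
As a rough illustration of the difference, here is a minimal Python sketch of the two shapes as they would look once the YAML has been loaded into Python objects. The `is_single_transform` helper and the `sample.csv` path are hypothetical and only serve the example; they are not part of Beam.

```python
# YAML shape that is rejected (source is a one-element list):
#   source:
#     - type: ReadFromCsv
#       config:
#         path: sample.csv
#
# YAML shape that is expected (source is a single mapping):
#   source:
#     type: ReadFromCsv
#     config:
#       path: sample.csv


def is_single_transform(value):
    """Return True if value is a mapping (a JSON-schema 'object'), not a list."""
    return isinstance(value, dict)


# The same two shapes after loading, written as plain Python data.
bad_pipeline = {
    "source": [{"type": "ReadFromCsv", "config": {"path": "sample.csv"}}],
}
good_pipeline = {
    "source": {"type": "ReadFromCsv", "config": {"path": "sample.csv"}},
}

for name, pipeline in [("bad", bad_pipeline), ("good", good_pipeline)]:
    print(name, is_single_transform(pipeline["source"]))
```

The same reasoning applies to `sink`: drop the leading `-` so the value is one transform rather than a list containing one transform.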

robertwb avatar Oct 18 '24 21:10 robertwb

.take-issue

joseph6x avatar Mar 28 '25 05:03 joseph6x

Hi @robertwb

I would like to work on this issue. My proposal is the following:

  1. Improve the error message.
  • Modify the _closest_line function so it returns both the line number and the last key of the path.
  • Modify the exception message to this: exn.message = f"Error found on key '{key}' around line {line}. Cause : {exn.message}".

The resulting errors will look like this:

jsonschema.exceptions.ValidationError: Error found on key 'source' around line 2. Cause : [{'type': 'ReadFromCsv', 'name': 'ReadMyData', 'config': {'path': 'D:\\\\Programs\\\\Apache-Beam-YAML\\\\Datasets\\\\sample.csv', '__line__': 6, '__uuid__': 'a7ecf8a4-aa80-4e74-8b42-784ee4d9c88e'}, '__line__': 3, '__uuid__': 'd785ab49-9064-448c-8a94-7330b49e0bf1'}] is not of type 'object'
...
  2. Update the website section Source and sink transforms of the Beam YAML API page in order to explicitly state that both source and sink are intended to be objects in YAML.
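
The first part of the proposal could be sketched roughly as follows. This is a standalone illustration, not Beam's actual `_closest_line`: the names `closest_line_and_key` and `describe`, and the sample spec, are assumptions made for the example. Only the `__line__` marker and the message format are taken from the comment above.

```python
def closest_line_and_key(spec, path):
    """Walk path into spec; return (nearest known line, last string key).

    Hypothetical helper: assumes each loaded mapping may carry a
    '__line__' entry, as seen in the error output above.
    """
    line = spec.get("__line__", "unknown") if isinstance(spec, dict) else "unknown"
    key = None
    node = spec
    for step in path:
        node = node[step]
        if isinstance(step, str):
            key = step  # remember the most recent mapping key on the path
        if isinstance(node, dict) and "__line__" in node:
            line = node["__line__"]  # a deeper node knows its own line
    return line, key


def describe(spec, path, cause):
    """Build the proposed message from the failing path and the original cause."""
    line, key = closest_line_and_key(spec, path)
    return f"Error found on key '{key}' around line {line}. Cause : {cause}"


# Made-up spec mirroring the failing shape: 'source' holds a list.
spec = {
    "__line__": 1,
    "pipeline": {
        "__line__": 2,
        "source": [{"type": "ReadFromCsv", "__line__": 3}],
    },
}

print(describe(spec, ["pipeline", "source"], "[...] is not of type 'object'"))
```

Because a list carries no `__line__` of its own, the reported line falls back to the enclosing mapping, which is why the message says "around line" rather than giving an exact position.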

joseph6x avatar Mar 28 '25 05:03 joseph6x

This issue has been marked as stale due to 150 days of inactivity. It will be closed in 30 days if no further activity occurs. If you think that’s incorrect or this issue still needs to be addressed, please simply write any comment. If closed, you can reopen the issue at any time. Thank you for your contributions.

github-actions[bot] avatar Aug 25 '25 12:08 github-actions[bot]

This issue has been closed due to lack of activity. If you think that is incorrect, you can reopen the issue at any time.

github-actions[bot] avatar Sep 25 '25 12:09 github-actions[bot]