[Bug]: Add check for self-referencing input in YAML transform

Open Polber opened this issue 5 months ago • 0 comments

What happened?

The following pipeline fails:

pipeline:
  transforms:
    - type: Create
      name: Source
      config:
        elements:
          - id: 1
      input: Source
    - type: LogForTesting
      input: Source

with the following error:

  ...
  File "/Users/jkinard/beam/sdks/python/apache_beam/yaml/yaml_transform.py", line 163, in strip_metadata
    if isinstance(spec, Mapping):
       ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jkinard/.pyenv/versions/3.11.6/lib/python3.11/typing.py", line 1305, in __instancecheck__
    return self.__subclasscheck__(type(obj))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jkinard/.pyenv/versions/3.11.6/lib/python3.11/typing.py", line 1583, in __subclasscheck__
    return issubclass(cls, self.__origin__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded in __subclasscheck__

The failure is caused by the self-referencing transform (Source lists itself as its own input). This should be caught earlier and reported with a clearer error than a RecursionError.

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • [ ] Component: Python SDK
  • [ ] Component: Java SDK
  • [ ] Component: Go SDK
  • [ ] Component: Typescript SDK
  • [ ] Component: IO connector
  • [X] Component: Beam YAML
  • [ ] Component: Beam examples
  • [ ] Component: Beam playground
  • [ ] Component: Beam katas
  • [ ] Component: Website
  • [ ] Component: Infrastructure
  • [ ] Component: Spark Runner
  • [ ] Component: Flink Runner
  • [ ] Component: Samza Runner
  • [ ] Component: Twister2 Runner
  • [ ] Component: Hazelcast Jet Runner
  • [ ] Component: Google Cloud Dataflow Runner

Polber · Aug 27 '24 19:08