
[FLINK-37365][pyflink] Add API stability decorators for PyFlink APIs

Open autophagy opened this issue 9 months ago • 4 comments

What is the purpose of the change

Currently, the Java APIs use annotations like @PublicEvolving and @Experimental to clearly indicate the stability status of API components. This is super helpful for understanding which parts of the API are stable for production use and which might change in future releases.

The Python APIs lack this kind of explicit stability indication. This PR adds a set of decorators, corresponding to the annotations used in the Java APIs, that allow us to decorate classes/functions depending on their stability. These decorators also modify the underlying docstrings of the classes/functions that they decorate, so that the API stability is also reflected in the Sphinx documentation. Additionally, classes/functions that are deprecated output a warning at runtime on their invocation.

The decorators work at both the class and the function level. At the function level, they enrich the function's docstring with an rst block that is rendered as a directive in the Sphinx documentation for that function. At the class level, they enrich the docstring of the class and also the docstrings of the class's public functions, so that, for example, the function documentation for a class annotated with PublicEvolving() also contains a directive block informing the user that the enclosing class is marked as public evolving. The following decorators are provided, along with their resulting documentation changes:
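To make the mechanism concrete, here is a minimal sketch of how such a docstring-enriching decorator could be written. The names (`public_evolving`, `Example`) are illustrative only, not the actual PyFlink API:

```python
import inspect

def public_evolving():
    """Hypothetical sketch: append an rst note to the docstring of the
    decorated object; for classes, propagate the note to public methods."""
    note = ("\n\n.. note:: This API is marked as public evolving "
            "and may change in future releases.\n")

    def decorator(obj):
        obj.__doc__ = (obj.__doc__ or "") + note
        if inspect.isclass(obj):
            # Walk the class dict and enrich every public callable as well.
            for name, member in vars(obj).items():
                if callable(member) and not name.startswith("_"):
                    member.__doc__ = (member.__doc__ or "") + note
        return obj

    return decorator


@public_evolving()
class Example:
    """An example class."""

    def do_work(self):
        """Do some work."""
```

After decoration, both `Example.__doc__` and `Example.do_work.__doc__` contain the note, so Sphinx renders the directive on the class page and on each method page.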

Decorators

Deprecated

Deprecated is distinct from the other decorators in that it takes arguments: a mandatory since argument and an optional detail argument. These are rendered as a Sphinx deprecated directive, which is why since is mandatory.
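A rough sketch of what such a Deprecated decorator could look like (illustrative only, not the actual implementation): it appends a Sphinx ``deprecated`` directive to the docstring and emits a DeprecationWarning when the wrapped function is invoked at runtime.

```python
import functools
import warnings

def deprecated(since, detail=None):
    """Hypothetical sketch of a Deprecated decorator factory."""
    def decorator(func):
        # Append the Sphinx directive to the docstring first, so that
        # functools.wraps carries the enriched docstring onto the wrapper.
        directive = f"\n\n.. deprecated:: {since}"
        if detail:
            directive += f"\n   {detail}"
        func.__doc__ = (func.__doc__ or "") + directive

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Warn on every invocation of the deprecated callable.
            warnings.warn(
                f"{func.__name__} is deprecated since {since}.",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated(since="1.2.3", detail="Use new_api instead.")
def old_api():
    """Old API."""
    return 42
```

Calling `old_api()` still returns its result but raises a DeprecationWarning, and `old_api.__doc__` now ends with the ``.. deprecated:: 1.2.3`` directive.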

Without detail:


@Deprecated(since="1.2.3")
class MyClass:
    pass

Output in the Sphinx documentation: (screenshot of the rendered deprecated directive)

With detail:


@Deprecated(since="1.2.3", detail="Lorem ipsum dolor sit amet.")
class MyClass:
    pass

Output in the Sphinx documentation: (screenshot of the rendered deprecated directive with detail text)

Experimental


@Experimental()
class MyClass:
    pass

Output in the Sphinx documentation: (screenshot of the rendered experimental notice)

Internal


@Internal()
class MyClass:
    pass

Output in the Sphinx documentation: (screenshot of the rendered internal notice)

PublicEvolving


@PublicEvolving()
class MyClass:
    pass

Output in the Sphinx documentation: (screenshot of the rendered public-evolving notice)

Combining Decorators

The decorators can also be combined: multiple decorators can be applied to a class or function, and the documentation for that class/function will contain the directives of all of them. Class-level and function-level decorators also combine, so that the decoration on a class is propagated down to its functions and merged with the decorators on each function. For example, the following code:


@PublicEvolving()
class ResolvedSchema(object):
    """
    Schema of a table or view consisting of columns, constraints, and watermark specifications.

    This class is the result of resolving a :class:`~pyflink.table.Schema` into a final validated
    representation.

    - Data types and functions have been expanded to fully qualified identifiers.
    - Time attributes are represented in the column's data type.
    - :class:`pyflink.table.Expression` have been translated to
      :class:`pyflink.table.ResolvedExpression`

    This class should not be passed into a connector. It is therefore also not serializable.
    Instead, the :func:`to_physical_row_data_type` can be passed around where necessary.
    """

    @Experimental()
    @Deprecated(since="1.2.3", detail="Use :func:`better_get_columns`")
    def get_columns(self) -> List[Column]:
        pass

will produce the following in the documentation for ResolvedSchema.get_columns: (screenshot of the combined directives rendered by Sphinx)
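The stacking behaviour can be illustrated with a small self-contained sketch (helper names are hypothetical): each decorator appends its own directive to the docstring, and because decorators apply bottom-up, the innermost decorator's directive is appended first.

```python
def _append(note):
    """Return a decorator that appends `note` to the docstring."""
    def decorator(func):
        func.__doc__ = (func.__doc__ or "") + note
        return func
    return decorator

def experimental():
    # Illustrative stand-in for the Experimental decorator.
    return _append("\n\n.. note:: This API is experimental.")

def deprecated(since, detail):
    # Illustrative stand-in for the Deprecated decorator.
    return _append(f"\n\n.. deprecated:: {since}\n   {detail}")


@experimental()
@deprecated(since="1.2.3", detail="Use better_get_columns.")
def get_columns():
    """Return the columns."""
```

`get_columns.__doc__` now contains both the ``deprecated`` directive and the experimental note, which is what Sphinx renders in the combined output.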

Brief change log

  • Added PublicEvolving/Deprecated/Internal/Experimental decorators for PyFlink APIs.
  • Aligned the Python Table API objects' annotations with those on the Java side.

Verifying this change

This change added tests and can be verified as follows:

  • Building the documentation and observing the output on class/function autosummary docs.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)

autophagy · Mar 04 '25 17:03

CI report:

  • cd0c845d7d735bd70f631e1e6147e4bd93c57c64 Azure: SUCCESS

Bot commands: the @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build

flinkbot · Mar 04 '25 18:03

I only added the annotations to the Table side because I'm a little more familiar with it than the DataStream side, and I thought it would be good to gather feedback first. I can add the DataStream side as part of this PR or as a follow-up.

autophagy · Mar 05 '25 08:03

Yeah, that was a question I had and wasn't sure how to answer: since everything in Python is more or less public, I wasn't sure what the utility of a Public annotation would be for the Python interfaces. Similarly, should it output a documentation block, or do we want to assume it's a no-op annotation?

autophagy · May 07 '25 08:05

I have only one comment; otherwise LGTM :+1:

snuyanzin · May 12 '25 08:05