airflow icon indicating copy to clipboard operation
airflow copied to clipboard

openlineage: migrate OpenLineage provider to V2 facets.

Open JDarDagran opened this issue 1 year ago • 19 comments

In https://github.com/OpenLineage/OpenLineage/pull/2520 released in openlineage-python==1.13.1 there were introduced V2 classes for facets and events.

JDarDagran avatar May 09 '24 18:05 JDarDagran

I ran breeze testing tests --upgrade-boto tests/providers/docker in main branch with latest CI image - getting same error results. Celery integration test seems flaky, no?

JDarDagran avatar May 21 '24 21:05 JDarDagran

I ran breeze testing tests --upgrade-boto tests/providers/docker in main branch with latest CI image - getting same error results.

https://github.com/apache/airflow/pull/39747 should fix the docker test in "Latest Botocore".

potiuk avatar May 22 '24 02:05 potiuk

@JDarDagran we probably need to bump cross-provider version dependencies. If someone has older OL provider version, and bumps some of the other provider versions, the openlineage methods will fail.

mobuchowski avatar May 22 '24 14:05 mobuchowski

As discussed with @mobuchowski -> we need to add back-compatibility try/import to past versions of the openlineage provider. We could even potentially add back-compatibility test suite for older version of of openlineage client < 1.13 to test it (symiliar to what we have in Pydantic/Botocore cases.

potiuk avatar May 22 '24 17:05 potiuk

I've added imports with weird-looking structure at first glance:

if TYPE_CHECKING:
    # v2 imports
else:
   try:
       # v2 imports
   except ImportError:
       # v1 imports

This satisfies mypy and works with previous OL provider versions. I'll be adding backwards-compatibility cross-provider checks in separate PR soon.

JDarDagran avatar Jun 03 '24 16:06 JDarDagran

Should we re-start discussion about common provider for that one? I have an idea..

potiuk avatar Jun 08 '24 19:06 potiuk

Discussion at devlist https://lists.apache.org/thread/hjmrv1wc2rxhkvozpvccs4mhgwg5sw2f

potiuk avatar Jun 08 '24 20:06 potiuk

My proposal is to wait for LAZY CONSENSUS on compat provider and implement it (See https://lists.apache.org/thread/hjmrv1wc2rxhkvozpvccs4mhgwg5sw2f)

potiuk avatar Jun 13 '24 14:06 potiuk

yep, waiting for better times :)

JDarDagran avatar Jun 13 '24 19:06 JDarDagran

Consensus thread started ...

potiuk avatar Jun 17 '24 09:06 potiuk

PR to add common.compat here https://github.com/apache/airflow/pull/40374 @kacpermuda @JDarDagran

potiuk avatar Jun 21 '24 15:06 potiuk

Hm, openlineage-python client minimal version is currently 1.16.0: https://github.com/apache/airflow/blob/ed57711866527d4e56ce195944b26607317b1f26/airflow/providers/openlineage/provider.yaml#L49

Why need adding these try: ... except: for handling versions <1.13.1?

dolfinus avatar Jun 21 '24 19:06 dolfinus

Hm, openlineage-python client minimal version is currently 1.16.0:

https://github.com/apache/airflow/blob/ed57711866527d4e56ce195944b26607317b1f26/airflow/providers/openlineage/provider.yaml#L49

Why need adding these try: ... except: for handling versions <1.13.1?

Because providers (for example AWS) might be installed on older airflow versions AND with older openlineage provider version.

potiuk avatar Jun 21 '24 19:06 potiuk

You have to remember that what you see now in main is the LATEST set of all providers and latest airflow. This is why we have compatibility and lowest-direct dependencies tests to test current latest code assuming that older versions of airlfow and older versions of providers might be installed.

potiuk avatar Jun 21 '24 19:06 potiuk

to test current latest code assuming that older versions of airlfow and older versions of providers might be installed

How is this related to adding a bunch of try..except for missing imports? I don't get it.

dolfinus avatar Jun 21 '24 19:06 dolfinus

pip install apache-airflow==2.7.0 apache-airflow-providers-openlineage==1.5.0 apache-airflow-providers-amazon==NEXT_VERSION

And then try without try ... except

potiuk avatar Jun 21 '24 19:06 potiuk

Ok, got it, thanks for clarifying!

dolfinus avatar Jun 21 '24 19:06 dolfinus

And https://github.com/apache/airflow/pull/40374 merged. You can rebase and move the code there (and add dependency on apache-airflow-providers-common-compat

potiuk avatar Jun 22 '24 08:06 potiuk

@JDarDagran can you rebase?

eladkal avatar Jun 30 '24 05:06 eladkal

@eladkal I'll be working on moving part of the code to common.compat provider

JDarDagran avatar Jul 04 '24 09:07 JDarDagran

@eladkal I'll be working on moving part of the code to common.compat provider

Cool :)

potiuk avatar Jul 04 '24 09:07 potiuk

almost :)

potiuk avatar Jul 18 '24 11:07 potiuk

I think I need to drop tests checking what is imported with different availability of OL client. mocking imports breaks other tests when running in parallel.

JDarDagran avatar Jul 18 '24 12:07 JDarDagran

I think I need to drop tests checking what is imported with different availability of OL client. mocking imports breaks other tests when running in parallel.

The OTEL "lowest dependencies" test is - quite surprisingly caused by packaging version - I have a PR for that https://github.com/apache/airflow/pull/40865 - but it might have some side-effects so I am waiting for it to complete.

potiuk avatar Jul 18 '24 14:07 potiuk

I found it by bisecting downgraded packages BTW.

potiuk avatar Jul 18 '24 14:07 potiuk

I think I need to drop tests checking what is imported with different availability of OL client. mocking imports breaks other tests when running in parallel.

The OTEL "lowest dependencies" test is - quite surprisingly caused by packaging version - I have a PR for that #40865 - but it might have some side-effects so I am waiting for it to complete.

That's great to have it fixed. Still, I meant different way of tests breaking caused by mocking builtin import - which I removed eventually. Now you can see it's almost green.

JDarDagran avatar Jul 18 '24 14:07 JDarDagran

Let's see -rebased to take latest deps into account :crossed_fingers:

potiuk avatar Jul 18 '24 14:07 potiuk

Last one seems flaky?

JDarDagran avatar Jul 18 '24 16:07 JDarDagran

@potiuk looks like I'm on green path finally consistently 🟢 Would love to get this merged soon as new features arrive and I still need to rebase and apply new imports :)

JDarDagran avatar Jul 23 '24 12:07 JDarDagran