airflow-provider-datarobot
Elyra Pipelines Integration failing with wheel file
I am using Airflow 2.4.1 and, in addition to the basic Airflow operators
https://airflow.apache.org/docs/apache-airflow/2.4.1/_api/airflow/operators/index.html#
I would also like to use this provider with all its components in Elyra.
In Elyra Pipelines, there is the possibility to add operators via a concept named "Airflow Provider Package Catalog Connector".
https://medium.com/ibm-data-ai/getting-started-with-apache-airflow-operators-in-elyra-aae882f80c4a
However, when I add the wheel file download URL to the Elyra config, I get the following error in the JupyterLab Elyra container:
[E 2023-01-06 16:41:25.198 ElyraApp] Error. Airflow provider package connector 'DataRobot Operator Components for Airflow' is not configured properly. The archive '/tmp/tmpij9e8m_j/airflow_provider_datarobot-0.0.3-py3-none-any.whl' contains 0 file(s) named 'get_provider_info.py'.
[I 2023-01-06 16:41:27.003 SingleUserLabApp log:189] 200 GET /user/kube%3Aadmin/api/kernels?1673023286993 (kube:[email protected]) 1.77ms
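For anyone who wants to reproduce the check from the error message locally: a wheel is a zip archive, so its members can be listed with the standard library. This is a minimal sketch, not Elyra's actual connector code; the function name `find_provider_info` is hypothetical, and the only thing taken from the log above is the file name `get_provider_info.py` that the connector reports looking for.

```python
import zipfile
from pathlib import PurePosixPath


def find_provider_info(wheel_path):
    """Return the archive members whose file name is get_provider_info.py.

    Hypothetical helper for illustration; it only mirrors the condition
    described in the Elyra error message, not Elyra's implementation.
    """
    with zipfile.ZipFile(wheel_path) as archive:
        return [
            name
            for name in archive.namelist()
            # Zip member names always use forward slashes.
            if PurePosixPath(name).name == "get_provider_info.py"
        ]
```

Calling `find_provider_info` on a downloaded copy of the wheel returns an empty list when the archive has no such file, which corresponds to the "contains 0 file(s)" part of the message.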
@ptitzler @kiersten-stokes
Can you maybe provide the developer with a hint as to what would need to be changed for integration to work?
It would be great to have all those operators available via Elyra Pipelines.
https://pypi.org/project/airflow-provider-datarobot/#description
Then again, it is possible that this does not yet work with providers made for Airflow 2.x, such as the DataRobot provider. For example, the Apache Airflow Providers Amazon wheel file, which does contain get_provider_info.py in its directories, throws a different error in the Elyra container:
[E 2023-01-06 17:10:10.314 SingleUserLabApp component_parser_airflow:56] Content associated with identifier '{'provider_package': 'apache_airflow_providers_amazon-7.0.0-py3-none-any.whl', 'provider': 'apache_airflow_providers_amazon', 'file': 'airflow/providers/amazon/aws/operators/ecs.py'}' could not be parsed: 'Attribute' object has no attribute 'id'. Skipping...
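As background on the message "'Attribute' object has no attribute 'id'": in Python's `ast` module, a plain name such as `BaseOperator` is an `ast.Name` node with an `.id` field, while a dotted reference such as `models.BaseOperator` is an `ast.Attribute` node, which has `.value` and `.attr` but no `.id`. A parser that reads `.id` without checking the node type fails in exactly this way. Whether that is what happens in Elyra's component parser for ecs.py is an assumption on my part; the sketch below only demonstrates the general pattern, and `SOURCE`, `base_name`, and `class_bases` are hypothetical names.

```python
import ast

# Example source only; ast.parse does not import anything, so Airflow
# does not need to be installed to run this.
SOURCE = '''
from airflow import models
from airflow.models import BaseOperator

class DirectBase(BaseOperator):
    pass

class DottedBase(models.BaseOperator):
    pass
'''


def base_name(node):
    """Return a dotted name for a base-class expression node."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        # Attribute nodes have no .id; recurse into .value instead.
        return f"{base_name(node.value)}.{node.attr}"
    # Fallback for other expressions (requires Python 3.9+).
    return ast.unparse(node)


def class_bases(source):
    """Map each class name in the source to the names of its base classes."""
    tree = ast.parse(source)
    return {
        node.name: [base_name(base) for base in node.bases]
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef)
    }
```

Here `class_bases(SOURCE)` handles both the direct and the dotted base class, whereas reading `base.id` directly would raise the AttributeError for `DottedBase`.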
Hi @shalberd,
There have been many Apache Airflow changes since Elyra introduced the provider package connector in version 3.6. The changes to provider package implementations are incompatible and our connector is unable to locate the operator classes in the archive. Any provider created for Airflow releases > 2.2 will therefore likely not work.
@ptitzler are there any plans to change that situation in Elyra, or, in other words, how important within IBM is Airflow still at this point?
@shalberd I am no longer involved with the Elyra project and therefore don't have any insights into future plans. Please reach out to the remaining maintainers via the project's public channels.
Hi @shalberd - I work at DataRobot and we're taking a renewed interest in this repository/package and have actually just released a new version.
We plan on having a relatively frequent cadence of releasing changes in the near future.
Could you try again with the latest version, currently 0.1.0 (released Feb. 17th, 2025), and let us know if we can be of any assistance with your reported issue?
@shalberd I'm going to close this issue but if you need or want more assistance please let us know and/or create a new issue.
Thanks!
@c-h-russell-walker Thank you for letting me know about 0.1.0. I will have to make changes to the parsing in the Elyra JupyterLab plugin anyway, so that is good to know. It will be a while until I can give feedback, though. Cool to know you keep developing that provider. I am sure the Airflow community appreciates that a lot, beyond use from within Elyra, whether on Airflow 2.8.x, Airflow 2.10.x, or Airflow in general. As mentioned, I will keep in touch and reach out this year. The aim is validated Airflow 2.x support. And then, later this semester, I guess Airflow 3 is on the horizon, too.
Amazing!
And yes, not only do we have 0.2.0 as of March 4th, but we also now have an early access version that releases the latest code on Tuesdays, if you ever want to check that out.
https://pypi.org/project/airflow-provider-datarobot-early-access/
@c-h-russell-walker What makes my environment special from an architectural perspective: we use Airflow 2.x, Open Data Hub Kubeflow notebooks, and DataRobot (I believe 9). So yes, as mentioned, very neat to see Airflow support continuing in the form of a custom provider.