airflow icon indicating copy to clipboard operation
airflow copied to clipboard

apache-airflow-providers-amazon added xmlsec as new dependency and pinned to a version that doesn't have wheels for new python versions

Open phofl opened this issue 1 year ago • 5 comments
trafficstars

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

apache-airflow-providers-amazon=8.21.0

Apache Airflow version

2.8.3

Operating System

Mac/Ubuntu

Deployment

Other

Deployment details

No response

What happened

The 8.21.0 release added xmlsec as a required dependency to apache-airflow-providers-amazonn and pinned to a version that is 2 years old and doesn't have wheels for

  • anything but windows
  • python 3.11 and higher

and building the wheel fails.

What you think should happen instead

I found https://github.com/apache/airflow/issues/39103 and it looks like the cause of this was an incompatibility. I can't see why adding it as a dependency is necessary instead of just constraining allowed versions

How to reproduce

mamba create -n airflow_test python=3.11
mamba activate airflow_test
pip install apache-airflow-providers-amazon

Anything else

Every time

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

phofl avatar May 06 '24 13:05 phofl

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

boring-cyborg[bot] avatar May 06 '24 13:05 boring-cyborg[bot]

The 8.21.0 release added xmlsec as a required dependency to apache-airflow-providers-amazonn and pinned to a version that is 2 years old and doesn't have wheels for

It is restricted to the version which didn't breaks current installation. As you might find there is open issue for resolve this, if you know how to fix it, feel free to raise a PR.

BTW, many of the packages do not have a wheel and required to built from the sources and appropriate dev packages.

cc: @vincbeck Do you think we need this is as main dependency for Amazon provider, and could we move it in some extra?

Taragolis avatar May 06 '24 15:05 Taragolis

Preferably we should solve the root issue https://github.com/apache/airflow/issues/39103

eladkal avatar May 06 '24 16:05 eladkal

Fixing the root cause sounds good, but currently it is not possible to install the amazon provider with any configuration that has Python >=3.11 or a non Windows machine on all Python Versions.

phofl avatar May 06 '24 19:05 phofl

Just had a look into this and I agree with Taragolis's sentiment here:

BTW, many of the packages do not have a wheel and required to built from the sources and appropriate dev packages.

When I did a test install (same as your steps, create a new virtual env based on 3.11 and install the provider package) I had several packages that needed to be built from source, xmlsec among them:

Building wheels for collected packages: xmlsec, methodtools, python-nvd3, unicodecsv, wirerope

the xmlsec build did fail for me, but only because I was missing required dev packages to build the source. After installing libxmlsec1, libxmlsec1-dev and python3.11-dev my installation worked just fine and was able to build a wheel locally for xmlsec.

Perhaps you are missing some development packages as well on your system which are inhibiting the wheel build from completing?

o-nikolas avatar May 06 '24 22:05 o-nikolas

I think @o-nikolas it's not as straightforward. I described it in https://github.com/apache/airflow/issues/39103 and the problem is that xmlsec python bindings v 1.3.14 expects libxmlsec2 to be installed and used and not libxmlsec1. This might or might not be a problem for security at any point in time and it should be investigated if the dependency can be upgraded (or as @eladkal suggests - turned into an optional dependency and handle lack of xmlsec with optional amazon provider dependency + helpful instructions on what to do). Especially that - as I understand - it's only needed for maybe one integration in Amazon provider.

I am happy to guide anyone who would like to take on that task.

While yes - you can still build the wheel with installing the packages you mentioned, it's a blocker or unnecessary hurdle for people who would like to install amazon provider on anything else than Airflow PROD image - so while we currently have no problems with releasing airflow (because we limitted xmlsec python bindings), it simply makes it more difficult for those who install amazon provider on their own. Mostly up to the Amazon team to decide how to approach it.

potiuk avatar May 07 '24 08:05 potiuk

I think we just put the pin in the wrong spot to be honest.

We should move the xmlsec pin to that extra instead I believe. PR #39528

jedcunningham avatar May 09 '24 16:05 jedcunningham

Nope - it's been added specifically on request and by @vincbeck - https://github.com/apache/airflow/pull/35488 to support authentication for AWS

potiuk avatar May 09 '24 17:05 potiuk

And yes - it's likely a bit complex but It means that libxmlsec1 (system library) that is used and apparently needed by Amazon provider is clashing with llbxml python bindings that are used by python3-saml - because they need libxmlsec2.

So technically speaking python3-saml should not have this limit (it will work with libxmlsec2 if installed) - but amazon needs libxml1 and it clashes with xmlsec python bindings >= 1.3.14 that need libxmlsec2.

potiuk avatar May 09 '24 17:05 potiuk

The whole problem will be solved IF amazon internal authentication (I guess) can use libxmlsec2 . But I have no idea if this is possible - if so - then follow-up to https://github.com/apache/airflow/pull/35488 should be done when we bump libxmlsec1 to 2 in our image and remove the limitation from amazon provlder.

potiuk avatar May 09 '24 17:05 potiuk

Yep, that all makes sense. I'm not suggesting #39528 is the long term fix. This just brings us back to status-quo before the new xmlsec release broke stuff (xmlsec is optional, and tied with the python3-saml extra that uses it).

jedcunningham avatar May 09 '24 17:05 jedcunningham

While yes - you can still build the wheel with installing the packages you mentioned, it's a blocker or unnecessary hurdle for people who would like to install amazon provider on anything else than Airflow PROD image - so while we currently have no problems with releasing airflow (because we limitted xmlsec python bindings), it simply makes it more difficult for those who install amazon provider on their own. Mostly up to the Amazon team to decide how to approach it.

I'm basically this persona. I install the provider, but I do not install the python3-saml additional extra. #39528 just moves the (hopefully) temporary pin to impact people who would actually hit the issue in the first place.

jedcunningham avatar May 09 '24 17:05 jedcunningham

The question is , whether Amazon provider works with libxmlsec2 at all ?

potiuk avatar May 09 '24 17:05 potiuk

Because if it does not, this is not a solution to the problem, it will allow to upgrade xmlsec to 1.3.14, but the specific authentication might (and likely will not) work for Amazon. I guess there was a reason why we installed libxmlsec1 in the first place - from what I know Amazon Linux distro contains only libxmlsec1 (last time I checked this issue)..

potiuk avatar May 09 '24 17:05 potiuk

And yeah. I have no idea what it impact it will have - but if we "solve" it now without investigating and possibly bumping libxmlsec2 it might mask the issue - that's all I have to say here. But I am ok with merging #39528 - If @vincbeck and @feruzzi are fine with potential issue with libxmlsec2 compatibility :)

potiuk avatar May 09 '24 17:05 potiuk

To state it another way, your concern is if folks are installing like this:

apache-airflow-providers-amazon
python3-saml

Not:

apache-airflow-providers-amazom[python3-saml]

it (likely) won't work, right? I feel like constraints cover us pretty well for this situation.

I guess I'll leave it up to the AWS folks to decide. I just don't like having to start installing xmlsec + libxml2 + libxmlsec1 for an optional dependency I don't use - https://github.com/apache/airflow/pull/35488#discussion_r1397646758.

jedcunningham avatar May 09 '24 18:05 jedcunningham

it (likely) won't work, right? I feel like constraints cover us pretty well for this situation.

That's a guess based on the errors I saw 3 weeks ago when I added the limit. As explained - since then maybe something changed, so my proposal is to remove the limit altogether and see if the problem shows up in main.

potiuk avatar May 09 '24 20:05 potiuk

Well I guess lets give it a shot: #39534

jedcunningham avatar May 09 '24 20:05 jedcunningham

Fixed by #39534

potiuk avatar May 10 '24 04:05 potiuk