airflow icon indicating copy to clipboard operation
airflow copied to clipboard

Role Management with Unique Dag Owner Implementation

Open bugraoz93 opened this issue 1 year ago • 1 comments

closes: #23638

The implementation provides basic owner management over the user-defined owner field coming from DAGs. The implementation is functional. I tested over multiple cases such as those below.

dag1_owner:team1,team2
dag2_owner:team1
dag3_owner:team3
dag4_owner:team4,team5

user1:team1
user2:team3,team4
user3:team2

It is exactly acting as stated in the issue. We will have resources such as OWNER:team1. It will be generated by unique team names given to the DAGs owner field. It has the same behavior as DAG resources such as DAG:dag1.So that, we can now add permission to a role such as below.

can read on OWNER:team1 
can edit on OWNER:team1
can delete on OWNER:team1

Important Note: It is not ready to be merged. I haven't implemented the unit tests because I want to ensure the approach is good. After deciding on the approach and finalizing the functional implementation, I will implement the unit tests.

bugraoz93 avatar Sep 19 '22 00:09 bugraoz93

I think that one is cool. We missed the "real" usage for owner field and I think it nicely integrates with the permission model of FAB. @jhtimmins @ephraimbuddy @dstandish - I think you were close to those parts :) ?

potiuk avatar Sep 19 '22 13:09 potiuk

Hello @ashb, thanks for your comment. I wrote a bit-long comment, sorry about that. I tried to think loudly to explain my thoughts on the issue and implementation.

Yes, the management is possible over the access_control attribute. Using access_control like in the example is the same thing as adding a DAG resource to a Role (like can read on DAG:dag1). Rather than adding from UI for all DAGs that the team owns (Role:team1), we are adding this information from the DAG and leaving the management to DAG maintainers. You are right, the use case can be managed over the access_control attribute. The feature is already provided. Of course, there are some pros and cons to using this method.

I think It is a matter of how we want to manage the permissions. The use case and implementation are just extending the access_control over the owner attribute. From the administrator side, using the access_control attribute moves the role management to the DAG level. Once the DAG is updated and reloaded to Dagbag, it will always override what is set on UI for the role. This could be a hard thing for the administrators because it needs to manage the roles from DAG rather than UI. From the DAG maintainer perspective, the maintainer can give delete permission to any role with the access_control attribute. This can cause the deletion of important DAGs by mistake. If we use the UI to set the permissions, the situation can be prevented. The use case can reduce the visibility of the role management on DAG.

It may provide elasticity to Airflow administrators and extend the role management over UI (edited: ~~Multi-Tenancy feature~~, sorry if this word mislead someone). I think it could be similar to the filter_by_owner configuration on the previous versions such as 1.10. Since the Owner attributes can be also used for filtering the DAGs, I think this attribute could be more widely used than the access_control attribute.

bugraoz93 avatar Sep 24 '22 09:09 bugraoz93

Hey @jhtimmins, @XD-DENG, and @kaxil, I wrongly removed you as a reviewer from the PR while trying to request a review. Sorry about that. If you have time, please review and make your comments.

bugraoz93 avatar Sep 28 '22 11:09 bugraoz93

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Nov 13 '22 00:11 github-actions[bot]

Hey @ashb and @potiuk, should I continue to the implementation and finalize the PR? Are you okay with the approach of the implementation and idea of having this feature?

bugraoz93 avatar Nov 13 '22 00:11 bugraoz93

I hav no more comments :) - I think there was question from @ashb that you responded to and it waits for Ash's response.

potiuk avatar Nov 13 '22 10:11 potiuk

I'll look at this again tomorrow

ashb avatar Nov 13 '22 11:11 ashb

Sorry no, this one is a -1 for me at least. Owner isn't really designed for this, and I don't like the idea of having two parallel places to define permissions.

This idea though can be built on top of the existing owner field if you want it.

If you can come up with an alternative approach that gives a way of specifying this in the access_control field lets talk.

(To be clear: I'm not saying no to the feature, I'm saying no to this way of achieving it)

In the mean time I am going to close this PR. Feel free to re-open it or create a new PR.

ashb avatar Nov 14 '22 15:11 ashb

Thank you both for your time and comments @ashb and @potiuk!

I will think about a solution. I understood your points. I thought to use the existing Owner field and I had some ideas about it but need to check them again over the code and then I can re-open this PR or create a new PR.

I really want to implement this feature to extend role management over UI (edited: ~~multi-tenancy~~, sorry if this word mislead someone) in Airflow. In order to prevent over-implementation, is it okay to reach out to you (@ashb) on Slack (in the Airflow space) to talk about the approach before implementing it? Or I can create a symbolic PR to discuss the approach? Both are good for me.

Out of Topic: Maybe this is not a good place to write this as well. If that is the case sorry about that and please ignore it. I don't want to waste your time. In the meanwhile, if there are other issues that need to be done maybe in the roadmap or in your minds, I can take them as well and you can forward them to me. I want to be more involved and help Airflow to grow. Thanks!

bugraoz93 avatar Nov 14 '22 16:11 bugraoz93

The best way is to keep track what's happening in AIP-1 - just look through documents, discussions, recordings from past meetings of sig-multi-tenancy group: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=89066609 - linked discussions and AIPs

Also you might want to watch this talk from Airflow Summit: https://airflowsummit.org/sessions/2022/multitenancy-is-coming/ where we discussed the roadmap and what are the long term plans for multi-tenancy.

We are at the verge of starting implementing AIP-44 https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-44+Airflow+Internal+API and likely soon an idea will be discussed on next steps of authorisation of the internal API which is supposed to be follow-up after AIP-44. And likely in a month or so we will have the next meeting about multi-tenancy to discuss the proposals.

Until then, if you have some ideas that fit into what you will see there, you are welcome to start a discussion on devlist - we strongly prefer asynchronous communication on the devlist for those kind of matters - it has to be well thought and structured well. meetings and synchronous Slack discussions are very rare and usually they happen when people work on something together. if you hav a need to exchange ideas and brainstorm, you should rather start thread on #sig-multitenancy channel on slack.

Comment - I believe you underestimate some of hte challenges of multi-tenancy in Airlfow, AIP-44 and follow-up authorisation is really hampering any effort you might think you have multi-tenancy, but in fact this is a "false hope" and actually even worse - because you might give people an impression they have security that multi-tenant requires, but they don't.

I think you should start from reading the docs and state and if you want to take part an effort in it, getting involved in devlist when comments/threads will be started, starting threads when the ideas are hashed out and verified against the current "state of multi-tenancy" and plans but most likely first step will be to take part in the first sig-multitenancy meeting that is going to happen (and will be announced in devlist and in slack #sig-multitenancy channel.

potiuk avatar Nov 14 '22 21:11 potiuk

Thanks a lot, @potiuk for your guidance and for showing me the way of involving in a detailed way. I will do those things as a start and will try to involve myself as much as possible. These explanations were what exactly I was looking for. :)

My intention was not to underestimate the challenges or give people "false hope". If it is seen like that, sorry about that. I am adjusting my comment in order to not give others some false impressions. Thanks for correcting my misleading comment as well!

bugraoz93 avatar Nov 15 '22 09:11 bugraoz93

You are welcome. The "false impression" is just something that might surprise people. There are still some internals to be sorted out before we can really announce "the code dag authors write cannot wreak havoc" but once we get there - we have some more opportunities.

potiuk avatar Nov 16 '22 22:11 potiuk