docker-airflow

How to enable RBAC and add users?

Open infused-kim opened this issue 6 years ago • 47 comments

Hi,

Thanks for the quick update to 1.10.0!

One of the most exciting features is RBAC support, which allows adding users with different roles and permissions.

I tried following the instructions at https://wecode.wepay.com/posts/improving-airflow-ui-security and https://github.com/apache/incubator-airflow/blob/master/UPDATING.md

But I wasn't able to add users to the database and get it working.

I added

        environment:
            - AIRFLOW__WEBSERVER__RBAC=true

Which successfully enabled RBAC and showed the login screen on the webserver.

Next I opened a bash shell and ran the webserver to generate the /usr/local/airflow/airflow/webserver_config.py

docker run --rm -ti puckel/docker-airflow bash
airflow webserver

It was created successfully, and I didn't modify it as I want to use the default AUTH_TYPE = AUTH_DB.

Then I ran

airflow initdb
airflow create_user -r Admin -u admin -e [email protected] -f admin -l user -p test

I get the output:

[2018-08-29 20:33:39,664] {{settings.py:174}} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
[2018-08-29 20:33:40,512] {{__init__.py:51}} INFO - Using executor LocalExecutor
/usr/local/lib/python3.6/site-packages/flask_sqlalchemy/__init__.py:800: UserWarning: SQLALCHEMY_TRACK_MODIFICATIONS adds significant overhead and will be disabled by default in the future.  Set it to True to suppress this warning.
  warnings.warn('SQLALCHEMY_TRACK_MODIFICATIONS adds significant overhead and will be disabled by default in the future.  Set it to True to suppress this warning.')
[2018-08-29 20:33:41,331] {{manager.py:525}} WARNING - No user yet created, use fabmanager command to do it.
[2018-08-29 20:33:43,393] {{models.py:258}} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-08-29 20:33:43,944] {{base_hook.py:83}} INFO - Using connection to: XXX
[2018-08-29 20:33:44,166] {{base_hook.py:83}} INFO - Using connection to: XXX
[2018-08-29 20:33:44,299] {{base_hook.py:83}} INFO - Using connection to: XXX
Admin user admin created.

But the user is not created. I checked the postgres database and I can see there are several ab_ tables there, but they are all empty.

And when I re-run the create_user command I get the same success message, whereas if the user existed it should say "admin already exist in the db" according to the cli.py Airflow code.

Airflow also constantly logs to the console:

{{manager.py:525}} WARNING - No user yet created, use fabmanager command to do it.

So I've tried running fabmanager, but it always quits with the error Was unable to import app Error: No module named 'app' and even when I tried running it from the app's dir (/usr/local/lib/python3.6/site-packages/airflow/www_rbac) it failed with a different exception.

I have also tried preserving webserver_config.py by adding it to my volumes:

volumes:
            - ./volumes/airflow_config:/usr/local/airflow/airflow/:z

But this didn't make a difference either.

Another thing I noticed is that I tried enabling the option AUTH_USER_REGISTRATION = True. It's supposed to allow self-registration of users. I am not sure how this feature works, but I expected to see an option on the login screen... yet nothing showed up.

So perhaps something is broken and the config isn't used at all?

I can't think of anything else to try for now. Hopefully one of you guys has an idea.

infused-kim avatar Aug 29 '18 20:08 infused-kim

Just spent an hour trying to solve this.

The problem here was, it wasn't reading my webserver_config.py file, and that was failing silently, in app.py:

    app.config.from_pyfile(webserver_config_path, silent=True)

Without reading that file, sqlalchemy is defaulting to sqlite, which is why no user is being inserted into the postgres DB.

So I suggest setting silent=False so you can see which file it's failing to read.
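To illustrate why silent=True hides the problem, here is a small stdlib-only stand-in for a silent config loader. It is a simplified sketch, not Flask's actual implementation; the name load_pyfile and the temporary path are made up for the demonstration.

```python
import os
import tempfile


def load_pyfile(path, silent=False):
    """Simplified stand-in for a Flask-style config loader (hypothetical helper)."""
    config = {}
    try:
        with open(path, "rb") as handle:
            source = handle.read()
    except OSError:
        if silent:
            # The caller never learns that the file was missing.
            return False, config
        raise
    namespace = {}
    exec(compile(source, path, "exec"), namespace)
    # Keep only upper-case names, the usual convention for such settings.
    config.update({k: v for k, v in namespace.items() if k.isupper()})
    return True, config


# A fresh temporary directory is guaranteed not to contain the file.
missing = os.path.join(tempfile.mkdtemp(), "webserver_config.py")

loaded, settings = load_pyfile(missing, silent=True)
print(loaded, settings)  # nothing loaded, and no error raised

try:
    load_pyfile(missing, silent=False)
except FileNotFoundError as exc:
    print("now the wrong path is visible:", exc.filename)
```

With silent=True the application simply continues with its defaults, which matches the symptom described above: the user ends up in a database other than the one you are looking at.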

bcb avatar Sep 07 '18 07:09 bcb

Awesome, thank you!

The problem was

FileNotFoundError: [Errno 2] Unable to load configuration file (No such file or directory): '/usr/local/airflow/webserver_config.py'

Airflow was creating the webserver_config.py in /usr/local/airflow/airflow/webserver_config.py, but was expecting it in /usr/local/airflow/webserver_config.py.

So, to get it running, one has to...

  1. Add - AIRFLOW__WEBSERVER__RBAC=true to the environment section of docker-compose.yml.
  2. Create a webserver_config.py in a directory that will be mounted as a volume (see below for the default).
  3. Add the webserver_config.py as a volume:
        volumes:
            - ./volumes/airflow_config/webserver_config.py:/usr/local/airflow/webserver_config.py:z
  4. Start the airflow docker containers to initialize the DB. The web interface should ask for a login.
  5. Create the user: airflow create_user -r Admin -u admin -e [email protected] -f admin -l user -p test
  6. Run the container again. You should be able to log in as admin now.

Default webserver_config.py:

# -*- coding: utf-8 -*-
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

import os
from airflow import configuration as conf
from flask_appbuilder.security.manager import AUTH_DB
# from flask_appbuilder.security.manager import AUTH_LDAP
# from flask_appbuilder.security.manager import AUTH_OAUTH
# from flask_appbuilder.security.manager import AUTH_OID
# from flask_appbuilder.security.manager import AUTH_REMOTE_USER
basedir = os.path.abspath(os.path.dirname(__file__))

# The SQLAlchemy connection string.
SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')

# Flask-WTF flag for CSRF
CSRF_ENABLED = True

# ----------------------------------------------------
# AUTHENTICATION CONFIG
# ----------------------------------------------------
# For details on how to set up each of the following authentication, see
# http://flask-appbuilder.readthedocs.io/en/latest/security.html# authentication-methods
# for details.

# The authentication type
# AUTH_OID : Is for OpenID
# AUTH_DB : Is for database
# AUTH_LDAP : Is for LDAP
# AUTH_REMOTE_USER : Is for using REMOTE_USER from web server
# AUTH_OAUTH : Is for OAuth
AUTH_TYPE = AUTH_DB

# Uncomment to setup Full admin role name
# AUTH_ROLE_ADMIN = 'Admin'

# Uncomment to setup Public role name, no authentication needed
# AUTH_ROLE_PUBLIC = 'Public'

# Will allow user self registration
# AUTH_USER_REGISTRATION = True

# The default user self registration role
# AUTH_USER_REGISTRATION_ROLE = "Public"

# When using OAuth Auth, uncomment to setup provider(s) info
# Google OAuth example:
# OAUTH_PROVIDERS = [{
# 	'name':'google',
#     'whitelist': ['@YOU_COMPANY_DOMAIN'],  # optional
#     'token_key':'access_token',
#     'icon':'fa-google',
#         'remote_app': {
#             'base_url':'https://www.googleapis.com/oauth2/v2/',
#             'request_token_params':{
#                 'scope': 'email profile'
#             },
#             'access_token_url':'https://accounts.google.com/o/oauth2/token',
#             'authorize_url':'https://accounts.google.com/o/oauth2/auth',
#             'request_token_url': None,
#             'consumer_key': CONSUMER_KEY,
#             'consumer_secret': SECRET_KEY,
#         }
# }]

# When using LDAP Auth, setup the ldap server
# AUTH_LDAP_SERVER = "ldap://ldapserver.new"

# When using OpenID Auth, uncomment to setup OpenID providers.
# example for OpenID authentication
# OPENID_PROVIDERS = [
#    { 'name': 'Yahoo', 'url': 'https://me.yahoo.com' },
#    { 'name': 'AOL', 'url': 'http://openid.aol.com/<username>' },
#    { 'name': 'Flickr', 'url': 'http://www.flickr.com/<username>' },
#    { 'name': 'MyOpenID', 'url': 'https://www.myopenid.com' }]

infused-kim avatar Sep 08 '18 14:09 infused-kim

@bcb do you have issues with gunicorn workers starting up after enabling RBAC?

Airflow starts the workers, but they fail to start up in time. They are then killed and the process repeats itself until it finally succeeds. That usually takes 10-20min.

I disabled all volumes to make sure it is nothing with my config, but that had no effect.

The odd thing is that it works on my local machine, but fails on my digital ocean server...

Does anyone have an idea how to even start debugging this?

[2018-09-09 09:05:26 +0000] [398] [INFO] Starting gunicorn 19.9.0
[2018-09-09 09:05:26 +0000] [398] [INFO] Listening at: http://0.0.0.0:8080 (398)
[2018-09-09 09:05:26 +0000] [398] [INFO] Using worker: sync
[2018-09-09 09:05:26 +0000] [403] [INFO] Booting worker with pid: 403
[2018-09-09 09:05:26 +0000] [404] [INFO] Booting worker with pid: 404
[2018-09-09 09:05:26 +0000] [405] [INFO] Booting worker with pid: 405
[2018-09-09 09:05:26 +0000] [406] [INFO] Booting worker with pid: 406
[2018-09-09 09:05:27,388] {{cli.py:717}} DEBUG - [0 / 4] some workers are starting up, waiting...
[2018-09-09 09:05:28,951] {{cli.py:717}} DEBUG - [0 / 4] some workers are starting up, waiting...
[2018-09-09 09:05:30,449] {{cli.py:717}} DEBUG - [0 / 4] some workers are starting up, waiting...

# [...] This keeps repeating for a long time

[2018-09-09 09:07:25,967] {{cli.py:717}} DEBUG - [0 / 4] some workers are starting up, waiting...
[2018-09-09 09:07:27 +0000] [398] [CRITICAL] WORKER TIMEOUT (pid:403)
[2018-09-09 09:07:27 +0000] [398] [CRITICAL] WORKER TIMEOUT (pid:404)
[2018-09-09 09:07:27 +0000] [398] [CRITICAL] WORKER TIMEOUT (pid:405)
[2018-09-09 09:07:27 +0000] [403] [INFO] Worker exiting (pid: 403)
[2018-09-09 09:07:27 +0000] [404] [INFO] Worker exiting (pid: 404)
[2018-09-09 09:07:27 +0000] [398] [CRITICAL] WORKER TIMEOUT (pid:406)
[2018-09-09 09:07:27 +0000] [405] [INFO] Worker exiting (pid: 405)
[2018-09-09 09:07:27 +0000] [406] [INFO] Worker exiting (pid: 406)
[2018-09-09 09:07:28,058] {{cli.py:717}} DEBUG - [0 / 4] some workers are starting up, waiting...
[2018-09-09 09:07:28 +0000] [441] [INFO] Booting worker with pid: 441
[2018-09-09 09:07:29 +0000] [443] [INFO] Booting worker with pid: 443
[2018-09-09 09:07:29 +0000] [444] [INFO] Booting worker with pid: 444
[2018-09-09 09:07:29 +0000] [445] [INFO] Booting worker with pid: 445

infused-kim avatar Sep 09 '18 10:09 infused-kim

I was able to solve this issue...

For some reason the RBAC interface uses a lot of CPU on startup. If you are running on a low powered server, this can cause a very slow webserver startup and permanently high CPU usage.

I have documented this bug as AIRFLOW-3037. To solve it you can adjust the config:

AIRFLOW__WEBSERVER__WORKERS=2 # 2 * NUM_CPU_CORES + 1
AIRFLOW__WEBSERVER__WORKER_REFRESH_INTERVAL=1800 # Restart workers every 30min instead of 30seconds
AIRFLOW__WEBSERVER__WEB_SERVER_WORKER_TIMEOUT=300 #Kill workers if they don't start within 5min instead of 2min
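As a rough sketch, the rule of thumb from the comment above (2 * NUM_CPU_CORES + 1) can be turned into a small helper that prints the three settings. The function name is hypothetical, and the defaults of 1800 and 300 seconds are simply the values from this post; note that the post itself picked 2 workers rather than the formula's result.

```python
def suggested_webserver_env(cpu_cores, refresh_seconds=1800, timeout_seconds=300):
    """Return the three webserver settings as environment variable strings."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    workers = 2 * cpu_cores + 1  # rule of thumb quoted in the comment above
    return {
        "AIRFLOW__WEBSERVER__WORKERS": str(workers),
        "AIRFLOW__WEBSERVER__WORKER_REFRESH_INTERVAL": str(refresh_seconds),
        "AIRFLOW__WEBSERVER__WEB_SERVER_WORKER_TIMEOUT": str(timeout_seconds),
    }


# Print lines that can be pasted into the environment section of docker-compose.yml.
for name, value in suggested_webserver_env(cpu_cores=1).items():
    print(f"- {name}={value}")
```

On a very small server it may still be worth going below the formula, as was done here, since each worker pays the startup cost separately.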

infused-kim avatar Sep 11 '18 10:09 infused-kim

Thanks @KimchaC for your input.

By the way, how did you locate the problem with webserver_config.py?

Has anyone tried to filter DAGs by owner using RBAC?

cmourouvin avatar Sep 12 '18 10:09 cmourouvin

@cmourouvin it was @bcb's comment. I made the silent=False change in the app.py and the exception showed that it loaded the file from the wrong location.

I haven't tried filtering DAGs.

infused-kim avatar Sep 12 '18 10:09 infused-kim

By the way, has the wrong file location been reported as an Airflow bug? It should be, so that nobody needs a workaround :\

cmourouvin avatar Sep 12 '18 14:09 cmourouvin

I haven't reported it, because I'm not sure if it is a bug in all airflow installs or just in this docker file.

If it was a bug in airflow itself, I am sure someone would have noticed. So I suspect it is a bug in this image.

infused-kim avatar Sep 13 '18 14:09 infused-kim

I followed these steps:

  1. Now I have webserver_config.py in AIRFLOW HOME with AUTH_TYPE = AUTH_DB
  2. I see that perm tables have been created in postgres database.
  3. I manually created a user (airflow create_user -r Admin -u admin -e [email protected] -f admin -l user -p test) and restarted the container
[screenshot]

Log when I created the user:

airflow@8deffa830a77:~$ airflow create_user -r Admin -u admin -e [email protected] -f admin -l user -p test
[2018-10-05 02:54:10,593] {{settings.py:174}} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
[2018-10-05 02:54:10,791] {{__init__.py:51}} INFO - Using executor LocalExecutor
/usr/local/lib/python3.6/site-packages/flask_sqlalchemy/__init__.py:800: UserWarning: SQLALCHEMY_TRACK_MODIFICATIONS adds significant overhead and will be disabled by default in the future.  Set it to True to suppress this warning.
  warnings.warn('SQLALCHEMY_TRACK_MODIFICATIONS adds significant overhead and will be disabled by default in the future.  Set it to True to suppress this warning.')
[2018-10-05 02:54:11,093] {{manager.py:525}} WARNING - No user yet created, use fabmanager command to do it.
[2018-10-05 02:54:11,519] {{models.py:258}} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-10-05 02:54:11,663] {{example_kubernetes_operator.py:54}} WARNING - Could not import KubernetesPodOperator: No module named 'kubernetes'
[2018-10-05 02:54:11,663] {{example_kubernetes_operator.py:55}} WARNING - Install kubernetes dependencies with:     pip install airflow['kubernetes']
Admin user admin created.

but I get this error when I hit webserver:

[screenshot of the error]

sid88in avatar Oct 05 '18 02:10 sid88in

I do see https://github.com/apache/incubator-airflow/pull/3937 got merged 20 hours ago, not sure if this is related or if I am missing something.

sid88in avatar Oct 05 '18 03:10 sid88in

OK, it worked when I manually downgraded flask-appbuilder in my Dockerfile:

&& pip install 'flask-appbuilder==1.11.1' \

sid88in avatar Oct 05 '18 03:10 sid88in

Looking at the Airflow source code, this issue can be resolved by explicitly setting the AIRFLOW_HOME environment variable in your Dockerfile.

ARG AIRFLOW_HOME=/usr/local/airflow
ENV AIRFLOW_HOME=${AIRFLOW_HOME}

If this environment variable is not set, AIRFLOW_HOME will default to /usr/local/airflow/airflow. This "bug"/feature only seems to affect the RBAC code.
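A minimal sketch of how the two paths can arise, assuming Airflow falls back to ~/airflow when AIRFLOW_HOME is unset and that the container user's HOME is /usr/local/airflow. Both helper names are hypothetical and the logic is a simplification of what Airflow does.

```python
import posixpath


def resolve_airflow_home(environ):
    """Use AIRFLOW_HOME if set, otherwise fall back to <HOME>/airflow (assumed default)."""
    explicit = environ.get("AIRFLOW_HOME")
    if explicit:
        return explicit
    return posixpath.join(environ.get("HOME", "/"), "airflow")


def webserver_config_path(environ):
    """Path where webserver_config.py would be looked up under these assumptions."""
    return posixpath.join(resolve_airflow_home(environ), "webserver_config.py")


# Without AIRFLOW_HOME the extra "airflow" directory appears.
print(webserver_config_path({"HOME": "/usr/local/airflow"}))

# With AIRFLOW_HOME set explicitly, the path matches the mounted file.
print(webserver_config_path({"HOME": "/usr/local/airflow",
                             "AIRFLOW_HOME": "/usr/local/airflow"}))
```

Any process that does not inherit AIRFLOW_HOME (for example a shell started without the image's environment) would end up on the first path.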

ghost avatar Oct 19 '18 11:10 ghost

Awesome, looks like that would solve it.

@puckel would you consider adding it to your Dockerfile so we don't have to do it in ours?

infused-kim avatar Oct 20 '18 04:10 infused-kim

@puckel Adding the environment variable AIRFLOW_HOME would also solve a few other issues with the current Docker build. In your default build I can't, for example, run airflow connection, airflow resetdb or airflow initdb, as they also don't pick up the correct configuration file.

ghost avatar Oct 22 '18 08:10 ghost

After pinning flask-appbuilder to 1.11.1 as @sid88in described (pip install 'flask-appbuilder==1.11.1'), I got the error 'Blueprint' object has no attribute 'json_encoder'. Downgrading Flask from 0.12.2 to 0.12.1 fixed it for me (i.e. flask-appbuilder==1.11.1 with flask==0.12.1).
See issue: https://github.com/dpgaspar/Flask-AppBuilder/issues/741 (Note: it's closed now, apparently fixed in 0.12.4)

cotigao avatar Nov 19 '18 10:11 cotigao

@KimchaC thanks for taking the time to write all of this up, you saved me!

JelmerOffenberg avatar Dec 19 '18 13:12 JelmerOffenberg

When I run create_user, LocalExecutor is changed to SequentialExecutor:

airflow@ac85221fed9f:~$ airflow create_user -r Admin -u test -e [email protected] -f test -l test -p test
[2019-04-10 07:21:44,605] {{__init__.py:51}} INFO - Using executor SequentialExecutor
[2019-04-10 07:21:44,847] {{cli_action_loggers.py:69}} ERROR - Failed on pre-execution callback using <function default_action_log at 0x7f5b566910d0>

UPDATE: The issue occurs when you run any airflow command from a bash shell opened directly in the container.

Use entrypoint.sh instead.

docker exec -it <container_id> /entrypoint.sh bash
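A quick way to check whether the current shell carries the configuration is to look for the relevant variables. The sketch below assumes the entrypoint exports at least AIRFLOW__CORE__EXECUTOR and AIRFLOW__CORE__SQL_ALCHEMY_CONN; that list is an assumption and may differ in your image, and the helper name is made up.

```python
import os

# Assumed to be exported by the image's entrypoint; adjust to your setup.
EXPECTED_VARS = ("AIRFLOW__CORE__EXECUTOR", "AIRFLOW__CORE__SQL_ALCHEMY_CONN")


def missing_airflow_env(environ, expected=EXPECTED_VARS):
    """Return the expected variables that are unset or empty in this environment."""
    return [name for name in expected if not environ.get(name)]


missing = missing_airflow_env(os.environ)
if missing:
    # Airflow would fall back to its defaults here (e.g. SequentialExecutor and SQLite).
    print("not set in this shell:", ", ".join(missing))
else:
    print("Airflow environment looks complete")
```

Running it in a plain bash shell and again in a shell started through the entrypoint should show the difference, if the assumption about the exported variables holds.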

siddardha7 avatar Apr 10 '19 07:04 siddardha7

I had to change a few things before I managed to make this work; the gist of it was that the database connection was not set properly:

  1. Log in to your container: docker exec -it [container id] /bin/bash
  2. Run airflow resetdb and check whether it reports DB: postgresql+psycopg2://airflow:***@postgres/airflow
     • If it does, your problem is different from mine. If it says DB: sqlite, read on.
  3. First make sure you've properly set the AIRFLOW_HOME variable.
  4. Set the SQLAlchemy connection as an environment variable in the docker-compose YAML file:
     • AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
  5. Now stop your container, run it again, and repeat steps 1 and 2 to check which DB is used. If it is DB: postgresql+psycopg2://airflow:***@postgres/airflow, then you've set up your DB properly. Read on.
  6. Run airflow initdb to set up your persistent database.

At this point you can continue with airflow create_user and it should be saved inside postgres

Note: I'm not sure if step 3 is always necessary, as the default webserver config file will be picked up anyways [or should be at least] and the SQL connection for FAB is set there from what you set in step 4
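The check in the steps above (Postgres or SQLite?) can also be done on the connection string itself. This is a stdlib-only sketch with a hypothetical helper; it does not call Airflow, it only parses the URI and masks the password the way the log output does.

```python
from urllib.parse import urlsplit


def describe_conn(uri):
    """Return (backend, uri_with_masked_password) for a SQLAlchemy-style URI."""
    parts = urlsplit(uri)
    backend = parts.scheme.split("+", 1)[0]  # drop the driver, e.g. "+psycopg2"
    if parts.password is None:
        return backend, uri
    host = parts.hostname or ""
    port = f":{parts.port}" if parts.port else ""
    masked = parts._replace(netloc=f"{parts.username}:***@{host}{port}")
    return backend, masked.geturl()


for conn in (
    "postgresql+psycopg2://airflow:airflow@postgres/airflow",
    "sqlite:////usr/local/airflow/airflow.db",
):
    backend, shown = describe_conn(conn)
    print(f"{backend}: {shown}")
```

If the backend reported for the value of AIRFLOW__CORE__SQL_ALCHEMY_CONN in your shell is sqlite, or the variable is empty, users will not land in Postgres.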

houmanMP avatar Apr 21 '19 00:04 houmanMP

This comment chain was legendary. I got everything working and eventually created a new docker image that is very much influenced by puckel.docker-airflow and incorporates all the comments here. If you want to use my image as a quick start, go to: https://hub.docker.com/r/mannharleen/airflow

mannharleen avatar May 03 '19 03:05 mannharleen

@KimchaC Have you encountered this problem https://issues.apache.org/jira/browse/AIRFLOW-3442 when working with google OAuth ?

OmerJog avatar May 21 '19 10:05 OmerJog

@OmerJog Can you tell me how Google OAuth works with the RBAC? When someone logs in for the first time, his username/e-mail will be written into the database? How does it work rolewise? Will their roles default to Viewer until an Admin changes it?

holypriest avatar Jun 07 '19 19:06 holypriest

@holypriest If you leave the AUTH_USER_REGISTRATION = True option the user can create their own access and their role will be defined by: AUTH_USER_REGISTRATION_ROLE = "Viewer" - I had some problems in redirecting the login with Public permission, so I switched to Viewer.

The user will be inserted in the database after completing his / her registration, with name, surname and email.

jonathanabila avatar Jun 18 '19 19:06 jonathanabila

I was able to get everything working by following this chain. Running docker exec -it <container_id> /entrypoint.sh bash helped a lot.. I was so confused for hours (days) before this. I added a create user script in my entrypoint file, and I also added a method to read Docker Secrets rather than hard coding the SQLAlchemy URI or any passwords into a file on my gitlab repo.

ldacey avatar Jul 10 '19 02:07 ldacey

I'm trying to integrate Google OAuth and RBAC. Ideally I want users to not be able to see anything when they first go to the webserver, but only have an option to sign in with Google. Then I want them to only have the 'Viewer' role. This is going well so far, but I don't know how I can modify a user's role (i.e. make them an Admin) after they have signed in/registered with Google. I can see the users in the database with select * from ab_user;:

 id | first_name | last_name |           username           |                                            password                                            | active |        email         |         last_login         | login_count | fail_login_count |         created_on         |         changed_on         | created_by_fk | changed_by_fk
----+------------+-----------+------------------------------+------------------------------------------------------------------------------------------------+--------+----------------------+----------------------------+-------------+------------------+----------------------------+----------------------------+---------------+---------------
  1 | Foo        | Bar    | google_123456789 | pbkdf2:sha256:adfdfdf$asdfad | t      | [email protected] | 2019-11-08 21:15:11.386942 |           1 |                0 | 2019-11-08 21:15:11.371394 | 2019-11-08 21:15:11.371431 |               |

Does anyone know how to change a user's role? I assume I'm going to have to use fabmanager. Or any other ideas on how to make the initial user an Admin and the following users plain 'User' when using Google OAuth?

onprema avatar Nov 08 '19 21:11 onprema

I found a solution...

Have Google-authorized users start with the Viewer role by setting AUTH_USER_REGISTRATION_ROLE = 'Viewer'

If you want to elevate a user's privileges you can connect to your airflow db and manually update the ab_user_role table. For example,

UPDATE ab_user_role SET role_id = 1 WHERE user_id = 1;

would set user with id 1 to have the 'Admin' role (role_id = 1).

You can get the role ids by doing select * from ab_role
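Here is a sketch of the same role change done from Python, looking the role up by name instead of hard-coding its id and filtering on the user_id column of the association table. It runs against an in-memory SQLite database with simplified stand-ins for the ab_role and ab_user_role tables (only the columns needed here); the function name is hypothetical, and against Postgres you would use your own connection and placeholder style.

```python
import sqlite3


def set_user_role(conn, user_id, role_name):
    """Point the user's row in ab_user_role at the named role; return rows changed."""
    row = conn.execute("SELECT id FROM ab_role WHERE name = ?", (role_name,)).fetchone()
    if row is None:
        raise ValueError(f"unknown role: {role_name}")
    cur = conn.execute(
        "UPDATE ab_user_role SET role_id = ? WHERE user_id = ?",
        (row[0], user_id),
    )
    conn.commit()
    return cur.rowcount


# Simplified stand-in schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ab_role (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE ab_user_role (id INTEGER PRIMARY KEY, user_id INTEGER, role_id INTEGER);
INSERT INTO ab_role (id, name) VALUES (1, 'Admin'), (3, 'Viewer');
-- The association row id (7) is deliberately different from the user id (1).
INSERT INTO ab_user_role (id, user_id, role_id) VALUES (7, 1, 3);
""")

updated = set_user_role(conn, user_id=1, role_name="Admin")
role = conn.execute(
    "SELECT r.name FROM ab_user_role ur JOIN ab_role r ON r.id = ur.role_id "
    "WHERE ur.user_id = 1"
).fetchone()[0]
print(updated, role)
```

Looking up the role by name avoids relying on Admin always having id 1, which depends on the order in which the roles were created.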

onprema avatar Nov 14 '19 21:11 onprema

tl;dr - summary of how to use rbac+google_oauth+optional env vars for all configs

For future reference, for anyone looking at this issue who has trouble putting the pieces together: these are the settings you must configure in webserver_config.py and airflow.cfg in order to enable 'google' authentication for RBAC. They worked for me after several iterations of trial and error:

webserver_config.py:

  • auth_role_admin=Admin

  • auth_role_public=Public

  • oauth_providers=[{"name":"google","whitelist":["@domain.com"],"token_key":"access_token","icon":"fa-google","remote_app":{"base_url":"https://www.googleapis.com/oauth2/v2/","request_token_params":{"scope":"email profile"},"access_token_url":"https://accounts.google.com/o/oauth2/token","authorize_url":"https://accounts.google.com/o/oauth2/auth","request_token_url":null,"consumer_key":"CONSUMER_KEY","consumer_secret":"CONSUMER_SECRET"}}] - prettify this yourselves for readability

  • auth_type=AUTH_OAUTH
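For readability, the settings above could look roughly like this as a webserver_config.py fragment. This is a sketch: it uses the upper-case names from the default webserver_config.py shown earlier in the thread, all values are the placeholders from the list, and the AUTH_TYPE lines are left commented because they need Flask-AppBuilder to be installed.

```python
# Sketch of a webserver_config.py fragment; every value is a placeholder.

# In a real file these two lines would be uncommented (they need Flask-AppBuilder):
# from flask_appbuilder.security.manager import AUTH_OAUTH
# AUTH_TYPE = AUTH_OAUTH

AUTH_ROLE_ADMIN = "Admin"
AUTH_ROLE_PUBLIC = "Public"

OAUTH_PROVIDERS = [
    {
        "name": "google",
        "whitelist": ["@domain.com"],  # optional, restricts sign-in to one domain
        "token_key": "access_token",
        "icon": "fa-google",
        "remote_app": {
            "base_url": "https://www.googleapis.com/oauth2/v2/",
            "request_token_params": {"scope": "email profile"},
            "access_token_url": "https://accounts.google.com/o/oauth2/token",
            "authorize_url": "https://accounts.google.com/o/oauth2/auth",
            "request_token_url": None,  # JSON null becomes Python None
            "consumer_key": "CONSUMER_KEY",        # replace with your client id
            "consumer_secret": "CONSUMER_SECRET",  # replace with your client secret
        },
    }
]
```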

airflow.cfg [webserver]:

  • rbac=true

  • authenticate=true

In order to change the first registered user into an admin user, follow the comment by 'eightlimbed': set the 'role_id' of your user to 1 in the database your Airflow is running against, which by default should be the 'Admin' role.

Also, in Google's API credentials console, when creating or updating the OAuth 2.0 client, allow these 'Authorized redirect URIs':

https://your-airflow.domain.com/oauth-authorized/login
https://your-airflow.domain.com/oauth-authorized/google

I am not sure both are required (one of them certainly is), but in my testing I saw both receiving requests from either the browser or Google.

caveats:

  • notice 'whitelist': it is a shortcut for flask-appbuilder to set Google's 'hd' (hosted domain) parameter, an optional parameter that limits the users allowed to register to a specific domain

  • any configuration under airflow.cfg supports overrides using environment variables, BUT webserver_config.py DOES NOT support any overrides via environment variables, keep that in mind!! In that regard, I have made a fix that allows most webserver_config.py settings to be overridden from env vars on my fork (a series of commits on a specific branch): https://github.com/asaf400/airflow/commits/allow_fab_environment_variables

    To use it, replace the installation method of airflow for this image with my branch, or a fork of it:

        pip install git+git://github.com/asaf400/airflow@allow_fab_environment_variables#egg=apache-airflow[google_auth,crypto,mysql,password,${AIRFLOW_DEPS:+,}${AIRFLOW_DEPS}]

    Building airflow also requires nodejs, so you need to add this to the Dockerfile before installing airflow:

        apt-get install curl gnupg2 -yqq --no-install-recommends && curl -sL https://deb.nodesource.com/setup_10.x | bash - && apt-get install -yqq --no-install-recommends nodejs

    and this after installing airflow:

        cd /usr/local/lib/python3.7/site-packages \
            && airflow/www_rbac/compile_assets.sh

    You can then dynamically replace any setting in webserver_config.py, with special treatment for 'oauth_providers', which should be passed in env vars as a base64-encoded string of the JSON array object, which is then decoded and converted to a Python dict.
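The base64 step for 'oauth_providers' can be sketched as a simple round trip. This only shows the encoding idea; the exact variable name and decoding code used on the fork are not shown in this thread, so the helper names here are hypothetical and the provider entry is trimmed to a few keys.

```python
import base64
import json


def encode_providers(providers):
    """Serialize the provider list to JSON and base64-encode it for an env var."""
    raw = json.dumps(providers, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_providers(value):
    """Reverse of encode_providers: base64 text back to Python objects."""
    return json.loads(base64.b64decode(value).decode("utf-8"))


# Trimmed-down placeholder entry, not a complete provider definition.
providers = [
    {
        "name": "google",
        "token_key": "access_token",
        "remote_app": {"request_token_url": None},
    }
]

encoded = encode_providers(providers)
print(encoded)
print(decode_providers(encoded) == providers)
```

Base64 keeps the quotes, braces and colons of the JSON out of the shell and docker-compose parsing, which is presumably why this setting gets special treatment.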

hopefully this also helps someone who is looking to fork this docker image, and use custom made airflow branch\version..

Credits on this guide comment is to everyone who already commented on this issue, Thank you all!

edits: formatting

asaf400 avatar Jan 02 '20 17:01 asaf400

@asaf400 Have you faced https://issues.apache.org/jira/browse/AIRFLOW-5462? I have enabled rbac+google_oauth and now when I click on login button it gets redirected to https://your-airflow.domain.com/oauth-authorized/login, I get this error

[screenshot of the error]

ayush-san avatar Jan 17 '20 11:01 ayush-san

@ayush-san Yes I have faced this.. I believe I fixed this by using the proposed fix on the issue you have found, however I do remember that that was a typo where they pinned it to <=2.0.0 and it should be just <2.0.0 .. You can see my commit on my own fork and branch here: https://github.com/asaf400/airflow/commit/b9b19cd9478bae8893cb13e21f95a2bb6dc83f63

asaf400 avatar Jan 18 '20 16:01 asaf400

I am using flask-appbuilder~=2.2, but still facing this issue. I think it's due to the fact that the callback URI is oauth-authorized/login instead of oauth-authorized/google. How did you fix this issue?

ayush-san avatar Jan 19 '20 04:01 ayush-san

@ayush-san Ahh, I see, I missed you were using 3.7 (as do I currently) I don't actually remember how I fixed this, Let me get back to you on that..

I'll verify I didn't change any code that is related to that.. I remember I was debugging airflow in pycharm in order to understand why the 'login' key was missing

edit: typo

asaf400 avatar Jan 24 '20 16:01 asaf400

@ayush-san Check out this issue: https://github.com/apache/incubator-superset/issues/7739

specifically these comments: https://github.com/apache/incubator-superset/issues/7739#issuecomment-567769082 https://github.com/apache/incubator-superset/issues/7739#issuecomment-555971509 https://github.com/apache/incubator-superset/issues/7739#issuecomment-575556081

Also, try to completely DROP all the tables, Airflow's SQL Alchemy lib should replace all the default (non-rbac) user tables with rbac user tables, BUT in my attempts, I have had an issue where it failed to replace some table and got stuck in some weird mid-state, where the ui would work and not crash, but login failed, and deleting all the tables, ensures that the first 'boot' will create the entire valid db schema airflow rbac wants..

I think I have managed to solve my issue with the help of those comments and DROPing the db (It was possible since my airflow is new, so nothing that requires db persistence was saved in the environment, if you cant drop the tables under the database, maybe try another instance, or just ask airflow to connect to a different database on the same instance..)

Hope that helps you, Report back if you succeed..

asaf400 avatar Jan 26 '20 10:01 asaf400

There was an issue with my webserver_config.py which was causing this issue. We only need to whitelist https://your-airflow.domain.com/oauth-authorized/google in the google console UI. I think there is an issue in flask_appbuilder view.py as it redirects error to https://your-airflow.domain.com/oauth-authorized/login instead of https://your-airflow.domain.com/login

Since my webserver_config had whitelisted the wrong domain, I was getting the flash message "You are not authorized" and was redirected to https://your-airflow.domain.com/oauth-authorized/login instead of https://your-airflow.domain.com/login

https://github.com/dpgaspar/Flask-AppBuilder/issues/1254

ayush-san avatar Jan 27 '20 12:01 ayush-san

@ayush-san Yeah, the docs really are awful about some stuff, that may well be what I encountered and I fixed it same way you did, without being able to recollect that it was what helped me..

asaf400 avatar Jan 27 '20 12:01 asaf400

@ayush-san @asaf400 Has anyone succeeded with LDAP using the same approach (RBAC + LDAP authentication)? I have not.

I have placed webserver_config.py, and I can see the file is read because my print message shows up in the logs. The LDAP settings should be fine, because the same settings work when I use plain LDAP without RBAC.

I expected that when I enter a user ID and password for an AD user, the user would be created in the backend table. Instead I get neither an error message nor anything in the console; it just displays the login page again. Any help is appreciated!

anpjai avatar Jan 27 '20 14:01 anpjai

@anpjai I can't help with LDAP since I didn't set it up myself.

asaf400 avatar Jan 27 '20 15:01 asaf400

After using docker exec -it [container id] /bin/bash to enter the airflow webserver container, I ran python airflow_create_user.py to create a user. It seemed to work, but the user wasn't created.

airflow@3abeb2aa6681:~$ python airflow_create_user.py
[2020-03-09 01:42:03,001] {{settings.py:253}} INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=38599

Could anyone help and explain this?

#! /usr/bin/env python
# -*- coding: utf-8 -*-

from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser

user = PasswordUser(models.User())
user.username = 'admin'
user.email = '[email protected]'
user.password = 'test'
user.firstname = 'admin'
user.lastname = 'test'

session = settings.Session()
session.add(user)
session.commit()
session.close()

@KimchaC @bcb

benbendemo avatar Mar 09 '20 01:03 benbendemo

@benbendemo, at least for me create_user didn't know about the Postgres connection (db, user and password). I'm using this code in a re-packed image:

import airflow
from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser
from sqlalchemy import create_engine
from os import getenv

POSTGRES_USER = getenv('POSTGRES_USER')
POSTGRES_PASSWORD = getenv('POSTGRES_PASSWORD')
POSTGRES_DB = getenv('POSTGRES_DB')

user = PasswordUser(models.User())
user.username = 'admin'
user.email = '[email protected]'
user.password = 'test'
user.firstname = 'admin'
user.lastname = 'test'

engine = create_engine(f'postgresql://{POSTGRES_USER}:{POSTGRES_PASSWORD}@postgres:5432/{POSTGRES_DB}')
session = settings.Session(bind=engine)
session.add(user)
session.commit()
session.close()

Apparently the create user script has lost the context and doesn't know how to connect to the DB
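One thing to watch with a script like this: os.getenv returns None for an unset variable, so a missing variable silently ends up as the text "None" in the connection string. A hedged sketch of a stricter builder follows; the helper name is hypothetical, and the host "postgres" and port 5432 are the values from the script above.

```python
from urllib.parse import quote_plus

REQUIRED = ("POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB")


def postgres_uri_from_env(environ, host="postgres", port=5432):
    """Build a postgresql:// URI, failing loudly if a variable is missing."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Percent-encode credentials so characters like '@' or '/' don't break the URI.
    user = quote_plus(environ["POSTGRES_USER"])
    password = quote_plus(environ["POSTGRES_PASSWORD"])
    return f"postgresql://{user}:{password}@{host}:{port}/{environ['POSTGRES_DB']}"


demo_env = {
    "POSTGRES_USER": "airflow",
    "POSTGRES_PASSWORD": "p@ss/word",
    "POSTGRES_DB": "airflow",
}
print(postgres_uri_from_env(demo_env))

try:
    postgres_uri_from_env({})
except RuntimeError as exc:
    print(exc)
```

In the real script you would pass os.environ instead of the demo dictionary and hand the result to create_engine.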

dinigo avatar Mar 09 '20 12:03 dinigo

Thanks for your advice. I've tried your way, but still in vain. In my tests I could connect to Postgres and query the db with the method you provided.

However, creating a new user didn't take effect; I even tried as the root user in the airflow container. I guess it's probably due to Postgres's access control or something.

benbendemo avatar Mar 10 '20 08:03 benbendemo

@dinigo @benbendemo

Thanks for your comments! After I changed Postgres to MySQL, I could log in to the Airflow webserver with the user I created.

airflow create_user --lastname test --firstname test --username test --email [email protected] --role Admin --password test

harry-cai avatar Apr 30 '20 03:04 harry-cai

@benbendemo, use command docker container exec -it [container_id] /entrypoint.sh bash instead of docker exec -it [container id] /bin/bash see (https://github.com/puckel/docker-airflow/issues/225#issuecomment-481568175)

mjmhall avatar Apr 30 '20 20:04 mjmhall

I have to say that, ever since I updated to the official image, I've been doing roughly what @harry-cai says.

You have a ready installation of airflow within the docker container, so you open a shell inside it and run this command.

You can also do this from a local installation (outside the container) if you have airflow installed and have set the environment variable AIRFLOW__CORE__SQL_ALCHEMY_CONN to point to your database.

There's an open issue about this in the official airflow repo. They are working on providing an easy way of bootstrapping a container.

dinigo avatar May 01 '20 17:05 dinigo

Hi,

I enabled RBAC for Airflow 1.10.10. I am able to create a user with 'airflow create_user' from a bash shell in the webserver container. When I log in to http://localhost:8080 with username and password, Recent Tasks, Last Run and DAG Runs keep spinning. Is this a bug in the UI? How do I fix it?

[screenshot]

aki1977 avatar May 05 '20 01:05 aki1977

well @aki1977 how did you enable it?

dinigo avatar May 05 '20 08:05 dinigo

I had issues enabling RBAC. After reading this forum, I discovered how to fix it. For those who might have similar issues:

  1. My docker-compose env:
- AIRFLOW__WEBSERVER__RBAC=true
- AIRFLOW__WEBSERVER__WORKERS=2
- AIRFLOW__WEBSERVER__WORKER_REFRESH_INTERVAL=1800
- AIRFLOW__WEBSERVER__WEB_SERVER_WORKER_TIMEOUT=300
  2. Instead of creating the user right after initdb|upgradedb, I moved it down, after the connections are set: [screenshot]

See. I have a working example here: advance_scraping

My issue was generated by mounting ./volumes/airflow_config/webserver_config.py:/usr/local/airflow/webserver_config.py:z . So I removed that requirement and all is well.

Proteusiq avatar May 19 '20 06:05 Proteusiq

Create a user with the following command:

airflow users create \
    --username admin \
    --firstname Peter \
    --lastname Parker \
    --role Admin \
    --email [email protected]

IndrajeetTech2020 avatar May 30 '21 05:05 IndrajeetTech2020

Tried following this but I still get the "no user yet created" error. Does anyone have an up-to-date solution?

andrewhong5297 avatar Dec 03 '21 04:12 andrewhong5297

This repo is not maintained anymore. My personal recommendation is to use a managed solution in prod (Astronomer, GCP Composer...) and the 'astro' CLI to manage local environments.

dinigo avatar Dec 04 '21 20:12 dinigo