Cannot configure ldap in Superset 4.1.0 rc2
Bug description
Apparently as part of the database driver purge, ldap drivers have also been removed and have to be reinstalled by the user.
There are currently two options, neither of which is ideal:
- Install using pip
  - While this works, it requires compiling from source, which needs numerous build-environment dependencies and greatly complicates the setup.
- Install from system packages (Debian 12)
  - The python3-ldap package provided by Debian is tied to Python 3.11. Since Superset 4.1.0 uses Python 3.10, it does not find the library.
It would be much better if Superset restored the necessary ldap libraries by default.
Screenshots/recordings
No response
Superset version
master / latest-dev
Python version
3.10
Node version
Not applicable
Browser
Not applicable
Additional context
The following appears in the logs:
2024-10-08 16:25:15,031:ERROR:flask_appbuilder.security.manager:python-ldap library is not installed
Checklist
- [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
A similar issue was discussed in a closed thread regarding the installation of the python-ldap package [1]. Here's a suggested solution:
- Access the Superset container:
  docker exec -it <container_name> /bin/bash
- Install necessary packages:
  apt-get update -y && apt-get install -y python3-dev libldap2-dev libsasl2-dev ldap-utils tox lcov valgrind
- Install python-ldap:
  pip install python-ldap
- Restart the container if needed:
  docker-compose -f docker-compose-non-dev.yml restart
This should help resolve the issue with LDAP configuration in Superset 4.1.0 rc2.
Is there a reason superset isn't standardizing on python 3.11? It would save a great deal of frustration, given that most of the distro-supplied python packages are built for 3.11.
I'm going to have to give up on rc2. I keep running into one problem after another. I bit the bullet and built the ldap module. Now when Superset starts up, it seems to not know if it should be using postgres or SQLite. superset db upgrade generates numerous foreign key errors, the postgres database is only partially populated, and I get a 500 error when I try to access the UI.
I can understand wanting to slim down the build, but there has to be a better way than just stripping out all the drivers and forcing the burden onto users. This is a mess.
Thanks for testing it out. How are you deploying - with the PyPI package? I ran into some similar problems when I tried out 4.1.0rc2, see my comment here: https://github.com/apache/superset/discussions/29999#discussioncomment-10490167 I'm deploying from Docker and for the purposes of testing I just switched to 4.1.0rc2-dev which fixed things. But maybe we can/should do better to support folks who want a production-ready image without building their own.
In addition to python-ldap I think other missing packages include the Postgres and SQLite drivers, openpyxl (for Excel uploads), pillow (?) for screenshots of alerts & reports.
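A quick way to see which of these are actually absent from a given image is to ask the interpreter directly. This is only a rough sketch: the import names are my assumption of how the packages mentioned above map to modules (python-ldap as ldap, Pillow as PIL, psycopg2 for Postgres), and missing_modules is a hypothetical helper, not something Superset ships.

```python
import importlib.util

# Assumed import names for the packages mentioned above:
# python-ldap -> ldap, Pillow -> PIL, Postgres driver -> psycopg2.
OPTIONAL_MODULES = ["ldap", "psycopg2", "sqlite3", "openpyxl", "PIL"]


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(OPTIONAL_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed modules can be located.")
```

Run it with the same Python that Superset uses inside the container. Note that find_spec only locates a module without importing it, so a compiled extension with a missing shared library could still fail at import time.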
I was creating a docker container, using the superset provided docker container as the base and adding the required drivers.
And yes, it seems like all of that is missing. I think sqlite may be included by default because I didn't install that myself, but everything else is missing. Seems like too much has been stripped out.
I think a better option would be to provide a curated Superset build container, where a user just has to provide a list of the desired drivers, and the build container takes care of making sure everything is installed.
I'm revisiting this since things don't appear to have changed since the rc.
What I've found so far:
- I had to switch from pip install psycopg2 to pip install psycopg2-binary.
- pip install python-ldap fails because it requires gcc.
- apt-get install python3-ldap works; however, it installs libraries for Python 3.11, so these are not available when running under 3.10.
- Switching to 4.1.1rc1-py311, it looks like Superset rolls its own version of Python (/usr/local/lib/python3.11/site-packages), so it still doesn't find the python3-ldap library (under /usr/lib/python3/dist-packages).
I'm still trying to figure out the cleanest option for making this work:
- Symlinking everything under /usr/lib/python3/dist-packages into /usr/local/lib/python3.11/site-packages.
- Installing gcc in the container, letting pip do its thing, then removing it again.
- Another option I can't think of?
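For anyone trying to confirm the path mismatch described above, something like the following can be run with the container's Python. It is a minimal diagnostic sketch, not a fix: is_on_search_path is a hypothetical helper, and the two directories are simply the ones named in this comment.

```python
import os
import sys


def is_on_search_path(directory, search_path=None):
    """Return True if `directory` is one of the interpreter's import paths."""
    paths = sys.path if search_path is None else search_path
    target = os.path.normpath(directory)
    # Skip empty entries (the current directory) and compare normalized paths.
    return any(os.path.normpath(p) == target for p in paths if p)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Version:", "%d.%d" % sys.version_info[:2])
    for candidate in (
        "/usr/lib/python3/dist-packages",
        "/usr/local/lib/python3.11/site-packages",
    ):
        print(candidate, "->", is_on_search_path(candidate))
```

If the dist-packages directory is not on the search path, that would be consistent with the apt-installed python3-ldap being invisible to the interpreter under /usr/local.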
I was successful by installing build-essential, and then autoremoving it after it was done, to avoid ballooning the layer with unnecessary stuff. Here is my working Dockerfile. I'm using a custom init command for some additional stuff but you can ignore that.
FROM apache/superset:4.1.1rc1-py311
USER root
COPY superset_config.py superset-init.sh superset-init-env.sh /app/
COPY --from=common get-secret.sh discover-roleid.sh /usr/local/bin/
RUN <<EOF
apt-get update
apt-get install -y curl links zip unzip basez
#apt-get purge -y --auto-remove
EOF
ENV PIP_ROOT_USER_ACTION=ignore
RUN <<EOF
apt-get install -y build-essential
pip install --upgrade pip
pip install --upgrade setuptools
pip install \
shillelagh[gsheetsapi] \
gunicorn[gevent] \
pymssql \
elasticsearch-dbapi \
elasticsearch-dbapi[opendistro] \
trino[sqlalchemy] \
pip-system-certs \
playwright \
Flask-Limiter[redis] \
Pillow \
psycopg2 \
python-ldap
# mysqlclient gives an error for some reason, so it is left out
apt-get autoremove -y build-essential
EOF
USER superset
CMD [ "/app/superset-init.sh" ]
@ilsaloving thank you so much for sharing that! This got me thinking maybe the Superset project should document how to build your own image like this, so that more people feel comfortable doing it. It's necessitated by the change in 4.1.0 but is probably a good practice anyway.
I started a discussion about it with a link back to your Dockerfile: https://github.com/apache/superset/discussions/31327
Should we keep this open? Not sure if this issue will lead to a docs update, or if it's case closed :)
I think we can close it since: it's not going to lead to an immediate code change; we now have a discussion about a docs update that could come from this; and @ilsaloving has included the workaround here.