chore(docs): reorder, clarify PyPI installation
SUMMARY
I tried installing from PyPI and found that some steps were missing or out of order. I removed irrelevant text, added other instructions, and reordered things.
Any concerns about suggesting that people install into a directory /superset and set up their virtual environment there?
There's now some redundancy with Configuring Superset docs, as they also talk about creating a superset_config.py file. Because the proper time to create one is in the install phase, when you are specifying a SECRET_KEY, I feel this belongs in this page. I think we should add the corresponding creating-a-config content to the install methods, then remove from Configuring Superset as everyone will have done it by that point.
TESTING INSTRUCTIONS
Follow these steps on a fresh Linux machine and see if they work. Please test!
Those are great additions, thanks @surapuramakhil!
I think we should add the corresponding creating-a-config content to the install methods, then remove from Configuring Superset as everyone will have done it by that point.
I agree, Configuring Superset should be about options available after Installation, much like day-to-day administration of an environment.
I agree, Configuring Superset should be about options available after Installation, much like day-to-day administration of an environment.
@artofcomputing I have added this to the BugHerd task board.
Do we still need eyes on this @sfirke? Sorry it seems to have slipped under the collective radar.
@rusackas yes, this still needs a review. It's tricky b/c I'm not enough of an expert to feel 100% sure I'm right, but I'm confident this improves on docs steps that are straight-up missing or out of order.
I'm a totally new Superset user (on Linux) and ran into problems when following the PyPI installation instructions that are in the live docs. This PR is super useful, as it shows what the missing configuration steps are. The instructions clarified things a lot and worked well, but in the end I called the venv folder "venv" instead, as that's what I'm used to. I'm also used to these virtual environment folders being totally rebuildable by pip (and so not containing any custom files), so I put the Superset config file in the main folder instead. That is, I went with the following structure:
superset/ # Project folder
superset/venv/ # Virtual environment with python and all packages and dependencies
superset/superset_config.py # Configuration file
Specifically:
mkdir superset
cd superset
python3 -m venv venv
. venv/bin/activate
python3 -m pip install apache-superset
touch superset_config.py
export SUPERSET_CONFIG_PATH=superset_config.py
echo "SECRET_KEY='$(openssl rand -base64 42)'" | tee -a $SUPERSET_CONFIG_PATH
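One possible refinement of the steps above, offered as a sketch rather than the documented procedure: because `tee -a` appends, re-running the last command adds a second SECRET_KEY line, and the later assignment would take effect when the config file is loaded. The helper below, `ensure_secret_key` (a hypothetical name, not part of Superset), writes a key only when the file does not define one yet. It assumes `openssl` is on the PATH, as the commands above already do.

```shell
# Hypothetical helper: append a SECRET_KEY line only if the config file
# does not define one yet, so re-running setup never replaces the key.
ensure_secret_key() {
  config_file=$1
  # Create the file if it is missing; leave existing content untouched.
  touch "$config_file"
  if grep -q '^SECRET_KEY' "$config_file"; then
    echo "SECRET_KEY already present in $config_file"
  else
    echo "SECRET_KEY='$(openssl rand -base64 42)'" >> "$config_file"
  fi
}

# Usage, from the project folder (assumed to be the superset/ folder above):
#   export SUPERSET_CONFIG_PATH="$PWD/superset_config.py"
#   ensure_secret_key "$SUPERSET_CONFIG_PATH"
```

Using `$PWD` makes SUPERSET_CONFIG_PATH absolute, so later `superset` commands should still find the file when run from a different directory.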
Setting up with example data and running development server:
export FLASK_APP=superset
superset db upgrade
superset fab create-admin
superset load_examples
superset init
superset run -p 8088 --with-threads --reload --debugger
(The SUPERSET_CONFIG_PATH environment variable still needs to be set for the above to work.)
It might be helpful to add a comment noting that setting FLASK_APP globally is generally not a good idea, as one might want to run several Flask apps on the same machine.
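For illustration, here are two ways to scope FLASK_APP instead of exporting it for the whole session. This is a minimal sketch using plain shell features; the `echo` lines stand in for real `superset ...` commands so the snippet runs without Superset installed.

```shell
# Start from a clean state so the demonstration is reproducible.
unset FLASK_APP

# 1. Per-command assignment: FLASK_APP exists only for this one process.
FLASK_APP=superset sh -c 'echo "inside command: FLASK_APP=$FLASK_APP"'

# 2. Subshell: the export is visible to every command inside the
#    parentheses (for example several superset calls) and gone afterwards.
(
  export FLASK_APP=superset
  echo "inside subshell: FLASK_APP=$FLASK_APP"
)

# Back in the parent shell, the variable was never set.
echo "afterwards: FLASK_APP=${FLASK_APP:-<unset>}"
```

Either form leaves the parent shell untouched, so another Flask app on the same machine can use its own FLASK_APP value.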
@olof-dev that is very helpful feedback, thank you! I will incorporate that, including naming the venv venv -- that is standard as you point out, and having it inside the superset directory addresses my concern about collision with another project. Adding your feedback should unblock this and make it merge-able. Thanks for letting me know that it was an improvement, too.
A commenter in Slack shares this feedback:
I have the step
pip install --upgrade setuptools pip
before installing apache-superset in my documentation. Also, after installing apache-superset, it could at least be mentioned that now would be a good time to install other packages which are needed for running Superset (in our case it would be ldap-packages and some special packages that are needed to communicate with certain databases). Besides, I am missing a system requirements section, especially since the newer versions of Superset require Python 3.9 (it took me half a day to find out that this requirement was what stopped me from getting the newest version, and I still didn't get 4.0.2 working on my linux 20.04-server).
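To go with the system requirements point in that comment, here is a sketch of a pre-install check. The 3.9 minimum is taken from the commenter's report and is not verified against a specific Superset release; `version_at_least` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Only the MAJOR.MINOR parts are compared; any patch level is ignored.
version_at_least() {
  have_major=${1%%.*}
  have_rest=${1#*.}
  have_minor=${have_rest%%.*}
  want_major=${2%%.*}
  want_rest=${2#*.}
  want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Report on the interpreter that `python3` resolves to, if there is one.
if command -v python3 >/dev/null 2>&1; then
  py_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if version_at_least "$py_version" 3.9; then
    echo "Python $py_version looks new enough"
    # Next step from the comment above, run inside the activated venv:
    #   python3 -m pip install --upgrade setuptools pip
  else
    echo "Python $py_version is older than the assumed 3.9 minimum" >&2
  fi
else
  echo "python3 not found on PATH" >&2
fi
```

Running this before `pip install apache-superset` would surface an outdated interpreter up front rather than partway through the install.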
Taking stock today: I need to incorporate feedback from @olof-dev above and the Slack comment, then let's merge this as an improvement on what we have now, even if it's imperfect.