Issues re-creating Splitgraph engine
I was having some issues with my Splitgraph engine, so I decided to re-install it.
-
Stop the engine
docker rm -f $(docker ps -aq)
-
Ensure that the Postgres data directory is deleted too
docker volume prune --force
-
Re-add the default engine:
$ sgr engine add
Password:
Repeat for confirmation:
Pulling image splitgraph/engine:0.2.14...
Extracting: 0%| | 0.00/1.00 [00:01<?, ?B/s]
Downloading: 0%| | 0.00/1.00 [00:01<?, ?B/s]
Creating container splitgraph_engine_default.
Data volume: splitgraph_engine_default_data.
Metadata volume: splitgraph_engine_default_metadata.
Container created, ID dd7ad698e7
Initializing engine PostgresEngine default (sgr@localhost:5432/splitgraph)...
Waiting for connection...............
Error connecting to the engine after 12 retries
Traceback (most recent call last):
  File "splitgraph/engine/postgres/engine.py", line 699, in _admin_conn
  File "psycopg2/__init__.py", line 127, in connect
  File "psycopg2/extras.py", line 778, in wait_select
psycopg2.OperationalError: FATAL: role "sgr" does not exist
error: psycopg2.OperationalError: FATAL: role "sgr" does not exist
-
I figured this is because the default install script uses user=sgr, password=password, port=6432, and these are now baked into my config, but the defaults of sgr engine add are different. So I removed the failed engine and tried to add it again with the right defaults:
$ docker rm -f $(docker ps -aq)
$ sgr engine add --username sgr --password password --port 6432
Creating container splitgraph_engine_default.
Data volume: splitgraph_engine_default_data.
Metadata volume: splitgraph_engine_default_metadata.
Container created, ID 9e87df272e
Initializing engine PostgresEngine default (sgr@localhost:6432/splitgraph)...
Waiting for connection....
error: psycopg2.OperationalError: FATAL: password authentication failed for user "sgr"
-
This time I got a different error: password authentication failed for user "sgr". I guess this is because the old data directory is still hanging around, so I also removed the data volumes:
docker rm -f $(docker ps -aq)
docker volume prune
sgr engine add --username sgr --password password --port 6432
Now it finally works!
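Note that docker rm -f $(docker ps -aq) and docker volume prune act on every container and every unused volume on the machine, not just Splitgraph's. A narrower cleanup can be sketched from the resource names printed by sgr engine add above. This is only a sketch: it assumes the container and volume names always follow the splitgraph_engine_<name> pattern seen in that output, and the helper names are made up for illustration. It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helpers: derive resource names from the engine name, assuming
# the splitgraph_engine_<name>[_data|_metadata] pattern seen in the output above.
engine_container() { echo "splitgraph_engine_${1}"; }
engine_volume_names() { echo "splitgraph_engine_${1}_data splitgraph_engine_${1}_metadata"; }

# Print (do not run) the commands that remove one engine's container and volumes.
cleanup_commands() {
    name="${1:-default}"
    echo "docker rm -f $(engine_container "$name")"
    echo "docker volume rm $(engine_volume_names "$name")"
}

cleanup_commands default
```

Once the printed commands look right, they can be run with cleanup_commands default | sh, followed by the same sgr engine add --username sgr --password password --port 6432 call as above.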
Mitigation
I think we can mitigate this by doing the following:
-
sgr engine add should use the same defaults as the install script
-
When waiting for a connection, sgr engine add should use the settings for what it has just installed, not what is in the sgconfig (which may be different)
-
sgr engine delete should also delete the Docker volume / data directory for Postgres
You can also delete the data volumes by passing -v/--with-volumes to sgr engine delete (https://www.splitgraph.com/docs/sgr/engine-management/engine-delete). It's not done by default since you can use sgr engine delete to switch the Docker image or recreate the engine on a different port.
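For a clean re-install, that suggestion could look roughly like the sketch below. The --with-volumes flag and the sgr engine add arguments are the ones quoted in this thread; the RUN variable and the recreate_engine function are hypothetical wrappers added so the commands are only printed by default. sgr engine delete may also ask for confirmation when run for real.

```shell
#!/bin/sh
# Dry-run guard (an assumption of this sketch, not part of sgr):
# by default each command is echoed; run with RUN= (empty) to execute for real.
RUN="${RUN-echo}"

recreate_engine() {
    # Remove the engine container together with its data and metadata volumes.
    $RUN sgr engine delete --with-volumes
    # Re-create it with the credentials and port the install script used.
    $RUN sgr engine add --username sgr --password password --port 6432
}

recreate_engine
```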
That's really useful, thanks :). My intention with this ticket was more to communicate my confusion than to report a bug of any kind. I was basically trying to do a clean re-install of the engine, but this volume thing really tripped me up!
By the way, I actually got the rm -f line from a message in Splitgraph: does it make sense to put a -v there?
https://github.com/splitgraph/splitgraph/blob/ef8332b29640230f4eebcbb350a37c67285064b1/splitgraph/commandline/engine.py#L338
Is there perhaps a way for sgr engine add to detect when it is about to pick up an existing volume and make that clear? Or, would it make sense for the sgr CLI to manage volumes too: perhaps listing volumes in use and "dangling" volumes that aren't in use?
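Until something like that exists in sgr, a rough manual check is possible with Docker alone. The sketch below is not an sgr feature: filter_engine_volumes is a hypothetical helper, and it assumes engine volumes are always named splitgraph_engine_<name>_data or splitgraph_engine_<name>_metadata, as in the output earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read volume names (one per line) on stdin and keep only
# those that look like Splitgraph engine volumes.
filter_engine_volumes() {
    grep -E '^splitgraph_engine_.+_(data|metadata)$' || true
}

# Usage with Docker (not executed here):
#   all engine volumes, which a new engine of the same name would pick up:
#     docker volume ls -q | filter_engine_volumes
#   engine volumes not attached to any container ("dangling"):
#     docker volume ls -q --filter dangling=true | filter_engine_volumes
```

If the first command prints anything before sgr engine add is run, the new engine would presumably start on top of that existing data rather than a fresh database.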