Issues re-creating Splitgraph engine

Open harrybiddle opened this issue 3 years ago • 2 comments

I was having some issues with my Splitgraph engine, so I decided to re-install it.

  1. Stop the engine: docker rm -f $(docker ps -aq)

  2. Ensure that the Postgres data directory is deleted too: docker volume prune --force

  3. Re-add the default engine:

    $ sgr engine add
    Password:
    Repeat for confirmation:
    Pulling image splitgraph/engine:0.2.14...
    Extracting:   0%|                                                                                                                   | 0.00/1.00 [00:01<?, ?B/s]
    Downloading:   0%|                                                                                                                  | 0.00/1.00 [00:01<?, ?B/s]
    Creating container splitgraph_engine_default.
    Data volume: splitgraph_engine_default_data.
    Metadata volume: splitgraph_engine_default_metadata.
    Container created, ID dd7ad698e7
    Initializing engine PostgresEngine default (sgr@localhost:5432/splitgraph)...
    Waiting for connection...............
    Error connecting to the engine after 12 retries
    Traceback (most recent call last):
      File "splitgraph/engine/postgres/engine.py", line 699, in _admin_conn
      File "psycopg2/__init__.py", line 127, in connect
      File "psycopg2/extras.py", line 778, in wait_select
    psycopg2.OperationalError: FATAL:  role "sgr" does not exist
    
    error: psycopg2.OperationalError: FATAL:  role "sgr" does not exist
    
  4. I figured that this is because the install script defaults to user=sgr, password=password, port=6432, and these values are now baked into my config, whereas sgr engine add uses different defaults. So I removed the failed engine and tried to add it again with the right settings:

    $ docker rm -f $(docker ps -aq)
    $ sgr engine add --username sgr --password password  --port 6432
    Creating container splitgraph_engine_default.
    Data volume: splitgraph_engine_default_data.
    Metadata volume: splitgraph_engine_default_metadata.
    Container created, ID 9e87df272e
    Initializing engine PostgresEngine default (sgr@localhost:6432/splitgraph)...
    Waiting for connection....
    error: psycopg2.OperationalError: FATAL:  password authentication failed for user "sgr"
    
  5. This time I get 'error: psycopg2.OperationalError: FATAL: password authentication failed for user "sgr"'. I guess this is because the old data directory is still hanging around, so I also remove the data directory:

    docker rm -f $(docker ps -aq)
    docker volume prune
    sgr engine add --username sgr --password password  --port 6432
    

Now it finally works!
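
Note that docker rm -f $(docker ps -aq) and docker volume prune act on every container and every unused volume on the machine, not just Splitgraph's. A minimal sketch of the same clean re-install scoped to the default engine, assuming the container and volume names are the ones printed by sgr engine add in the log above. The function name reinstall_default_engine is illustrative and not part of sgr:

```shell
# Illustrative helper (not part of sgr): clean re-install of the default engine only.
reinstall_default_engine() {
    # Remove just the engine container, using the name printed by `sgr engine add`.
    docker rm -f splitgraph_engine_default
    # Remove just the engine's data and metadata volumes, also named in that output.
    docker volume rm splitgraph_engine_default_data splitgraph_engine_default_metadata
    # Re-add the engine with the credentials and port that are already in the config.
    sgr engine add --username sgr --password password --port 6432
}
```

Calling reinstall_default_engine leaves other containers and volumes on the host untouched.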

Mitigation

I think we can mitigate this with the following changes:

  • sgr engine add should use the same defaults as the install script
  • when waiting for a connection, sgr engine add should use the settings of the engine it has just installed, not what is in the sgconfig (which may be different)
  • sgr engine delete should also delete the Docker volume / data directory for postgres

harrybiddle, Jul 01 '21 14:07

You can also delete the data volumes by passing -v/--with-volumes to sgr engine delete (https://www.splitgraph.com/docs/sgr/engine-management/engine-delete). It's not done by default since you can use sgr engine delete to switch the Docker image or recreate the engine on a different port.
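
A minimal sketch of that flow, assuming the engine in question is the default one and is not currently running. The --with-volumes flag is the one described above, the sgr engine add arguments are taken from the original report, and the function name recreate_default_engine is illustrative only:

```shell
# Illustrative helper (not part of sgr): drop the default engine with its
# volumes, then create a fresh one.
recreate_default_engine() {
    # Delete the engine container together with its data and metadata volumes;
    # only continue to `engine add` if the delete succeeded.
    sgr engine delete --with-volumes &&
        sgr engine add --username sgr --password password --port 6432
}
```

Without --with-volumes, the volumes are kept, so a new engine of the same name starts from the old Postgres data.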

mildbyte, Jul 05 '21 10:07

That's really useful, thanks :). My intention with this ticket was more to communicate my confusion than to report a bug of any kind. I was basically trying to do a clean re-install of the engine, but this volume thing really tripped me up!

By the way, I actually got the rm -f line from a message in Splitgraph: does it make sense to put a -v there?

https://github.com/splitgraph/splitgraph/blob/ef8332b29640230f4eebcbb350a37c67285064b1/splitgraph/commandline/engine.py#L338

Is there perhaps a way for sgr engine add to detect when it is about to pick up an existing volume and make that clear? Or, would it make sense for the sgr CLI to manage volumes too: perhaps listing volumes in use and "dangling" volumes that aren't in use?
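
Until something like that exists in sgr, the check can be approximated from the shell. A rough sketch, assuming engine volumes follow the splitgraph_engine_<name>_data / splitgraph_engine_<name>_metadata pattern seen in the sgr engine add output above (an inference from that log, not a documented guarantee). Both function names are hypothetical:

```shell
# Hypothetical helper: print the volume names an engine of the given name is
# assumed to use, based on the pattern in the `sgr engine add` output.
engine_volumes() {
    name=${1:-default}
    printf '%s\n' "splitgraph_engine_${name}_data" "splitgraph_engine_${name}_metadata"
}

# Hypothetical helper: print those volumes that already exist in Docker and
# would therefore be picked up by a newly added engine of the same name.
existing_engine_volumes() {
    for volume in $(engine_volumes "$1"); do
        # `docker volume inspect` exits non-zero when the volume does not exist.
        if docker volume inspect "$volume" >/dev/null 2>&1; then
            printf '%s\n' "$volume"
        fi
    done
}
```

Running existing_engine_volumes default before sgr engine add prints nothing when the engine would start from clean volumes. For "dangling" volumes in general, docker volume ls --filter dangling=true lists volumes that no container references.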

harrybiddle, Jul 05 '21 11:07