stackexchange-dump-to-postgres icon indicating copy to clipboard operation
stackexchange-dump-to-postgres copied to clipboard

`DROP TABLE` statements are only executed in the `public` schema

Open rodrigo-morales-1 opened this issue 4 years ago • 0 comments

Prologue

I've noticed that *_pre.sql files have two statements: DROP TABLE and CREATE TABLE. See some examples below.

https://github.com/Networks-Learning/stackexchange-dump-to-postgres/blob/49d8358a83bb573c63c2715b122afb2fca98376c/sql/Users_pre.sql#L1-L2

https://github.com/Networks-Learning/stackexchange-dump-to-postgres/blob/49d8358a83bb573c63c2715b122afb2fca98376c/sql/Badges_pre.sql#L1-L2

https://github.com/Networks-Learning/stackexchange-dump-to-postgres/blob/49d8358a83bb573c63c2715b122afb2fca98376c/sql/Tags_pre.sql#L1-L2

These files are executed in the public schema because the connection retrieves its configuration from the dbConnectionParam variable which doesn't consider the schema specified by the user.

https://github.com/Networks-Learning/stackexchange-dump-to-postgres/blob/49d8358a83bb573c63c2715b122afb2fca98376c/load_into_pg.py#L185-L203

The problem

This implies that if the user runs the command in a given schema twice, an error will occur because the following occurs

  1. The user executes the script and specifies a given schema.
    1. The DROP TABLE is executed in the public schema.
    2. The CREATE TABLE is executed in the public schema.
    3. The table is moved to the given schema.
  2. Again, The user executes the script and specifies a given schema
    1. The DROP TABLE is executed in the public schema even though the table exists in the specified schema because it was moved in the previous execution of the command.
    2. The CREATE TABLE is executed in the public schema.
    3. The table is moved from the public schema to the given schema which fails because the DROP TABLE didn't drop the table in the given schema but in the public schema.

Example

The following code blocks shows an example of this happening

$ python load_into_pg.py \
-f ../cardano.meta/Users.xml \
-t Users \
-n foo \
-d stackexchange
This will drop the Users table. Are you sure [y/n]?y
Pre-processing ...
Pre-processing took 0.0 seconds
Processing data ...
Table 'Users' processing took 0.0 seconds
Post processing ...
Post processing took 0.0 seconds
python load_into_pg.py \
-f ../cardano.meta/Users.xml \
-t Users \
-n foo \
-d stackexchange
This will drop the Users table. Are you sure [y/n]?y
Pre-processing ...
Pre-processing took 0.0 seconds
Processing data ...
Table 'Users' processing took 0.0 seconds
Post processing ...
Post processing took 0.0 seconds
Error in dealing with the database.
pg.Error (42P07): ERROR:  relation "users" already exists in schema "foo"

relation "users" already exists in schema "foo"

rodrigo-morales-1 avatar Aug 18 '21 02:08 rodrigo-morales-1