pytest-mock-resources icon indicating copy to clipboard operation
pytest-mock-resources copied to clipboard

Redshift fixture should not enforce primary, foreign key, or unique constraints

Open langelgjm opened this issue 4 years ago • 3 comments

Is your feature request related to a problem? Please describe.

Because the Redshift fixture is based on Postgres, and Postgres enforces constraints, the Redshift fixture does not behave like Redshift. E.g., the Redshift fixture will raise integrity errors when attempting to insert rows with duplicate primary keys, whereas Redshift itself will not.

The goal of the Redshift fixture is to provide a fixture that behaves like Redshift. This includes the behavior of not respecting constraints, since that very behavior is likely to be desirable to test.

Describe the solution you'd like

We can do this ad-hoc now by something like:

alter_table_drop_constraint_statements = [
    f"ALTER TABLE {table} DROP CONSTRAINT IF EXISTS {table.name}_pkey CASCADE" for table in Base.metadata.tables.values()
]
redshift = create_redshift_fixture(Base, Statements(*alter_table_drop_constraint_statements), scope="session")

But this could also happen automatically inside the fixture using SQLAlchemy's ability to execute custom DDL on an after_create DDL event.

Describe alternatives you've considered

An alternative would be to override the SQLAlchemy Redshift dialect to not emit constraints during DDL compilation. However, Redshift does accept DDL with such constraints, and as such this solution does not provide true parity between the fixture and Redshift.

langelgjm avatar Sep 10 '19 20:09 langelgjm

Note that Redshift does enforce NOT NULL constraints so we should leave that behavior as-is.

langelgjm avatar Sep 10 '19 20:09 langelgjm

When is this actually a desirable behavior? I feel like the postgresyness of this is actually a boon. Is there a specific example of where it wouldn't be a bug if you had code that was violating your constraints?

DanCardin avatar Sep 23 '19 15:09 DanCardin

This issue arose from a specific example. I have real-world data that lacks a primary key. With the way PMR is currently set up, I am forced to define a primary key for my models.

I now need to test and replicate behavior when my real-world input data set contains a duplicate. I am forced to choose a (false) primary key. I then attempt to insert the input data and get an integrity error. In contrast, in Redshift, I get no error.

Also, philosophically, the desired and expected behavior of PMR is to provide fixtures that behave like the thing they are mocking. To the extent that a fixture deviates from the behavior of the thing it is mocking, that is a deficiency that should be corrected when feasible.

langelgjm avatar Sep 23 '19 20:09 langelgjm