coaster
coaster copied to clipboard
failsafe_add should use UPSERT where available
Nested transactions are nice, but UPSERT is more reliable. failsafe_add
should use it when it detects the underlying engine supports it.
SQLAlchemy supports custom compilation for standard statements like INSERT (docs). We could have tables specifically flag support for upserts in __table_args__
like this:
class MyTable(db.Model):
__tablename__ = 'my_table'
__table_args__ = ({'upsert': None},)
This should generate an ON CONFLICT DO NOTHING
clause. The alternative ON CONFLICT DO UPDATE
is trickier as it requires specifying specific columns to update. In this case the syntax is:
class MyTable(db.Model):
__tablename__ = 'my_table'
__table_args__ = ({'upsert': ['col1', 'col2', …]},)
Some fields like id
and created_at
should not be updated in case of a conflict, while most other fields need to be. The table should whitelist (or blacklist?) fields to update.
Finally, this approach to upsert will result in the parameter to failsafe_add
no longer guaranteed to be the same entity persisted to disk. session.merge
can't be used either because the primary key is not guaranteed to match, especially with UUID primary keys that are currently generated client-side in Coaster. Since PostgreSQL's ON CONFLICT
clause does not tell us whether a conflict occurred, failsafe_add
has no option but to attempt reloading each time, even if the object is no longer needed. In this case the caller should avoid passing filters to failsafe_add
so it skips the reload (and failsafe_add
should support optional filters).
Proposed syntax change for the upsert
key in __table_args__
:
-
None
: No upsert at all -
[]
:ON CONFLICT DO NOTHING
-
['col1', 'col2', …]
:ON CONFLICT DO UPDATE SET col1 = :col1, col2 = :col2, …
Problem: PostgreSQL's docs say ON CONFLICT DO UPDATE
requires a mandatory conflict target (optional for DO NOTHING
), so the SQL syntax will be like one of the following:
-
ON CONFLICT (col_name) DO UPDATE …
-
ON CONFLICT ON CONSTRAINT constraint_name DO UPDATE …
This means we can't simply specify {'upsert': ['col1', 'col2', …]}
in __table_args__
. We also need somewhere to specify the conflict target in either of its forms.
It is also possible to use an INSERT ... WHERE NOT EXISTS
clause that works on PostgreSQL < 9.5.
- http://stackoverflow.com/a/36377530/78903
- http://stackoverflow.com/a/17991647/78903
This is not as reliable as an upsert, but is still better than making distinct roundtrips to the database, and could still be protected inside a nested transaction.
SQLAlchemy has support for upserts now but only with insert
statements, not at the declarative level.