piccolo
piccolo copied to clipboard
Add an API for upserting
Based on this discussion: https://github.com/piccolo-orm/piccolo/discussions/248
We don't currently have an API for upserting.
This would be based around the ON CONFLICT
clause you can add to INSERT
queries.
https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT
There will probably be two APIs needed. Something for insert
queries, and a convenience method for objects
queries.
For example:
# For insert
Band.insert(Band(name='Pythonistas')).on_conflict(...).run_sync()
# For objects
Band.objects().upsert(Band.name == 'Pythonistas', {Band.popularity: 5000}).run_sync()
I would like to work on this if It is open and unassigned.
@AbhijithGanesh Great, thanks. This might be a tricky issue though, just so you're aware!
Thanks for the heads up , I'll read upon docs and get this reviewed before it gets merged
Clarification: The upserting method/process is like HTTP PUT right?
This requires more interaction with the maintainer than I thought, where can I reach out to you? @dantownsend
Based on this discussion: #248
We don't currently have an API for upserting.
This would be based around the
ON CONFLICT
clause you can add toINSERT
queries.https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT
There will probably be two APIs needed. Something for
insert
queries, and a convenience method forobjects
queries.For example:
# For insert Band.insert(Band(name='Pythonistas')).on_conflict(...).run_sync() # For objects Band.objects().upsert(Band.name == 'Pythonistas', {Band.popularity: 5000}).run_sync()
So I am implementing the insert part at query>methods>insert.py and upsert as a new file. Would that be okay?
You're best reaching me here - I don't use much else besides email.. You can start a separate discussion thread if you prefer.
That's right - an UPSERT is a bit like a PUT. When an insert fails due to a unique constraint, it will update the existing row instead of raising an error.
Here are the docs:
https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT https://sqlite.org/lang_conflict.html
It's a tricky one - I need to think through the API a bit, as the SQLite and Postgres implementations are a bit different.
Feel free to experiment with it while I give it some more thought, or feel free to pick up another issue in the meantime.
On a high level, the insert method and upsert method would ideally be sharing similar database querystrings right(Exclude the query, just the structure)
That's right, it will most likely be an on_conflict
argument to Insert
. If you look at the codebase, there are many 'delegates'. For example:
https://github.com/piccolo-orm/piccolo/blob/8ec9d10313c9ac44916f01e27c9ef58eb22ae0d9/piccolo/query/mixins.py#L38-L53
We would have a new one called OnConflict
.
You can set the on conflict value using the constructor:
Band.insert(Band(name='Pythonistas'), on_conflict=Conflict.do_nothing).run_sync()
Or you can use a method, which sets the same underlying value on the OnConflict
delegate:
Band.insert(Band(name='Pythonistas')).on_conflict(Conflict.do_nothing).run_sync()
And Conflict
would be an Enum
.
Thanks for this vital information, Can you please assign me to this issue
Can you quickly tell me about delegates quickly? I know they're a part of the mixins, can you please give me a description(I searched the code base for onconflict and couldn't find relevant references)
There is no code related to on conflict
yet. A delegate is just a class which stores data - if you look at mixins.py
there's a bunch of them, and they're all pretty similar. The reason we use them instead of inheritance is because it's better for type annotations.
got it, I shall use them
Heyy, quick question, can I use exists.py to validate? I have currently based my function design on update and insert as of now,
You can use exists.py in unit tests, but it should be possible to implement the on conflict
changed in a single query, by extending the existing Insert
class.
@dantownsend I am super sorry for leaving this on open! I shall finish this ASAP!
That's right, it will most likely be an
on_conflict
argument toInsert
. If you look at the codebase, there are many 'delegates'. For example:https://github.com/piccolo-orm/piccolo/blob/8ec9d10313c9ac44916f01e27c9ef58eb22ae0d9/piccolo/query/mixins.py#L38-L53
We would have a new one called
OnConflict
.You can set the on conflict value using the constructor:
Band.insert(Band(name='Pythonistas'), on_conflict=Conflict.do_nothing).run_sync()
Or you can use a method, which sets the same underlying value on the
OnConflict
delegate:Band.insert(Band(name='Pythonistas')).on_conflict(Conflict.do_nothing).run_sync()
And
Conflict
would be anEnum
.
What slots would on_conflict take? Enum?
@dantownsend can you please explain which slot on_conflict should take
@AbhijithGanesh You just create a new one:
@dataclass
class Insert(Query):
__slots__ = ("add_delegate", "on_conflict_delegate")
Slots are a minor performance enhancement which tells Python that only those attributes are allowed, which makes it slightly more memory efficient.
Is anyone working on this ?
@deserve-shubham Not at the moment, would be nice to add it.
@deserve-shubham Please work on this if you can, I'm unable to commit to this!
@dantownsend I don't know if you meant something like this, but here is my attempt for upserting api. This will also solve duplicates for manually entering rows for M2M (#707) because we will be able to set a conflict in this
async def _run(self):
rows = self.rows
unsaved = [i for i in rows if not i._exists_in_db]
if unsaved:
await rows[0].__class__.insert(
*unsaved, on_conflict=Conflict.do_nothing <- set do_nothing in conflict as suggested in the issue
).run()
Another question. Should Sqlite versions less than 3.24.0
also be supported because the conflict syntax is different.
@sinisaos we can detect the sqlite version and potentially change the syntax accordingly. It's accessible via SQLiteEngine. It would be good to get the on conflict clause done.
@sinisaos we can detect the sqlite version and potentially change the syntax accordingly. It's accessible via SQLiteEngine. It would be good to get the on conflict clause done.
It should be something like this (but I can't try now and need to test it with some older version of Sqlite)
if (
engine_type == "sqlite"
and self.table._meta.db.get_version_sync() < 3.24
):
if self.on_conflict_delegate._on_conflict is not None:
if self.on_conflict_delegate._on_conflict.value == "DO NOTHING":
base = f'INSERT OR IGNORE "{self.table._meta.tablename}"'
else:
base = f'INSERT OR REPLACE "{self.table._meta.tablename}"'
else:
base = f'INSERT INTO "{self.table._meta.tablename}"'
I don't know if you had time to look at the changes I made in here. If it's not good enough for PR (or it's not how you imagine conflict implementation), I wouldn't bother with it anymore and leave it to someone else.