sqlite-utils
sqlite-utils copied to clipboard
`table.upsert/upsert_all` fails to write row when `not_null` is absent and the schema definition includes `not_null`
I found a bug where calls to upsert() and upsert_all() don't write rows if:
- There's a
not_nullkwarg on the table definition, and - You don't pass
not_nullin theupsert()call
This doesn't affect insert() or insert_all().
Repro example
from sqlite_utils import Database
db = Database(":memory:")
db["birds"].create(
{"id": int, "name": str},
pk="id",
not_null={"name"},
)
db["birds"].insert({"id": 1, "name": "flamingo"})
print("insert: ", len(list(db["birds"].rows)))
db["birds"].insert({"id": 2, "name": "goldfinch"}, not_null={"name"})
print("insert, not_null: ", len(list(db["birds"].rows)))
db["birds"].upsert({"id": 3, "name": "loon"}, pk="id")
print("upsert: ", len(list(db["birds"].rows)))
db["birds"].upsert({"id": 4, "name": "stork"}, pk="id", not_null={"name"})
print("upsert, not_null: ", len(list(db["birds"].rows)))
db["birds"].insert_all([{"id": 5, "name": "magpie"}])
print("insert_all: ", len(list(db["birds"].rows)))
db["birds"].insert_all([{"id": 6, "name": "gull"}], not_null={"name"})
print("insert_all, not_null: ", len(list(db["birds"].rows)))
db["birds"].upsert_all([{"id": 7, "name": "dodo"}], pk="id")
print("upsert_all: ", len(list(db["birds"].rows)))
db["birds"].upsert_all([{"id": 8, "name": "crow"}], pk="id", not_null={"name"})
print("upsert_all, not_null: ", len(list(db["birds"].rows)))
Here's the output (with my comments on the right)
sqlite-utils, version 3.36
insert: 1
insert, not_null: 2
upsert: 2 <-- no write
upsert, not_null: 3
insert_all: 4
insert_all, not_null: 5
upsert_all: 5 <-- no write
upsert_all, not_null: 6
So calling upsert() and upsert_all() without passing not_null isn't actually inserting a record.
This problem goes away if:
- I add the
not_nullargument to theupsert()calls, or - Remove the
not_nullfrom the table definition
Version info
- Python:
3.12.0 - sqlite-utils:
3.36 - sqlite:
3.39.5 2022-10-14
Related issues
It seems very likely that this is related to #538, which was also about upsert() and the not_null flag.
I don't know if this is a clue, but while testing this I noticed that insert() seems to be completely ignoring the not_null parameter if it doesn't have to create the table, whereas upsert() doesn't. Here's another example:
from sqlite_utils import Database
db = Database(":memory:")
db["birds"].create(
{"id": int, "name": str},
pk="id",
not_null={"name"},
)
db["birds"].insert({"id": 1, "name": "flamingo"}, not_null={"wingspan"})
print("insert complete!")
db["birds"].upsert({"id": 2, "name": "goldfinch"}, pk="id", not_null={"wingspan"})
print("upsert complete!")
In this case the insert() completes, but the upsert() fails with sqlite3.OperationalError: table birds has no column named wingspan.
The only way to make the insert() fail is to let it auto-create the table:
from sqlite_utils import Database
db = Database(":memory:")
db["birds"].insert({"id": 1, "name": "flamingo"}, not_null={"wingspan"})
print("insert complete!")
db["birds"].upsert({"id": 2, "name": "goldfinch"}, pk="id", not_null={"wingspan"})
print("upsert complete!")
at which point I get an error AssertionError: not_null set {'wingspan'} includes items not in columns {'id', 'name'}.
A further, potentially interesting observation: this only seems to apply when you're inserting new rows. If your upsert is modifying an existing row, it works without supplying the not_null list.
Another example:
from sqlite_utils import Database
db = Database(":memory:")
db["birds"].create(
{"id": int, "name": str, "color": str},
pk="id",
not_null={"name"},
)
db["birds"].insert({"id": 1, "name": "flamingo"})
print(next(db["birds"].rows))
# initial insert
# {'id': 1, 'name': 'flamingo', 'color': None}
db["birds"].upsert({"id": 1, "name": "goldfinch"}, pk="id")
print(next(db["birds"].rows))
# modifying the existing row
# {'id': 1, 'name': 'goldfinch', 'color': None}
db["birds"].upsert({"id": 1, "color": "blue"}, pk="id")
print(next(db["birds"].rows))
# modifying a column which is allowed to be non-null
# {'id': 1, 'name': 'goldfinch', 'color': 'blue'}
(That second case is how I spotted it – I was upserting into a table with not-null columns, but modifying a nullable column on an existing row.)
i can confirm all of the above.
upserts do not work on tables where there are not-null columns, but only the inserts do not work - updates work fine.
i am not sure if there would be a better fix, but one possible fix would be to query sqlite_master if not_null is DEFAULT, and if the table already exists, use its own reckoning of which columns are not_null. I am going to implement this fix in our fork.
One other thing worth noting - the documentation asserts that the pk argument is not necessary for upserts if you are certain that the table has already been created. However, that is not the case - upsert will raise an exception if you do not provide pk. My proposed fix of using sqlite_master to get the actual 'values' for not_null would also work as a way of filling in pk for existing tables, to match the documentation.