Postgres CTE insert can create unusable Record object
I am doing an insert into two tables using a CTE and RETURNING.
In SQLAlchemy Core:
# db.event and db.obj are the SQLAlchemy Table objects for the two tables
obj_to_insert = {'a': 4, 'b': 22}

# insert into event, returning the generated id as a CTE
q = db.event.insert(
).values(
    event_type="blah"
).returning(
    db.event.c.id
)
q1 = q.cte("eventins")

# insert into obj, pulling event_id from the CTE
q2 = db.obj.insert(
).values(
    event_id=q1.select(), **obj_to_insert
).returning(
    db.obj
)
Raw SQL equivalent:
WITH eventins AS (
    INSERT INTO event (event_type) VALUES ('blah') RETURNING id
)
INSERT INTO obj (event_id, a, b)
VALUES ((SELECT id FROM eventins), 4, 22)
RETURNING id, a, b
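For completeness, the statement is executed with fetch_one, something like this (a minimal sketch; database here is the connected databases.Database instance):

row = await database.fetch_one(query=q2)
print(row)  # expected: a Record exposing just id, a, b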
The result of fetch_one is a very confused Record object.
While Postgres will only return the id, a, and b columns, the column maps and arrays in the Record also contain an entry for the id returned by the CTE insert. This makes the Record object unusable, as the index mappings are incorrect.
I believe the issue is in PostgresConnection._compile using _result_columns, since that list also contains the CTE's RETURNING column.
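To see the extra column, the compiled statement can be inspected directly (a sketch; _result_columns is a private SQLAlchemy attribute, and the same list databases reads in _compile):

from sqlalchemy.dialects import postgresql

compiled = q2.compile(dialect=postgresql.dialect())
# one entry per column the compiler expects in the result,
# including the id from the CTE's RETURNING clause
for entry in compiled._result_columns:
    print(entry)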
Can you please inspect what the query rendered by SQLAlchemy looks like, and what results you can achieve by accessing the raw driver interface?
# Something like
async with database.connection() as connection:
    raw_connection = connection.raw_connection
    results = await raw_connection.<method as of the driver doc>(query: str)
    ...
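Assuming the default asyncpg backend, that could look roughly like this (a sketch; fetch is asyncpg's method for running a query and returning its rows, and the SQL is the raw statement from above):

raw_sql = """
WITH eventins AS (
    INSERT INTO event (event_type) VALUES ('blah') RETURNING id
)
INSERT INTO obj (event_id, a, b)
VALUES ((SELECT id FROM eventins), 4, 22)
RETURNING id, a, b
"""

async with database.connection() as connection:
    raw_connection = connection.raw_connection  # asyncpg.Connection
    rows = await raw_connection.fetch(raw_sql)
    print(rows)  # should contain only id, a, b, per the outer RETURNING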
I'd happily do that. Apologies, but how does one get the fully rendered SQLAlchemy query?
You can generally do print(query) or str(query) on the query object.
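For a statement that carries bound values, compiling it against the Postgres dialect shows the fully rendered form (a sketch; literal_binds inlines the parameter values into the SQL):

from sqlalchemy.dialects import postgresql

rendered = q2.compile(
    dialect=postgresql.dialect(),
    compile_kwargs={"literal_binds": True},
)
print(rendered)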
I've run into this issue as well, and spent some time digging into it.
The issue does indeed lie with how SQLAlchemy populates _result_columns: it seems to contain more columns than are actually returned in the row. This breaks the assumptions databases makes when generating the column maps.
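Roughly, the mismatch looks like this (a hand-written illustration rather than the actual databases internals; I'm assuming the extra CTE column sorts ahead of the outer RETURNING columns purely to show the effect):

# the compiler reports four result columns, but Postgres only sends three values back
result_columns = ["id", "id", "a", "b"]   # CTE id + outer RETURNING id, a, b
row = (7, 4, 22)                          # the actual driver row: id, a, b

# a naive name -> index map built from result_columns is shifted by the extra entry
column_map = {name: idx for idx, name in enumerate(result_columns)}
print(column_map["a"], row[column_map["a"]])   # index 2 -> 22, which is b's value, not a's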
A couple of things I've noted:
- In order for a Record object to properly distinguish columns with the same name, an explicit label must be applied during the select (see the note portion of my example, and the sketch after this list).
- Using the integer index as a key seems fragile, but so does using the string. If we switch to using the string key, it breaks a built-in test whose query columns aren't unique; assuming unique column names is probably fair, though. ¯\_(ツ)_/¯
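A minimal sketch of what I mean by an explicit label, applied to the query from the original report (obj_id is just a name I picked for the example):

q2 = db.obj.insert(
).values(
    event_id=q1.select(), **obj_to_insert
).returning(
    db.obj.c.id.label("obj_id"),  # labelled so it can't collide with the CTE's id
    db.obj.c.a,
    db.obj.c.b,
)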
I've attached a link to my solution, but it's not ideal. I've also attached a file containing all my debug information. I'll keep digging for a better solution, but hopefully this points someone in the right direction.
- https://github.com/encode/databases/compare/master...n3fariox:cte-bug
- debug.txt