dolt
dolt copied to clipboard
Syntax Error Occurs When Using AS Clause with ON DUPLICATE KEY UPDATE
Using version: v0.18.0
When executing the following query, an error ( syntax error at position 134 near 'AS') occurs. )
INSERT INTO test_campaigns (email, end_time, name, registered_time) VALUES (?,?,?,?) AS new ON DUPLICATE KEY UPDATE email = new.email, end_time = new.end_time
What could be the cause of this?
Starting with MySQL 8.0.19, it is possible to use an alias for the row, with, optionally, one or more of its columns to be inserted, following the VALUES or SET clause, and preceded by the AS keyword. Using the row alias new, the statement shown previously using VALUES() to access the new column values can be written in the form shown here:
https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html
If possible, I would be grateful if you could support version 8.0.20 and later. Thank you for your consideration.
This SQL syntax extension will get better visibility in the Dolt repo, so I'm going to move it over there.
@koh789 , thank you again for letting us know. We'd love to learn about your Dolt use case too. Feel free to join our Discord or send me an email if you'd like to share.
Hi Noguchi!
Unfortunately, it looks like we don't currently support the "row alias" syntax for insert statements.
As a workaround, we do support the previous syntax (using the VALUES function), so your example in the previous syntax would look like this:
INSERT INTO test_campaigns (email, end_time, name, registered_time) VALUES (?,?,?,?) ON DUPLICATE KEY UPDATE email = VALUE(new), end_time = VALUE(end_time)
I agree that supporting newer MySQL syntax is valuable. This doesn't seem like a difficult add either: we already support referencing subquery aliases in update statements, which can be demonstrated with the following equivalent statement:
insert into test_campaigns (email, end_time, name, registered_time) (select * from (select ? as email, ? as end_time, ? as name, ? as registered_time) as new) ON DUPLICATE KEY UPDATE email = new.email, end_time = new.end_time;
So adding support for this should be straightforward. I can do it today.
I've been digging into this a bit more.
The VALUES() function is deprecated and due to be removed in a future version of MySQL. If that happens, we'll likely see a spike in users relying on row aliases like this. We definitely want to make sure that we support this before then.
Unfortunately, there are some quirks that make this less straightforward:
- As demonstrated by the example, MySQL will accept expressions like
email = new.emailin the ON UPDATE clause. This is not ambiguous, because the engine will prefer to resolve the unqualifiedemailto the table instead of the row alias. Our plan builder allows for this kind of preferential resolution in specific cases, but not generally. - The values alias may list column aliases (
VALUES (?,?,?,?) AS new(email, end_time, name, registered_time)), but it's not required to. If it doesn't, then we still need to assign names to these value columns based on the columns listed in the insert statement. (But those are also optional, in which case we'd have to fall back on the destination table columns.) - We'd want to test for situations where the correct behavior is non-obvious (What happens if the insert columns have different names from the row alias columns? What if they have the same name but different order?, etc) All of these are doable, but it makes the process more complicated than just updating our parser.
Update: I actually misunderstood how the column aliases work in this case. From the docs:
If, in addition, you use the column aliases m, n, and p, you can omit the row alias in the assignment clause and write the same statement like this:
INSERT INTO t1 (a,b,c) VALUES (1,2,3),(4,5,6) AS new(m,n,p) ON DUPLICATE KEY UPDATE c = m+n;
This is an issue for us as well. We recently updated our query builders to use the new alias syntax instead of the old VALUES() syntax to avoid the deprecation warnings emitted by MySQL, but it turns out that change broke Dolt compatibility instead.
The fact that it's now supported by the parser also makes the resulting error message a bit confusing, since the following:
INSERT INTO tbl (id, name) VALUES (1, 'test1'), (2, 'test2'), (3, 'test3') AS tbl_new
ON DUPLICATE KEY UPDATE tbl.name = tbl_new.name;
results in a table not found: tbl_new error message as opposed to an error that more clearly indicates that aliases aren't implemented here to begin with.
I apologize for the confusing error message: when we added support to the parser, we added a new error message that was supposed to be more clear:
error: insert row aliases are not currently supported; use the VALUES() function instead
If you're not seeing that error message, that's a bug.
Last time, we held off on implementing this because the name resolution rules were deceptively tricky, and we could point people to the old syntax as a workaround. But given that the old syntax is deprecated and people are migrating, we need to be able to handle the new synatx. I'm going to take another stab at it today.
Update: I have a draft PR that gets the simplest case working (row alias, no column alias): https://github.com/dolthub/go-mysql-server/pull/2534
So with this PR, both @arvidfm and @koh789's sample queries should work.
I haven't fully tested it, and it does have some caveats:
- It doesn't currently recognize column aliases (ie.
VALUES (1, 'test1'), (2, 'test2'), (3, 'test3') AS tbl_new(new_id, new_name)) - If there's a column name ambiguity between the row alias and another table, the engine is supposed to favor the insert source. So for instance, in
ON DUPLICATE KEY UPDATE tbl.name = name, the unqualifiednameis supposed to resolve totbl.name, nottbl_new.name. Currently with my draft PR, this returns an ambiguous column error.
I'm currently working on both of these issues.
@nicktobey If it helps, that name resolution behaviour doesn't actually seem to be the case in MySQL, or at least not with Percona Server 8.3. I get an error message complaining that the column name is ambiguous if I do either name = tbl_new.name or tbl.name = name. Though perhaps different MySQL versions behave differently here.
@arvidfm You're absolutely right! I was mistaken.
I got column aliases working last night too. I'm adding tests and will make the final PR shortly.
PR is ready. It took some finesse to make it work properly with out-of-order insert statements, but everything should be working now.
@nicktobey I noticed the related PR is merged while this issue is still open, is there any further work required for this or is it ready to be used?
This is in 1.39.5:
https://github.com/dolthub/dolt/releases/tag/v1.39.5
Perfect, thank you for the confirmation!
@nicktobey
Sorry for the late reply. Wow! After updating to the following versions, I was able to successfully execute the alias query! Thank you for your prompt response! I will start using it right away.