database-js
UUIDs Get Returned Incorrectly
I have a column (`id_new`) on a dev branch to test using UUIDv7 for PKs. When doing a SELECT query, the UUIDs appear to be returned incorrectly.
Example Table:

| Column Name | Datatype     | Collation          | PK  |
|-------------|--------------|--------------------|-----|
| id          | INT          | utf8mb4_0900_ai_ci | Yes |
| id_new      | CHAR(36)     | utf8mb4_bin        |     |
| domain      | VARCHAR(255) | utf8mb4_0900_ai_ci |     |
I updated the column for this record like so:

```ts
import { uuidv7 } from "uuidv7";

const domainId = uuidv7();

await c.env.CONNECTION.transaction(async (tx) => {
  await tx.execute("UPDATE domains SET id_new = ? WHERE domain = ?", [
    domainId,
    "domain.com",
  ]);
});
```
When viewed in the MySQL Workbench app, `id_new` shows as `018e1630-ca36-7188-94f9-7120cb925faf`.
However, when running a SELECT query and reading back the results, I get the following:

```json
"rows": [
  {
    "HEX(id_new)": "30313865313633302D636133362D373138382D393466392D373132306362393235666166"
  }
]
```
My SELECT statement:

```sql
SELECT HEX(id_new) FROM domains WHERE domain = 'domain.com';
```
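To be clear, hex-decoding that output gives back exactly the 36-character UUID text, so the stored value itself looks intact; it's the client-side handling that's off. A quick Node.js check:

```ts
// Quick check: the HEX() output is just the ASCII bytes of the UUID
// text, hex-encoded.
const hex =
  "30313865313633302D636133362D373138382D393466392D373132306362393235666166";

console.log(Buffer.from(hex, "hex").toString("utf8"));
// 018e1630-ca36-7188-94f9-7120cb925faf
```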
Just to add: I suspect this is related to the `utf8mb4_bin` collation, but I want to see what the API response is for this type. Something is definitely wrong here.
Looking into it more: the `utf8mb4_bin` collation causes MySQL to report the column as binary rather than utf8mb4, regardless of its actual charset. And because the field comes back as BLOB or BINARY, we treat it as raw binary data, since... that's what those types represent.

I think we need some extra logic here to work around this.
Given the table:

```sql
CREATE TABLE `dumb` (
  `a` text,
  `b` char(10),
  `c` text COLLATE utf8mb4_bin,
  `d` char(10) COLLATE utf8mb4_bin
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
```

And inserting one row:

```sql
INSERT INTO dumb VALUES ('hello-a', 'hello-b', 'hello-c', 'hello-d');
```
We get the results back:

```json
{
  "result": {
    "fields": [
      {
        "name": "a",
        "type": "TEXT",
        "table": "dumb",
        "orgTable": "dumb",
        "database": "mattdb",
        "orgName": "a",
        "columnLength": 262140,
        "charset": 255,
        "flags": 16
      },
      {
        "name": "b",
        "type": "CHAR",
        "table": "dumb",
        "orgTable": "dumb",
        "database": "mattdb",
        "orgName": "b",
        "columnLength": 40,
        "charset": 255
      },
      {
        "name": "c",
        "type": "BLOB",
        "table": "dumb",
        "orgTable": "dumb",
        "database": "mattdb",
        "orgName": "c",
        "columnLength": 262140,
        "charset": 255,
        "flags": 144
      },
      {
        "name": "d",
        "type": "BINARY",
        "table": "dumb",
        "orgTable": "dumb",
        "database": "mattdb",
        "orgName": "d",
        "columnLength": 40,
        "charset": 255,
        "flags": 128
      }
    ],
    "rows": [
      {
        "lengths": ["7", "7", "7", "7"],
        "values": "aGVsbG8tYWhlbGxvLWJoZWxsby1jaGVsbG8tZA=="
      }
    ]
  }
}
```
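As an aside, the row payload here is a single base64 blob sliced by the per-column lengths. A minimal sketch of decoding it in Node.js, assuming the values are concatenated in column order (which the lengths above suggest):

```ts
// Minimal sketch: decode the base64 "values" blob and slice it by the
// per-column "lengths". Assumes values are concatenated in column order.
const row = {
  lengths: ["7", "7", "7", "7"],
  values: "aGVsbG8tYWhlbGxvLWJoZWxsby1jaGVsbG8tZA==",
};

const raw = Buffer.from(row.values, "base64");

let offset = 0;
const columns = row.lengths.map((len) => {
  const n = Number(len);
  const slice = raw.subarray(offset, offset + n);
  offset += n;
  return slice.toString("utf8");
});

console.log(columns); // [ 'hello-a', 'hello-b', 'hello-c', 'hello-d' ]
```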
The first two come back as TEXT/CHAR respectively, which is correct, while the last two became BLOB/BINARY.

This is fundamentally why we were handling these "wrong". I think we need to infer the type more specifically from the charset and flags.
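A rough sketch of the kind of inference I mean, assuming the standard MySQL protocol constants (charset id 63 is the binary pseudo-charset, and 0x80 is the BINARY column flag); the helper name is hypothetical, not database-js's actual code:

```ts
// Illustrative constants from the MySQL wire protocol; the names are
// assumptions for this sketch, not database-js internals.
const BINARY_CHARSET_ID = 63; // charset id of the "binary" pseudo-charset
const BINARY_FLAG = 0x80; // column flag MySQL sets for binary data

interface Field {
  name: string;
  type: string; // e.g. "TEXT", "BLOB", "CHAR", "BINARY"
  charset: number;
  flags?: number;
}

// The BINARY flag alone is misleading: a *_bin collation on a text
// column (utf8mb4_bin above) sets it too. Only treat the column as raw
// bytes when its charset is the binary pseudo-charset; otherwise decode
// it as text, even if the reported type is BLOB/BINARY.
function isBinaryField(field: Field): boolean {
  const hasBinaryFlag = ((field.flags ?? 0) & BINARY_FLAG) !== 0;
  return hasBinaryFlag && field.charset === BINARY_CHARSET_ID;
}

// Columns "c" and "d" from the response above: BLOB/BINARY types with
// flags 144/128 (BINARY flag set), but charset 255 (utf8mb4_0900_ai_ci),
// so they should decode as text.
console.log(isBinaryField({ name: "c", type: "BLOB", charset: 255, flags: 144 })); // false
console.log(isBinaryField({ name: "d", type: "BINARY", charset: 255, flags: 128 })); // false
```

With a check like this, `c` and `d` would be decoded as UTF-8 strings rather than raw bytes, and the UUID column from the original report would come back as plain text.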