sqlc
sqlc copied to clipboard
mysql blob and binary types are converted to text and varchar
I noticed this happening while working on the Kotlin codegen. By the time it gets down to codegen in the intermediate *compiler.Column values, blob becomes text and varbinary becomes varchar. I couldn't track down what was causing this. I would have expected both these types to turn into []byte in codegen, and mysql_types.go does handle it that way.

Reproduced here: https://play.sqlc.dev/p/8942c5c89e06ea0dc35975ba047936bdf5b0b9ba3daf502103c49c57e457ba5b
If I understand correctly, parsing of SQL statements in sqlc is done through internal/engine/dolphin, in the case of MySQL. This package imports github.com/pingcap/tidb/parser. On sqlc tag 1.14.0, this package is required through go.mod as github.com/pingcap/parser v0.0.0-20210914110036-002913dd28ec.
Although the commit hash 002913dd28ec does not match, the datetime stamp would match to this commit, which is 2629 commits behind their latest commit at time of writing. (Perhaps there was a force-push or rebase at some point to the repository which affects existing commit hashes).
Regardless of the imported package's age, the types defined in github.com/pingcap/tidb/parser/mysql/type.go neatly distinguishes the various column types. However, going up one level, to their package types, we see the following mapping from MySQL types to strings:
var type2Str = map[byte]string{
mysql.TypeBit: "bit",
mysql.TypeBlob: "text",
// ...
mysql.TypeVarchar: "varchar",
// ...
}
(note that VarBinary does not exist)
Back to sqlc, when converting the CREATE TABLE statement, the MySQL type is converted using types.TypeStr (i.e. from github.com/pingcap/tidb/parser/types) here, on v1.14.0. It seems to me that's the cause of the conversion from BLOB (in SQL) to TEXT, and consequently into sql.NullString.
Looking at the documentation of MySQL 8.0 for VARBINARY, it says that the column type is similar to VARCHAR (with some interpretation differences). The same is written for BLOB and TEXT columns. I'm not familiar with the entire internal typing system of MySQL, but given the documentation, I can see why BLOB translates to TEXT, and VARBINARY to VARCHAR.
To fix this on the sqlc side, it looks like using types.TypeToStr (instead of types.TypeStr) would keep the blob/binary part intact during the earlier mentioned conversion. (I've used the commit hash that presumably matches the go.mod required version to show that this function exists at that version).