sqlx
sqlx copied to clipboard
error occurred while decoding column 43: unexpected null; try decoding as an `Option` - But struct field *IS* already Option<string>
Bug Description
I'm running into what I believe must be a bug in SQLx.
I get this error when using query_as! with my struct:
error occurred while decoding column 43: unexpected null; try decoding as an `Option`
In order to identify which column was causing the problem, I replaced a bunch of columns in the select statement with static values, ensuring they would not be null, like this
SELECT
table.column as column_name
replaced with this
SELECT
'value' as column_name
Which worked. I identified the column, which indeed is NULL as a result of a LEFT JOIN in the query (there are several chained LEFT JOIN).
However, the field in the struct is already Option<String>, so SQLx should have no problem populating it with None.
This is a bug and clearly not intended behaviour. I have no idea what kind of edge case triggered this issue, but the query contains several successive LEFT JOIN clauses where columns from the previous LEFT JOIN are used in the ON clause for the next LEFT JOIN, fwiw.
Important Note
The error does not happen if I do this:
SELECT
NULL as column_name
with the problematic column.
So, despite SQLx claiming that the unexpected null in that field is a problem, it only seems to happen when the null comes as a result of the JOIN. If the NULL is a constant value, it has no issue populating the struct field with None.
Minimal Reproduction
https://github.com/reasv/mitsuba/blob/master/src/db.rs#L667C1-L668C1
The columns causing the issue are all three of file_sha256, mitsuba_file_hidden and thumbnail_sha256
Info
- SQLx version: 0.7.4
- SQLx features enabled: [ "postgres", "macros", "migrate", "runtime-tokio"]
- Database server and version: Postgres, from docker container
postgres:13 - Operating system: Windows 11 (client) / Docker on Linux (Postgres)
rustc --version:rustc 1.79.0 (129f3b996 2024-06-10)
I've run into this problem many times. You have to use the function instead of the macro. The sqlx prepare compiler is not generating the query plan correctly.
https://docs.rs/sqlx/latest/sqlx/fn.query_as.html
@reasv this can happen if the database server returns columns in a different order than the server it was compiled with. I see your SELECT starts with posts.*, I recommend listing those columns explicitly instead.
I recommend listing those columns explicitly instead.
That's highly impractical given there's 40 columns and I need to make a similar query in several different places - which would force me to copy and paste it everywhere. I tried to concatenate strings using compile time macros in order to avoid repetition, but that doesn't seem possible with the query_as! macro.
ORMx has some kind of different macro, but last time I checked it was still on SQLx 0.6.x
this can happen if the database server returns columns in a different order than the server it was compiled with.
I don't think this is exactly what's happening because the error also happens on my development machine when compiling and running using cargo run, the server in question being available and connected - there is no other server it could have used as a reference.
You have to use the function instead of the macro.
I'd rather avoid losing type safety as it has been fundamental to easily refactoring the code in the past after schema changes.
I will use a different workaround:
SELECT
posts.*,
CASE
WHEN
files.sha256 IS NOT NULL
THEN files.sha256
ELSE NULL
END AS file_sha256,
CASE
WHEN
thumbnails.hidden IS NOT NULL
THEN thumbnails.hidden
ELSE NULL
END AS mitsuba_file_hidden,
CASE
WHEN
thumbnails.sha256 IS NOT NULL
THEN thumbnails.sha256
ELSE NULL
END AS thumbnail_sha256
FROM posts
I just wish I could avoid copying and pasting it.
It's not really documented, but the query macros will accept a query string in fragments separated by +. This is specifically to support the generation of query strings by macros.
concat!() doesn't work because macros are expanded from the outside-in, though we could always add a special case for it. Alternatively, we could utilize an attribute proc macro as arguments in attributes are eagerly expanded, that's how you can do #[doc = include_str!("foo.md")] which we've utilized in a few places. One of a thousand big refactors I've been meaning to do when I can find the time.
Can you post the output of cargo expand for the query_as!() invocation? Also, the output of EXPLAIN (VERBOSE, FORMAT JSON) <query>.
Thinking about it, it's distinctly possible that the macros are incorrectly inferring that the columns are not nullable. They effectively emit code like field: row.try_get::<InferredType>()?.into() and that .into() could be hiding faulty null inference. The nullability inference errs on the side of false positives to try to avoid this exact situation, but it's far from perfect.
You could test for this by using a null override: CASE ... END AS "file_sha256?", etc.
JSON:
https://gist.github.com/reasv/62df861d2bd96c74e69f1906e5e1f082
SQL:
https://gist.github.com/reasv/23f6b872800585126e1fe7596d456a22
You could test for this by using a null override:
CASE ... END AS "file_sha256?", etc.
Given that the current workaround
CASE
WHEN
files.sha256 IS NOT NULL
THEN files.sha256
ELSE NULL
END AS file_sha256
works, I would assume this should work also, but without the need for the case statement. I'll try it. I didn't know it was possible.
It's not really documented, but the query macros will accept a query string in fragments separated by
+. This is specifically to support the generation of query strings by macros.
Not entirely sure how to use this given that, as you say, macros are expanded from the outside in.
It also doesn't seem to work:
But I am probably misunderstanding your suggestion.
I would assume this should work also, but without the need for the case statement. I'll try it. I didn't know it was possible.
SELECT
posts.*,
files.sha256 AS \"file_sha256?\",
thumbnails.hidden AS \"mitsuba_file_hidden?\",
thumbnails.sha256 AS \"thumbnail_sha256?\",
appears to solve the issue and is not nearly as verbose as my workaround using CASE statements.
This is a good enough solution for me, especially if I can figure out how to avoid repeating the same query fragment using + as you suggested.
As I mentioned in my previous post, it doesn't seem to work, at least not in the way I expected.
For the next person TLDR use "field_name AS field_name?", note the question mark.
Thank you. Using the "?" to make it an option really helped me.
I'll just add that if you use raw strings you don't need the " for example
r#"
SELECT
posts.*,
files.sha256 AS "file_sha256?",
thumbnails.hidden AS "mitsuba_file_hidden?",
thumbnails.sha256 AS "thumbnail_sha256?",
"#
My use case is where the permissions field comes from the roles table after the join and I was getting "unexpected null; try decoding as an Option". See my query below.
let query = sqlx::query!(
r#"SELECT user_name, is_enabled, permissions AS "permissions?"
FROM users
LEFT JOIN roles ON users.assigned_role = roles.role_id
WHERE user_name = $1;"#,
username,
);
PS. FYI The same query on MySql just returns an empty string for permissions and no error...
Flagging this is still an issue.
Confirming this bug on v0.8.5 and postgres v14.17.
In my case, it was this query containing a LEFT JOIN as well.
SELECT m.sport as "sport: SportType",
m.season_id as "season_id",
s.title as "season_name?" -- removing the "?" will cause the error
FROM match m
LEFT JOIN season s ON s.id = m.season_id
WHERE m.id = $1
Not exactly the same, but I feel it's related enough to comment here. I have this query with 2 left joins to link up a many to many relation between tables:
select name, description, public, allow_hq_dl, ARRAY_REMOVE(ARRAY_AGG(p.id), NULL) AS "photos!"
from gallery g
left join gallery_photo gp on g.id = gp.gallery_id
left join photo p on gp.photo = p.
group by g.id
This query now works because of the ARRAY_REMOVE(..., NULL). Without that a NULL value would be in the array due to the left joins creating a row for the gallery row, with a NULL in the p.id column.
Now this is all working as expected, but the typing isn't working correctly. Without the ARRAY_REMOVE(xxx, NULL), there is a NULL value possible, so the type of photos should be vec<Option<String>>. But the compiler won't allow that, tells you it should be vec<String>.