datasette
datasette copied to clipboard
Filters fail to work correctly against calculated numeric columns returned by SQL views because type affinity rules do not apply
I found a strange behavior, and I'm not sure if it's related to views and boolean values perhaps, or if there's something else weird going on here, but I'll provide an example that may help show what I'm seeing happen.
#!/bin/bash
echo "\"id\",\"expiration_date\"
0,2018-01-04
1,2019-01-05
2,2020-01-06
3,2021-01-07
4,2022-01-08
5,2023-01-09
6,2024-01-10
7,2025-01-11
8,2026-01-12
9,2027-01-13
" > test.csv
csvs-to-sqlite test.csv test.db
sqlite-utils create-view --replace test.db test_view "select id, expiration_date, case when julianday('NOW') >= julianday(expiration_date) then 1 else 0 end as has_expired FROM test"
datasette test.db
Thanks again and let me know if you want me to provide anything else!
Oh this is fascinating! I replicated the bug (thanks for the steps to reproduce) and it looks like this is down to the following:

Against views, where has_expired = 1
returns different results from where has_expired = '1'
This doesn't happen against tables because of SQLite's type affinity mechanism, which handles the type conversion automatically.
Relevant section of the SQLite documentation: 3.2. Affinity Of Expressions:
When an expression is a simple reference to a column of a real table (not a VIEW or subquery) then the expression has the same affinity as the table column.
In your example, has_expired
is no longer a simple reference to a column of a real table, hence the bug.
Then 4.2. Type Conversions Prior To Comparison fills in the rest:
SQLite may attempt to convert values between the storage classes INTEGER, REAL, and/or TEXT before performing a comparison. Whether or not any conversions are attempted before the comparison takes place depends on the type affinity of the operands.
I wonder if this will be a problem with generated columns, or with SQLite strict tables?
My hunch is that strict tables will continue to work without any changes, because https://www.sqlite.org/stricttables.html says nothing about their impact on comparison operations. I should test this to make absolutely sure though.
Generated columns have a type, so my hunch is they will continue to work fine too.
Thinking about options for fixing this...
The following query works fine:
select * from test_view where cast(has_expired as text) = '1'
I don't want to start using this for every query, because one of the goals of Datasette is to help people who are learning SQL:
- #1613
If someone clicks on "View and edit SQL" from a filtered table page I don't want them to have to wonder why that cast
is there.
But... for querying views, the cast
turns out to be necessary.
So one fix would be to get the SQL generating logic to use casts like this any time it is operating against a view.
An even better fix would be to detect which columns in a view come from a table and which ones might not, and only use casts for the columns that aren't definitely from a table.
The trick I was exploring here might be able to help with that:
- #1293
The alternative to using cast
here would be for Datasette to convert the "1"
to a 1
in Python code before passing it as a param.
This feels a bit neater to me, but I still then need to solve the problem of how to identify the "type" of a column that I want to use in a query.
No, I think I need to use cast
- I can't think of any way to ask SQLite "for this query, what types are the columns that will come back from it?"
Even the details from the explain
trick explored in #1293 don't seem to come back with column type information: https://latest.datasette.io/fixtures?sql=explain+select+pk%2C+text1%2C+text2%2C+[name+with+.+and+spaces]+from+searchable_view+where+%22pk%22+%3D+%3Ap0&p0=1
Note that Datasette does already have special logic to convert parameters to integers for numeric comparisons like >
:
https://github.com/simonw/datasette/blob/c4c9dbd0386e46d2bf199f0ed34e4895c98cb78c/datasette/filters.py#L203-L212
Though... it looks like there's a bug in that? It doesn't account for float
values - "3.5".isdigit()
return False
- probably for the best, because int(3.5)
would break that value anyway.
Also made me realize that this query:
select * from sortable where sortable > :p0
Only works here thanks to the column affinity thing kicking in too: https://latest.datasette.io/fixtures?sql=select+*+from+sortable+where+sortable+%3E+%3Ap0&p0=70