duckdb-web
duckdb-web copied to clipboard
Typecasting documentation / matrix contains incorrect information. Also, could be improved by making it plaintext rather than image and auto-generate it in CI
The typecasting matrix in https://duckdb.org/docs/sql/data_types/typecasting has a couple of issues:
- contains wrong information. For example, according to the chart, VARCHAR cannot be cast to any target type, contradicting the many examples in the prior data type section that illustrate how to create literal values for the various types, which often use constructs like
'0101'::BIT
- is hard to read. The orange bullets tend to drown in the sea of red crosses.
- isn't really friendly to share the info because it is an image. It's probably not very accessible too for people that rely on screenreaders.
- could be improved by conveying not merely whether a cast is possible, but also whether a cast is 'safe', i.e. whether or not the reverse cast will always yield the original value. For example, casting a TINYINT value to an INTEGER value is safe, because the resulting INTEGER value can be cast back to TINYINT again and it will be identical to the original TINYINT value.
In the hopes that it will help, the following attachments are a rough text-based matrix that provides info on cast operations that are actually supported. The matrix was generated by letting duckdb perform actual casts, the script to output that matrix is attached too.
If so desired I can pick this issue up but because the proposed change is somewhat bit I think it makes sense to understand if there is an appetite for it and if so what adjustments I should make in order for a PR to be accepted to address this issue.
Let me know!
duckdb-typecast-matrix-text.txt duckdb-typecast-matrix-script.sql.txt
Hi @rpbouman, sorry for the delay in my response.
I would be very happy to review a PR that improves the typecast page. I think the matrix format that we currently use is not great – it's hard to find information in it, it's hard to render in HTML but difficult to search as an image, etc. So, instead, a table with 3 columns (to/from/possible cast type) would work much better.
Ok I'll give it a stab in the next few days.
Typecasting rules just had a slight change: https://github.com/duckdb/duckdb/pull/10194
Therefore, please make sure you use the main
/ nightly build for running the table generation scripts.
In #2346, Mark rewrote the page on https://duckdb.org/docs/sql/data_types/typecasting
I'm planning to add casting rules as a series of tables: one table per type, with columns "castable to" (e.g. FLOAT
) and "explicitly / implicitly / not castable".
Two comments to the script (I can implement them later):
-
it's best to use
TRY_CAST
instead ofCAST
-
the casting attempts should use large enough numbers, e.g.:
SELECT TRY_CAST(1 AS INT16), TRY_CAST(100_000 AS INT16);
┌─────────────────────────┬──────────────────────────────┐ │ TRY_CAST(1 AS SMALLINT) │ TRY_CAST(100000 AS SMALLINT) │ │ int16 │ int16 │ ├─────────────────────────┼──────────────────────────────┤ │ 1 │ NULL │ └─────────────────────────┴──────────────────────────────┘