Auto-casting of join keys
A user suggested that Zed automatically cast join keys as necessary to increase the likelihood of a match without needing to explicitly cast values to comparable types.
Details
Repro is with Zed commit 38763f8.
A community zync user asked the following:
Now that we have automatic sorting on joins, it would also make sense to me to do automatic casting of the left key. I can't count the number of time I had to
cast()the left key, and how many hours I spent figuring out how to get my joins working beacuse of that :slightly_smiling_face: + I am guessing zed probably knows what the cast should actually be, based on the right key type ?
They offered this specific example:
...when comparing time values with string dates (e.g., “2024-03-01”). In such cases, the string dates must be cast to the appropriate time type for meaningful comparisons.
Here's a repro of that. As the user described, this first attempt fails because the left key is a string while the right key is a time.
$ zq -version
Version: v1.14.0-16-g38763f82
$ cat datestr.zson
{datestr: "2024-03-01", word: "hello"}
$ cat time.zson
{timeval: 2024-03-01T00:00:00Z, word: "goodbye"}
$ cat join.zed
file datestr.zson
| inner join (
file time.zson
) on datestr=timeval otherword:=word
$ zq -I join.zed
[no output]
However, we can force it to work if we cast the left key to time type before the join.
$ cat join-with-cast.zed
file datestr.zson
| datestr:=time(datestr)
| inner join (
file time.zson
) on datestr=timeval otherword:=word
$ zq -I join-with-cast.zed
{datestr:2024-03-01T00:00:00Z,word:"hello",otherword:"goodbye"}
While reproducing this, I also noticed a bonus limitation that might be worth its own issue, but for now I'll just log it here: For now it seems the user is stuck doing the cast upstream in the pipeline, as if I try to do the casting right in the test for equality of the keys, that causes a syntax error.
$ cat join-other-cast.zed
file datestr.zson
| inner join (
file time.zson
) on time(datestr)=timeval otherword:=word
$ zq -I join-other-cast.zed
zq: error parsing Zed in join-other-cast.zed at line 4, column 10:
) on time(datestr)=timeval otherword:=word
=== ^ ===
I revisited this issue in the SuperSQL era as of super commit 47b1fb7, as we're now using SQL (specifically Postgres) as a baseline for functionality we may strive to match. Walking through some of the combinations of type comparisons in JOIN, there is more flexibility in Postgres when it comes to the specific example of comparing date & time values as the user originally asked about, and even more auto-cast flexibility in other systems.
Details
Here's a summary of which systems successfully JOIN between equal-ish values of some different types.
| Type 1 | Type 2 | Postgres 17.5 |
MySQL 9.3.0 |
DuckDB v1.3.1 |
ClickHouse 25.6.2.5 |
SuperDB 47b1fb7 |
|---|---|---|---|---|---|---|
INTEGER |
REAL |
✅ | ✅ | ✅ | ✅ | ✅ |
INTEGER |
VARCHAR |
❌ | ✅ | ✅ | ❌ | ❌ |
TIMESTAMP |
DATE |
✅ | ✅ | ✅ | ✅ | N/A |
TIMESTMAP |
STRING |
❌ | ✅ | ✅ | ❌ | ❌ |
Test Outputs
Postgres
$ psql postgres
psql (17.5 (Homebrew))
Type "help" for help.
postgres=# CREATE TABLE foo (
int_val INTEGER,
real_val REAL,
str_int_val VARCHAR(255),
time_val TIMESTAMP,
date_val DATE,
str_time_val VARCHAR(255)
);
INSERT INTO foo VALUES (3, 3.0, '3', '2025-07-03 00:00:00', '2025-07-03', '2025-07-03 00:00:00');
CREATE TABLE bar (
int_val INTEGER,
real_val REAL,
str_int_val VARCHAR(255),
time_val TIMESTAMP,
date_val DATE,
str_time_val VARCHAR(255)
);
INSERT INTO bar VALUES (3, 3.0, '3', '2025-07-03 00:00:00', '2025-07-03', '2025-07-03 00:00:00');
CREATE TABLE
INSERT 0 1
CREATE TABLE
INSERT 0 1
postgres=# SELECT * FROM foo JOIN bar ON foo.int_val=bar.real_val;
int_val | real_val | str_int_val | time_val | date_val | str_time_val | int_val | real_val | str_int_val | time_val | date_val | str_time_val
---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------
3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 | 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00
(1 row)
postgres=# SELECT * FROM foo JOIN bar ON foo.int_val=bar.str_int_val;
ERROR: operator does not exist: integer = character varying
LINE 1: SELECT * FROM foo JOIN bar ON foo.int_val=bar.str_int_val;
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
postgres=# SELECT * FROM foo JOIN bar ON foo.time_val=bar.date_val;
int_val | real_val | str_int_val | time_val | date_val | str_time_val | int_val | real_val | str_int_val | time_val | date_val | str_time_val
---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------
3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 | 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00
(1 row)
postgres=# SELECT * FROM foo JOIN bar ON foo.time_val=bar.str_time_val;
ERROR: operator does not exist: timestamp without time zone = character varying
LINE 1: SELECT * FROM foo JOIN bar ON foo.time_val=bar.str_time_val;
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
MySQL
$ mysqlsh
MySQL Shell 9.3.0
Copyright (c) 2016, 2025, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.
Type '\help' or '\?' for help; '\quit' to exit.
MySQL SQL > \connect root@localhost
Creating a session to 'root@localhost'
Fetching global names for auto-completion... Press ^C to stop.
Your MySQL connection id is 19
Server version: 9.3.0 Homebrew
No default schema selected; type \use <schema> to set one.
MySQL localhost:3306 ssl SQL > \use foo
Default schema set to `foo`.
Fetching global names, object names from `foo` for auto-completion... Press ^C to stop.
MySQL localhost:3306 ssl foo SQL > CREATE TABLE foo (
-> int_val INTEGER,
-> real_val REAL,
-> str_int_val VARCHAR(255),
-> time_val TIMESTAMP,
-> date_val DATE,
-> str_time_val VARCHAR(255)
-> );
Query OK, 0 rows affected (0.0062 sec)
MySQL localhost:3306 ssl foo SQL >
MySQL localhost:3306 ssl foo SQL > INSERT INTO foo VALUES (3, 3.0, '3', '2025-07-03 00:00:00', '2025-07-03', '2025-07-03 00:00:00');
Query OK, 1 row affected (0.0013 sec)
MySQL localhost:3306 ssl foo SQL >
MySQL localhost:3306 ssl foo SQL > CREATE TABLE bar (
-> int_val INTEGER,
-> real_val REAL,
-> str_int_val VARCHAR(255),
-> time_val TIMESTAMP,
-> date_val DATE,
-> str_time_val VARCHAR(255)
-> );
Query OK, 0 rows affected (0.0054 sec)
MySQL localhost:3306 ssl foo SQL > INSERT INTO bar VALUES (3, 3.0, '3', '2025-07-03 00:00:00', '2025-07-03', '2025-07-03 00:00:00');
Query OK, 1 row affected (0.0015 sec)
MySQL localhost:3306 ssl foo SQL > SELECT * FROM foo JOIN bar ON foo.int_val=bar.real_val;
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
| int_val | real_val | str_int_val | time_val | date_val | str_time_val | int_val | real_val | str_int_val | time_val | date_val | str_time_val |
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
| 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 | 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 |
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
1 row in set (0.0004 sec)
MySQL localhost:3306 ssl foo SQL > SELECT * FROM foo JOIN bar ON foo.int_val=bar.str_int_val;
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
| int_val | real_val | str_int_val | time_val | date_val | str_time_val | int_val | real_val | str_int_val | time_val | date_val | str_time_val |
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
| 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 | 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 |
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
1 row in set (0.0005 sec)
MySQL localhost:3306 ssl foo SQL > SELECT * FROM foo JOIN bar ON foo.time_val=bar.date_val;
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
| int_val | real_val | str_int_val | time_val | date_val | str_time_val | int_val | real_val | str_int_val | time_val | date_val | str_time_val |
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
| 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 | 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 |
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
1 row in set (0.0004 sec)
MySQL localhost:3306 ssl foo SQL > SELECT * FROM foo JOIN bar ON foo.time_val=bar.str_time_val;
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
| int_val | real_val | str_int_val | time_val | date_val | str_time_val | int_val | real_val | str_int_val | time_val | date_val | str_time_val |
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
| 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 | 3 | 3 | 3 | 2025-07-03 00:00:00 | 2025-07-03 | 2025-07-03 00:00:00 |
+---------+----------+-------------+---------------------+------------+---------------------+---------+----------+-------------+---------------------+------------+---------------------+
1 row in set (0.0004 sec)
DuckDB
$ duckdb foo.db
DuckDB v1.3.1 (Ossivalis) 2063dda3e6
Enter ".help" for usage hints.
D CREATE TABLE foo (
int_val INTEGER,
real_val REAL,
str_int_val VARCHAR(255),
time_val TIMESTAMP,
date_val DATE,
str_time_val VARCHAR(255)
);
D
D INSERT INTO foo VALUES (3, 3.0, '3', '2025-07-03 00:00:00', '2025-07-03', '2025-07-03 00:00:00');
D
D CREATE TABLE bar (
int_val INTEGER,
real_val REAL,
str_int_val VARCHAR(255),
time_val TIMESTAMP,
date_val DATE,
str_time_val VARCHAR(255)
);
D
D INSERT INTO bar VALUES (3, 3.0, '3', '2025-07-03 00:00:00', '2025-07-03', '2025-07-03 00:00:00');
D SELECT * FROM foo JOIN bar ON foo.int_val=bar.real_val;
┌─────────┬──────────┬─────────────┬─────────────────────┬────────────┬─────────────────────┬─────────┬──────────┬─────────────┬─────────────────────┬────────────┬─────────────────────┐
│ int_val │ real_val │ str_int_val │ time_val │ date_val │ str_time_val │ int_val │ real_val │ str_int_val │ time_val │ date_val │ str_time_val │
│ int32 │ float │ varchar │ timestamp │ date │ varchar │ int32 │ float │ varchar │ timestamp │ date │ varchar │
├─────────┼──────────┼─────────────┼─────────────────────┼────────────┼─────────────────────┼─────────┼──────────┼─────────────┼─────────────────────┼────────────┼─────────────────────┤
│ 3 │ 3.0 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │ 3 │ 3.0 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │
└─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┴─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┘
D SELECT * FROM foo JOIN bar ON foo.int_val=bar.str_int_val;
┌─────────┬──────────┬─────────────┬─────────────────────┬────────────┬─────────────────────┬─────────┬──────────┬─────────────┬─────────────────────┬────────────┬─────────────────────┐
│ int_val │ real_val │ str_int_val │ time_val │ date_val │ str_time_val │ int_val │ real_val │ str_int_val │ time_val │ date_val │ str_time_val │
│ int32 │ float │ varchar │ timestamp │ date │ varchar │ int32 │ float │ varchar │ timestamp │ date │ varchar │
├─────────┼──────────┼─────────────┼─────────────────────┼────────────┼─────────────────────┼─────────┼──────────┼─────────────┼─────────────────────┼────────────┼─────────────────────┤
│ 3 │ 3.0 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │ 3 │ 3.0 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │
└─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┴─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┘
D SELECT * FROM foo JOIN bar ON foo.time_val=bar.date_val;
┌─────────┬──────────┬─────────────┬─────────────────────┬────────────┬─────────────────────┬─────────┬──────────┬─────────────┬─────────────────────┬────────────┬─────────────────────┐
│ int_val │ real_val │ str_int_val │ time_val │ date_val │ str_time_val │ int_val │ real_val │ str_int_val │ time_val │ date_val │ str_time_val │
│ int32 │ float │ varchar │ timestamp │ date │ varchar │ int32 │ float │ varchar │ timestamp │ date │ varchar │
├─────────┼──────────┼─────────────┼─────────────────────┼────────────┼─────────────────────┼─────────┼──────────┼─────────────┼─────────────────────┼────────────┼─────────────────────┤
│ 3 │ 3.0 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │ 3 │ 3.0 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │
└─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┴─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┘
D SELECT * FROM foo JOIN bar ON foo.time_val=bar.str_time_val;
┌─────────┬──────────┬─────────────┬─────────────────────┬────────────┬─────────────────────┬─────────┬──────────┬─────────────┬─────────────────────┬────────────┬─────────────────────┐
│ int_val │ real_val │ str_int_val │ time_val │ date_val │ str_time_val │ int_val │ real_val │ str_int_val │ time_val │ date_val │ str_time_val │
│ int32 │ float │ varchar │ timestamp │ date │ varchar │ int32 │ float │ varchar │ timestamp │ date │ varchar │
├─────────┼──────────┼─────────────┼─────────────────────┼────────────┼─────────────────────┼─────────┼──────────┼─────────────┼─────────────────────┼────────────┼─────────────────────┤
│ 3 │ 3.0 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │ 3 │ 3.0 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │
└─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┴─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┘
ClickHouse
$ clickhouse
ClickHouse local version 25.6.2.5 (official build).
:) CREATE TABLE foo (
int_val INTEGER,
rCREATE TABLE foo (
int_val INTEGER,
real_val REAL,
str_int_val VARCHAR(255),
time_val TIMESTAMP,
date_val DATE,
str_time_val VARCHAR(255)
)
ENGINE = MergeTree
ORDER BY tuple();
INSERT INTO foo VALUES (3, 3.0, '3', '2025-07-03 00:00:00', '2025-07-03', '2025-07-03 00:00:00');
CREATE TABLE bar (
int_val INTEGER,
real_val REAL,
str_int_val VARCHAR(255),
time_val TIMESTAMP,
date_val DATE,
str_time_val VARCHAR(255)
)
ENGINE = MergeTree
ORDER BY tuple();
INSERT INTO bar VALUES (3, 3.0, '3', '2025-07-03 00:00:00', '2025-07-03', '2025-07-03 00:00:00');
CREATE TABLE foo
(
`int_val` INTEGER,
`real_val` REAL,
`str_int_val` VARCHAR(255),
`time_val` TIMESTAMP,
`date_val` DATE,
`str_time_val` VARCHAR(255)
)
ENGINE = MergeTree
ORDER BY tuple()
Query id: e3dc3611-e552-411a-91c5-54ceb192f3fd
Ok.
0 rows in set. Elapsed: 0.007 sec.
INSERT INTO foo FORMAT Values
Query id: a10c76cd-9c29-4779-ba70-ce8871bff944
Ok.
1 row in set. Elapsed: 0.004 sec.
CREATE TABLE bar
(
`int_val` INTEGER,
`real_val` REAL,
`str_int_val` VARCHAR(255),
`time_val` TIMESTAMP,
`date_val` DATE,
`str_time_val` VARCHAR(255)
)
ENGINE = MergeTree
ORDER BY tuple()
Query id: e5728578-bf10-40d2-9abe-3af7d673cb6e
Ok.
0 rows in set. Elapsed: 0.002 sec.
INSERT INTO bar FORMAT Values
Query id: 97d0b576-7004-455d-943b-3bbb8711334b
Ok.
1 row in set. Elapsed: 0.003 sec.
:) SELECT * FROM foo JOIN bar ON foo.int_val=bar.real_val;
SELECT *
FROM foo
INNER JOIN bar ON foo.int_val = bar.real_val
Query id: 2e32a95d-3fc4-43b1-8488-a106abf7f342
┌─int_val─┬─real_val─┬─str_int_val─┬────────────time_val─┬───date_val─┬─str_time_val────────┬─bar.int_val─┬─bar.real_val─┬─bar.str_int_val─┬────────bar.time_val─┬─bar.date_val─┬─bar.str_time_val────┐
1. │ 3 │ 3 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │ 3 │ 3 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │
└─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┴─────────────┴──────────────┴─────────────────┴─────────────────────┴──────────────┴─────────────────────┘
1 row in set. Elapsed: 0.005 sec.
:) SELECT * FROM foo JOIN bar ON foo.int_val=bar.str_int_val;
SELECT *
FROM foo
INNER JOIN bar ON foo.int_val = bar.str_int_val
Query id: 3c85905a-7085-454f-af04-2ee1fec58157
Elapsed: 0.122 sec.
Received exception:
Code: 386. DB::Exception: There is no supertype for types Int32, String because some of them are String/FixedString/Enum and some of them are not: JOIN cannot infer common type in ON section for keys. Left key '__table1.int_val' type Int32. Right key '__table2.str_int_val' type String. (NO_COMMON_TYPE)
:) SELECT * FROM foo JOIN bar ON foo.time_val=bar.date_val;
SELECT *
FROM foo
INNER JOIN bar ON foo.time_val = bar.date_val
Query id: 3af5ed6e-23ad-4aeb-9cf6-397096484f3d
┌─int_val─┬─real_val─┬─str_int_val─┬────────────time_val─┬───date_val─┬─str_time_val────────┬─bar.int_val─┬─bar.real_val─┬─bar.str_int_val─┬────────bar.time_val─┬─bar.date_val─┬─bar.str_time_val────┐
1. │ 3 │ 3 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │ 3 │ 3 │ 3 │ 2025-07-03 00:00:00 │ 2025-07-03 │ 2025-07-03 00:00:00 │
└─────────┴──────────┴─────────────┴─────────────────────┴────────────┴─────────────────────┴─────────────┴──────────────┴─────────────────┴─────────────────────┴──────────────┴─────────────────────┘
1 row in set. Elapsed: 0.004 sec.
:) SELECT * FROM foo JOIN bar ON foo.time_val=bar.str_time_val;
SELECT *
FROM foo
INNER JOIN bar ON foo.time_val = bar.str_time_val
Query id: 174d3395-8d5b-4a0b-8f18-a3afc49ae88a
Elapsed: 0.001 sec.
Received exception:
Code: 386. DB::Exception: There is no supertype for types DateTime, String because some of them are String/FixedString/Enum and some of them are not: JOIN cannot infer common type in ON section for keys. Left key '__table1.time_val' type DateTime. Right key '__table2.str_time_val' type String. (NO_COMMON_TYPE)
SuperDB
SuperDB currently lacks a DATE type so that permutation is excluded. Also, due to the lack of support for static type checking (#5997) failure of the equality check in JOIN is currently communicated by lack of output rather than an explicit error message about mismatched types.
$ super -version
Version: 47b1fb7d8
$ cat foo.sup
{int_val:3, real_val:3.0, str_int_val:"3", time_val:2025-07-03T00:00:00Z, str_time_val:"2025-07-03T00:00:00Z"}
$ cat bar.sup
{int_val:3, real_val:3.0, str_int_val:"3", time_val:2025-07-03T00:00:00Z, str_time_val:"2025-07-03T00:00:00Z"}
$ super -c "SELECT * FROM foo.sup AS foo JOIN bar.sup AS bar ON foo.int_val=bar.real_val;"
{int_val:3,real_val:3.,str_int_val:"3",time_val:2025-07-03T00:00:00Z,str_time_val:"2025-07-03T00:00:00Z"}
$ super -c "SELECT * FROM foo.sup AS foo JOIN bar.sup AS bar ON foo.int_val=bar.str_int_val;"
[no output]
$ super -c "SELECT * FROM foo.sup AS foo JOIN bar.sup AS bar ON foo.time_val=bar.str_time_val;"
[no output]