Bug Report: Joining too many reference tables causes panic
Overview of the Issue
I get a panic when I join too many reference tables. Initially I thought it was related to global routing, but it seems to relate to finding the proper routing for the reference tables.
It may also be related to #15777, but I'm not sure.
The expected result should be clear: the query runs without a panic. It would also be worth adding an automated test that covers more reference tables and the different addressing modes (with a keyspace qualifier, without one, and so on).
Top of the stack trace:
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: E0425 06:05:07.204099 610 server.go:373] mysql_server caught panic:
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: runtime error: invalid memory address or nil pointer dereference
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: runtime/debug.Stack()
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: runtime/debug/stack.go:24 +0x5e
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: vitess.io/vitess/go/cache/theine.newPanicError({0x1a54560, 0x3554c10})
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: vitess.io/vitess/go/cache/theine/singleflight.go:52 +0x25
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: vitess.io/vitess/go/cache/theine.(*Group[...]).doCall.func2.1()
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: vitess.io/vitess/go/cache/theine/singleflight.go:184 +0x34
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: panic({0x1a54560?, 0x3554c10?})
apr 25 06:05:07 vitess-gate-a1 start_vtgate[610]: runtime/panic.go:770 +0x132
Reproduction Steps
In this example, I call mysql legacy < crash-vtgate.sql, so legacy is the default database.
# mysql legacy < crash-vtgate.sql
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
crash-vtgate.sql contains:
vexplain queries
SELECT UNIX_TIMESTAMP()
-- sharded
FROM sites2023.alarmLog al
-- sharded
INNER JOIN sites2023.lastLogData as lld
ON lld.idSite = al.idSite
-- First reference table
INNER JOIN dataAttributes as da
ON lld.idDataAttribute = da.idDataAttribute
-- Adding this one is the culprit. Referring to any other reference table instead also panics.
inner join deviceTypes as dt
on dt.idDeviceType = da.idDeviceType
where lld.idSite = 219211
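The automated test suggested in the overview could start from a matrix of addressing modes for the two reference tables in this query. The following is a minimal Python sketch, not part of the Vitess test suite: the helper names (`table_ref`, `build_query`, `query_matrix`) are hypothetical, and whether each variant is expected to plan successfully under `require_explicit_routing` is an open question the sketch does not answer. It only generates the SQL strings to send to vtgate.

```python
from itertools import product

# Keyspace prefixes to try for each reference table.
# None means the table is left unqualified, as in the query above.
QUALIFIERS = [None, "sites2023", "legacy"]


def table_ref(table, keyspace):
    """Return the table name, prefixed with the keyspace when one is given."""
    return f"{keyspace}.{table}" if keyspace else table


def build_query(da_keyspace, dt_keyspace):
    """Build the reproduction query with the chosen qualifiers (hypothetical helper)."""
    da = table_ref("dataAttributes", da_keyspace)
    dt = table_ref("deviceTypes", dt_keyspace)
    return (
        "SELECT UNIX_TIMESTAMP() "
        "FROM sites2023.alarmLog al "
        "INNER JOIN sites2023.lastLogData AS lld ON lld.idSite = al.idSite "
        f"INNER JOIN {da} AS da ON lld.idDataAttribute = da.idDataAttribute "
        f"INNER JOIN {dt} AS dt ON dt.idDeviceType = da.idDeviceType "
        "WHERE lld.idSite = 219211"
    )


def query_matrix():
    """Return one query per combination of qualifiers for the two reference tables."""
    return [build_query(da, dt) for da, dt in product(QUALIFIERS, repeat=2)]


if __name__ == "__main__":
    for query in query_matrix():
        print(query)
```

The first generated variant matches the unqualified form used in this report; each string can be prefixed with `vexplain queries` or sent as-is.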
The vschema for sites2023 is:
{
"sharded": true,
"require_explicit_routing": true,
"vindexes": {
"a_standard_hash": {
"type": "hash"
}
},
"tables": {
"deviceTypes": {
"type": "reference",
"source": "legacy.deviceTypes"
},
"dataAttributes": {
"type": "reference",
"source": "legacy.dataAttributes"
},
"lastLogData": {
"column_vindexes": [
{
"column": "idSite",
"name": "a_standard_hash"
}
]
},
"alarmLog": {
"column_vindexes": [
{
"column": "idSite",
"name": "a_standard_hash"
}
],
"auto_increment" : {
"column": "idAlarm",
"sequence": "alarmLog_seq"
}
}
}
}
The simplified vschema for legacy is:
{
"sharded": false,
"tables": {
"deviceTypes": {},
"dataAttributes": {}
}
}
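To rule out a plain configuration mistake before suspecting the planner, the two vschemas can be cross-checked: every table of type `reference` in the sharded keyspace should have a `source` that names a table present in the other keyspace. The sketch below is an assumption-laden illustration in Python, not a Vitess tool; `unresolved_reference_sources` is a hypothetical helper, and the vschemas are trimmed copies of the ones above that keep only the reference tables.

```python
import json

# Trimmed copies of the two vschemas shown above (reference tables only).
SITES2023 = json.loads("""
{
  "sharded": true,
  "tables": {
    "deviceTypes": {"type": "reference", "source": "legacy.deviceTypes"},
    "dataAttributes": {"type": "reference", "source": "legacy.dataAttributes"}
  }
}
""")

LEGACY = json.loads("""
{
  "sharded": false,
  "tables": {"deviceTypes": {}, "dataAttributes": {}}
}
""")


def unresolved_reference_sources(vschema, keyspaces):
    """Return (table, source) pairs whose source is missing from `keyspaces`.

    `keyspaces` maps a keyspace name to its vschema dict. This helper is
    hypothetical and only inspects the JSON; it does not talk to Vitess.
    """
    problems = []
    for name, spec in vschema.get("tables", {}).items():
        if spec.get("type") != "reference":
            continue
        source = spec.get("source", "")
        # A source is written as "<keyspace>.<table>".
        keyspace, _, table = source.partition(".")
        if table not in keyspaces.get(keyspace, {}).get("tables", {}):
            problems.append((name, source))
    return problems


if __name__ == "__main__":
    print(unresolved_reference_sources(SITES2023, {"legacy": LEGACY}))
```

For the trimmed vschemas in this report the check finds no unresolved sources, which suggests the panic is not simply caused by a dangling `source` entry.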
One caveat: issue #15777 shows that seemingly unrelated extra reference tables defined in the sharded keyspace break the routing. That might also affect when the panic occurs.
Binary Version
vtgate version Version: 19.0.3 (Git revision cb5464edf5d7075feae744f3580f8bc626d185aa branch 'HEAD') built on Thu Apr 4 12:18:41 UTC 2024 by runner@fv-az1543-228 using go1.22.2 linux/amd64
Operating System and Environment details
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
Linux vitess-gate-a1 6.5.0-1018-aws #18~22.04.1-Ubuntu SMP Fri Apr 5 17:44:33 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Log Fragments
https://gist.github.com/wiebeytec/83f44bf7ae0858e21ba9ffd745b0fb2b