Fuzzer issues 2
The issues
This is the follow-up to #4067. I will keep updating this issue with new bugs that I find with fuzzers.
If exists flag ignored for alter statements
ALTER TABLE IF EXISTS t0 RENAME TO t1;
ALTER SEQUENCE IF EXISTS seq OWNED BY x;
Error: Catalog Error: Table with name t0 does not exist!
The error shouldn't be thrown.
If not exists flag ignored for alter table add column statement
CREATE TABLE t0 (c0 INT);
ALTER TABLE t0 ADD COLUMN IF NOT EXISTS c0 INT;
Error: Catalog Error: Column with name c0 already exists!
The error shouldn't be thrown.
Generate series with NULL value (now fixed)
SELECT c0 FROM generate_series(NULL) t3(c0);
SELECT c0 FROM range(NULL) t3(c0);
Error: INTERNAL Error: Calling GetValue on a value that is NULL
A non-internal error should probably be thrown here. Scalar functions already handle NULL values separately. For table UDFs this handling may be missing, i.e. an optional check that throws an error on any NULL input.
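The suggested NULL check can be sketched roughly as follows. This is a toy model, not DuckDB code: `ToyRange` is a hypothetical stand-in for a table function, and a NULL argument is modelled as `std::nullopt`.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a table function such as range(stop).
// A NULL argument is modelled as std::nullopt; the name and signature
// are assumptions for illustration, not DuckDB's API.
std::vector<int64_t> ToyRange(const std::optional<int64_t> &stop) {
    if (!stop.has_value()) {
        // Raise a regular, user-facing error instead of reading a NULL value.
        throw std::invalid_argument("range: argument must not be NULL");
    }
    std::vector<int64_t> result;
    for (int64_t i = 0; i < *stop; i++) {
        result.push_back(i);
    }
    return result;
}
```

With a check like this, a NULL argument fails with a normal error before any value is read.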
Integer overflow while flattening dependent join
SELECT (SELECT c0 OFFSET 1) FROM (VALUES(1)) c0;
src/planner/subquery/flatten_dependent_join.cpp:397:90: runtime error: signed integer overflow: 1 + 9223372036854775807 cannot be represented in type 'long' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/planner/subquery/flatten_dependent_join.cpp:397:90 in
The query should output a single NULL value.
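The sanitizer report is about the signed addition `1 + 9223372036854775807`. A minimal sketch of an overflow-checked addition is shown below. `TryAddInt64` is a hypothetical helper name, and how the planner should react to a failed addition is left open.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper (not a DuckDB function): adds two signed 64-bit values
// and returns std::nullopt instead of overflowing, since signed overflow is
// undefined behavior in C++.
std::optional<int64_t> TryAddInt64(int64_t lhs, int64_t rhs) {
    if (rhs > 0 && lhs > std::numeric_limits<int64_t>::max() - rhs) {
        return std::nullopt; // result would exceed INT64_MAX
    }
    if (rhs < 0 && lhs < std::numeric_limits<int64_t>::min() - rhs) {
        return std::nullopt; // result would drop below INT64_MIN
    }
    return lhs + rhs;
}
```

The caller can then decide whether to clamp the value or raise an error when `std::nullopt` comes back.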
NOT SIMILAR TO at FILTER clause
DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT count(*) FILTER (WHERE 0 NOT SIMILAR TO '2' = FALSE);");
src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed
I can only reproduce it via the C-API. I think the internal ToString call doesn't match the parsed query.
The query should output a single 0 value.
Escaped trim function call?
DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT \"trim\"(1);");
src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed
Same issue as the previous one?
The query should output a single 1 value.
Partition by UTF-8 string
Because the query uses non-printable UTF-8 characters, I upload it in a zip file: window_issue.zip
/usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:9: runtime error: reference binding to null pointer of type 'duckdb::RowDataBlock' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:9 in
The query should output 11 rows with the value 11.
Floating-point validation
DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT 0E6");
con.SendQuery("SELECT .0E7382504816742;");
src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed
ToString method not matching again?
IS DISTINCT FROM validation
DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT (0 IS DISTINCT FROM 2) = 0;");
con.SendQuery("SELECT CASE 8 WHEN(0 IS DISTINCT FROM 0) THEN 2 END;");
src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed
Another validation issue. The first query should output a single false value and the second a single NULL value.
UndefinedBehaviorSanitizer on downcast from correlated subquery
CREATE TABLE t0(c0 INT);
SELECT count(*) OVER() = ANY(SELECT * FROM t0 t1 WHERE(c0 = t0.c0)) FROM t0;
src/optimizer/deliminator.cpp:38:19: runtime error: downcast of address 0x608000009120 which does not point to an object of type 'duckdb::BoundColumnRefExpression' 0x608000009120: note: object is of type 'duckdb::BoundCastExpression' 00 00 00 00 50 d0 ea 09 20 56 00 00 0c 1b be be be be be be 40 91 00 00 80 60 00 00 00 00 00 00 ^~~~~~~~~~~~~~~~~~~~~~~ vptr for 'duckdb::BoundCastExpression' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/optimizer/deliminator.cpp:38:19 in
An empty result set is expected.
Assertion error on correlated cube
CREATE TABLE t0(c0 INT);
INSERT INTO t0 VALUES(NULL);
SELECT (SELECT count(*) OVER() GROUP BY CUBE(c0)) FROM t0;
duckdb: src/execution/expression_executor.cpp:146: void duckdb::ExpressionExecutor::Execute(const duckdb::Expression &, duckdb::ExpressionState *, const duckdb::SelectionVector *, duckdb::idx_t, duckdb::Vector &): Assertion `FlatVector::Validity(result).CheckAllValid(count)' failed.
Because the cube returns more than one row, here the result may not be deterministic.
Limit 0% on ANY subquery
SELECT 1 WHERE 1 < ANY(SELECT 2 LIMIT 0%);
The query outputs 1. Because of LIMIT 0%, I expect the ANY subquery to be empty, so the predicate should not hold and the query result should be empty.
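The expected semantics can be sketched in a few lines: a quantified ANY comparison over an empty set is never satisfied. `LessThanAny` is a hypothetical name, and NULL handling is left out to keep the sketch small.

```cpp
#include <algorithm>
#include <vector>

// Models `lhs < ANY(subquery)` over non-NULL integers. The function name is
// an assumption for illustration. std::any_of returns false on an empty range.
bool LessThanAny(int lhs, const std::vector<int> &subquery_rows) {
    return std::any_of(subquery_rows.begin(), subquery_rows.end(),
                       [lhs](int rhs) { return lhs < rhs; });
}
```

Here `LessThanAny(1, {})` models the `LIMIT 0%` case, where the subquery produces no rows, and `LessThanAny(1, {2})` models the same query without the limit.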
Correlated offset subquery
SELECT (SELECT 1 OFFSET c0) FROM (VALUES(1)) c0;
src/execution/column_binding_resolver.cpp:74: virtual unique_ptr<duckdb::Expression> duckdb::ColumnBindingResolver::VisitReplace(duckdb::BoundColumnRefExpression &, unique_ptr<duckdb::Expression> *): Assertion `expr.depth == 0' failed.
The query should output NULL.
Missing error message at subquery
SELECT (WITH t2 AS (SELECT 3 WHERE count(*) FILTER (1)) SELECT 0 FROM t2);
src/execution/expression_executor.cpp:196: duckdb::idx_t duckdb::ExpressionExecutor::Select(const duckdb::Expression &, duckdb::ExpressionState *, const duckdb::SelectionVector *, duckdb::idx_t, duckdb::SelectionVector *, duckdb::SelectionVector *): Assertion `expr.return_type.id() == LogicalTypeId::BOOLEAN' failed.
This one is tricky: count(*) should bind in the inner query, and then the error about aggregates in the WHERE clause should be thrown.
NaN as LIMIT % (now fixed)
SELECT 1 LIMIT CAST('NaN' AS REAL)%;
/src/execution/operator/helper/physical_limit_percent.cpp:121:18: runtime error: nan is outside the range of representable values of type 'unsigned long' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/execution/operator/helper/physical_limit_percent.cpp:121:18
A proper error is missing here.
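The undefined behavior comes from converting NaN to an unsigned integer. A minimal sketch of validating the percentage before the conversion follows. `LimitPercentToRows` is a hypothetical helper, and the accepted range of 0 to 100 is an assumption of this sketch rather than DuckDB's documented rule.

```cpp
#include <cmath>
#include <cstdint>
#include <optional>

// Hypothetical helper: converts a LIMIT percentage into a row count.
// Returns std::nullopt for inputs that cannot be converted safely, so the
// caller can raise a user-facing error. The 0..100 range is an assumption.
std::optional<uint64_t> LimitPercentToRows(double percent, uint64_t total_rows) {
    if (std::isnan(percent) || percent < 0.0 || percent > 100.0) {
        return std::nullopt;
    }
    double rows = static_cast<double>(total_rows) * (percent / 100.0);
    if (rows >= static_cast<double>(total_rows)) {
        return total_rows; // avoids casting a value that may not fit
    }
    return static_cast<uint64_t>(rows);
}
```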
Simple multiplication issue
DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT 1 * (1 < 1);");
src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed
I didn't have time to look at the details of this yet.
Duplicate table name at CTE
CREATE TABLE t0(c0 INT);
WITH t0 AS (SELECT 2) INSERT INTO t0 (WITH t0 AS (SELECT 2) SELECT 2);
Error: INTERNAL Error: Duplicate CTE "t0" in query!
This one may be my mistake. If we consider both CTEs to be at the same level, then the error shouldn't be internal.
Order by ALL with UNION query
(SELECT 2 ORDER BY(SELECT 2)) UNION SELECT 1 ORDER BY ALL;
src/planner/binder/query_node/bind_select_node.cpp:241: void duckdb::Binder::BindModifierTypes(duckdb::BoundQueryNode &, const vector<duckdb::LogicalType> &, duckdb::idx_t): Assertion `bound_colref.binding.column_index < sql_types.size()' failed
The query should output the rows with values 1 and 2.
Date = int optimized vs non optimized
CREATE TABLE t0 (c0 INT);
PRAGMA enable_optimizer;
SELECT 1 FROM t0 WHERE DATE '2010-1-1' = 2;
-- Error: Conversion Error: Unimplemented type for cast (INTEGER -> DATE)
PRAGMA disable_optimizer;
SELECT 1 FROM t0 WHERE DATE '2010-1-1' = 2;
-- Empty result
This one is tricky. I would expect the error to be thrown even without any optimization.
Another similar issue:
PRAGMA enable_optimizer;
VALUES (1),(INTERVAL '1' MICROSECONDS) LIMIT 0;
-- Empty result
PRAGMA disable_optimizer;
VALUES (1),(INTERVAL '1' MICROSECONDS) LIMIT 0;
-- Error: Conversion Error: Unimplemented type for cast (INTEGER -> INTERVAL)
Here the behavior is reversed, and it is what I expect: the failing cast is pruned when the optimizer is enabled.
Another similar issue:
PRAGMA enable_optimizer;
SELECT 1 FROM (VALUES (1),(2),(NULL)) t0(c0) WHERE c0 BETWEEN 3 AND (CAST('inf' AS REAL) - 2);
-- Error: Out of Range Error: Overflow in subtraction of float!
PRAGMA disable_optimizer;
SELECT 1 FROM (VALUES (1),(2),(NULL)) t0(c0) WHERE c0 BETWEEN 3 AND (CAST('inf' AS REAL) - 2);
-- Empty result
This seems to happen with errors that are only raised at runtime.
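One possible reading of these differences, stated as an assumption rather than a confirmed diagnosis, is that an expression which fails at runtime only raises its error if something actually evaluates it. The toy sketch below contrasts row-at-a-time filtering, which never calls the predicate on empty input, with eager folding, which evaluates it once up front. All names are hypothetical.

```cpp
#include <functional>
#include <stdexcept>
#include <vector>

// Stand-in for a filter expression whose evaluation fails at runtime,
// for example an unsupported cast. Hypothetical, for illustration only.
bool FailingPredicate(int) {
    throw std::runtime_error("Conversion Error");
}

// Row-at-a-time filtering: the predicate is only invoked for rows that exist,
// so an empty input never triggers the error.
std::vector<int> FilterRows(const std::vector<int> &rows,
                            const std::function<bool(int)> &pred) {
    std::vector<int> out;
    for (int row : rows) {
        if (pred(row)) {
            out.push_back(row);
        }
    }
    return out;
}

// Eager folding: a row-independent predicate is evaluated once, up front,
// so the error surfaces even when there are no rows.
std::vector<int> FilterRowsFolded(const std::vector<int> &rows,
                                  const std::function<bool(int)> &pred) {
    bool keep = pred(0);
    return keep ? rows : std::vector<int>{};
}
```

Under this reading, whether the error appears depends on which plan reaches the failing expression, which would match the first and third examples; in the second example the optimizer prunes the expression instead.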
Select with forced parallelism
PRAGMA verify_parallelism;
CREATE TABLE t0 (c0 INT, c1 INT);
SELECT c1 FROM t0 WHERE (c0 + c1) = 2;
src/storage/data_table.cpp:333: bool duckdb::DataTable::NextParallelScan(duckdb::ClientContext &, duckdb::ParallelTableScanState &, duckdb::TableScanState &, const vector<duckdb::column_t> &): Assertion `vector_index * STANDARD_VECTOR_SIZE < state.current_row_group->count' failed
My test machine has 32 cores, and I am using 1024 as the vector size.
The documentation still mentions the pragma as force_parallelism. It needs to be updated.
OFFSET query with wrong results
CREATE TABLE t0 (c0 INT, c1 INT);
INSERT INTO t0 (VALUES (1, 1),(2, 2),(3, 3));
SELECT c0 FROM t0 WHERE ((c0 + c1) = 2) OFFSET 10;
The query outputs 1, but the result should be empty.
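The expected OFFSET behavior can be sketched as below: once the offset reaches the number of available rows, the result is empty. `ApplyOffset` is a hypothetical helper operating on an in-memory vector, not DuckDB's operator.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the rows that remain after skipping `offset`
// rows. An offset at or beyond the row count yields an empty result without
// stepping through `offset` positions.
std::vector<int> ApplyOffset(const std::vector<int> &rows, uint64_t offset) {
    if (offset >= rows.size()) {
        return {};
    }
    return std::vector<int>(rows.begin() + static_cast<std::ptrdiff_t>(offset),
                            rows.end());
}
```

In this model the filtered input holds one row, so `ApplyOffset({1}, 10)` is empty. The same early exit would apply to a very large offset.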
OFFSET query taking too long
CREATE TABLE t0 (c0 INT, c1 INT);
INSERT INTO t0 (VALUES (1, 1),(2, 2),(3, 3));
SELECT c0 FROM t0 OFFSET 42949672960;
The SELECT query runs for a very long time. I would expect it to end very shortly.
While running this on the shell and sending the interrupt signal to it, I get: src/common/allocator.cpp:144: duckdb::AllocatorDebugInfo::~AllocatorDebugInfo(): Assertion `allocation_count == 0' failed
Maybe this is another unrelated issue.
heap-buffer-overflow with ART index
CREATE TABLE t0 (c0 INT AS (1), c1 INT);
CREATE INDEX i0 ON t0 USING ART ((c0 + c1));
With the address sanitizer enabled, this results in a heap-buffer-overflow at src/common/types.cpp:36:17.
NULL pointer on complex OFFSET clause
SELECT 6 OFFSET count(*) FILTER ((SELECT 2 UNION (SELECT 2) OFFSET (SELECT LAST))) OVER ();
src/planner/expression_binder/order_binder.cpp:39:61: runtime error: member call on null pointer of type 'std::vector<std::unique_ptr<duckdb::ParsedExpression>>' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/planner/expression_binder/order_binder.cpp:39:61
Analyze nonexistent column
CREATE TABLE t0(c0 TINYINT);
ANALYZE t0(c4);
src/planner/binder/statement/bind_vacuum.cpp:35: duckdb::BoundStatement duckdb::Binder::Bind(duckdb::VacuumStatement &): Assertion `!result.HasError()' failed
A proper error should be thrown.
Analyze rowid column
CREATE TABLE t0(c0 INT);
ANALYZE t0(rowid);
src/storage/data_table.cpp:1319: void duckdb::DataTable::SetStatistics(duckdb::column_t, const std::function<void (BaseStatistics &)> &): Assertion `column_id != COLUMN_IDENTIFIER_ROW_ID' failed
A proper error should be thrown.
Alter table statements with rowid column
CREATE TABLE t0(c0 INTEGER, c1 INTEGER);
ALTER TABLE t0 DROP COLUMN rowid;
ALTER TABLE t0 RENAME rowid TO ups;
ALTER TABLE t0 ALTER rowid TYPE VARCHAR;
ALTER TABLE t0 ALTER rowid SET DEFAULT 0;
/usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:34: runtime error: addition of unsigned offset to 0x611000049880 overflowed to 0x611000049818 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:34
A proper error should be thrown.
C-API invalid query
DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("INSERT INTO t0(VALUES(0 = (OVER = 2)));");
The process terminates while throwing the error.
A proper error should be thrown.
In clause optimization error
CREATE TABLE t0(c0 INT);
SELECT 0 FROM t0 WHERE $0 IN (1);
src/optimizer/in_clause_rewriter.cpp:37: virtual unique_ptr<duckdb::Expression> duckdb::InClauseRewriter::VisitReplace(duckdb::BoundOperatorExpression &, unique_ptr<duckdb::Expression> *): Assertion `expr.children[i]->return_type == in_type' failed
A proper error should be thrown.
Lag window function issue
SELECT lag(1) OVER (ORDER BY 0 RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING);
src/include/duckdb/common/types/vector.hpp:298: static bool duckdb::FlatVector::IsNull(const duckdb::Vector &, duckdb::idx_t): Assertion `vector.GetVectorType() == VectorType::FLAT_VECTOR' failed
A single NULL value is expected.
Statistics propagation mismatch
PRAGMA enable_verification;
SELECT CAST("sum"(1) AS BIGINT) FROM (SELECT CAST((SELECT EXISTS(SELECT 1)) AS BIGINT));
Error: INTERNAL Error: Statistics mismatch: value is smaller than min. Statistics: [Min: 42, Max: 42][Has Null: false, Has No Null: true][Approx Unique: 1] Vector: CONSTANT BIGINT: 1 = [ 1]
C-API IN query row index out chunk size
DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT(SELECT DATE '3764356-1-1' IN(1)) HAVING 0;");
src/common/types/column_data_collection.cpp:98: duckdb::Value duckdb::ColumnDataRow::GetValue(duckdb::idx_t) const: Assertion `row_index < chunk.size()' failed.
I would expect an unimplemented type for cast error.
Environment:
- OS: Linux
- DuckDB Version: Latest from the development master branch
- DuckDB Client: Shell and C-API
Identity Disclosure:
- Full Name: Pedro Ferreira
- Affiliation: Huawei
I've made progress on this in #4474, you can assign me to this issue @hannes
I am looking at Database collation not reloaded?
But I am having trouble reproducing the issue. AF is not a recognized collation name, so the first step already fails:
CREATE TABLE t0 (c0 VARCHAR(0) COLLATE AF);
================================================================================
Catalog Error: Collation with name af does not exist!
Did you mean "nfc"?
When using a recognized collation, no error occurs, even after a restart.
I think I have the ICU extension enabled somehow, because AF shows up in the list returned by PRAGMA collations;
C-API IN query row index out chunk size
This is not actually a C-API-specific issue. I can reproduce it with the unittester as well:
# name: test/fuzzer/pedro/force_no_cross_product.test
# group: [pedro]
statement ok
PRAGMA enable_verification
query I
SELECT(SELECT DATE '3764356-1-1' IN(1)) HAVING 0;
----
The assertion is no longer triggered, but the optimized and non-optimized results differ:
SQL Query
SELECT(SELECT DATE '3764356-1-1' IN(1)) HAVING 0;
================================================================================
Actual result:
Invalid Error: Unoptimized statement differs from original result!
Original Result:
(SELECT (CAST('3764356-1-1' AS DATE) IN (1)))
BOOLEAN
[ Rows: 0]
Unoptimized:
Conversion Error: Unimplemented type for cast (INTEGER -> DATE)
I am noticing something strange in the capi today. Attempting to run a comment gives an assertion error:
src/common/types/column_data_collection.cpp:79: void duckdb::ColumnDataCollection::Initialize(vector<duckdb::LogicalType>): Assertion `!types.empty()' failed
DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("--ups");
The issue in Floating-point validation is that the result is parsed back as decimal after the ToString roundtrip. It is not immediately apparent to me how to solve this elegantly. I generally think it doesn't make sense to validate the query like this, since ToString is not invertible, and we should probably remove it: https://github.com/duckdb/duckdb/blob/0d2d7930d2789405a0d07a15e37485fe70faee3e/src/verification/statement_verifier.cpp#L76-L81
The same issue is present with maps (which share literal syntax with, and are parsed back as, structs):
statement ok
PRAGMA enable_verification;
query I
SELECT map([1, 5], ['a', 'e'])
----
{1=a, 5=e}
INTERNAL Error: Assertion triggered in file "[...]/src/verification/statement_verifier.cpp" on line 80: parsed_list[0]->Equals(select_list[i].get())
I suspect it will also break for any extension functions that return custom logical type aliases.
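The roundtrip problem described in this comment can be illustrated with a toy model. The classification rule and the printing function below are assumptions made for the sketch and are not DuckDB's parser or its ToString: a literal with an exponent is treated as DOUBLE, one with only a decimal point as DECIMAL.

```cpp
#include <string>

enum class LiteralType { INTEGER, DECIMAL, DOUBLE };

// Toy classification rule (an assumption, not DuckDB's parser): an exponent
// makes the literal a DOUBLE, a bare decimal point makes it a DECIMAL.
LiteralType ClassifyLiteral(const std::string &text) {
    if (text.find_first_of("eE") != std::string::npos) {
        return LiteralType::DOUBLE;
    }
    if (text.find('.') != std::string::npos) {
        return LiteralType::DECIMAL;
    }
    return LiteralType::INTEGER;
}

// Toy ToString: prints the value in fixed notation, which drops the exponent.
std::string ToyToString(double value) {
    return std::to_string(value);
}

// Parses a literal, prints it, re-classifies the printed text, and reports
// whether the literal type survived the roundtrip.
bool RoundTripPreservesType(const std::string &literal) {
    LiteralType original = ClassifyLiteral(literal);
    std::string printed = ToyToString(std::stod(literal));
    return ClassifyLiteral(printed) == original;
}
```

In this model `0E6` starts out as DOUBLE but its printed form has no exponent, so it is re-classified as DECIMAL and an equality check on the two expressions would fail.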
To help with coordination, I'm splitting this into multiple issues and will track them on this project. Please file separate issues if you find any more fuzzing bugs instead of updating this thread.
We should probably close this as well, as soon as all the existing PRs referencing it are closed.
Sure, I can create new issues from now on. One thing I mentioned on Discord is this issue:
SELECT e'\xF0\x9F\x98\x84' = '😄';
On PostgreSQL this gives true, while on DuckDB I get a parser error. I would expect both representations to match.
We currently don't support the e'' syntax, so this is expected behaviour. Inconsistency with Postgres is not always a bug.
So what's the way to input UTF-8 strings using hex notation?
This works:
SELECT decode('\xF0\x9F\x98\x84'::BLOB) = '😄';
Ok, thanks
This has been converted to a bunch of different issues.