Fuzzer issues 2

Open PedroTadim opened this issue 3 years ago • 0 comments

The issues

This is the follow-up to #4067. I will keep updating this issue with new bugs the fuzzers find.

If exists flag ignored for alter statements

ALTER TABLE IF EXISTS t0 RENAME TO t1;
ALTER SEQUENCE IF EXISTS seq OWNED BY x;

Error: Catalog Error: Table with name t0 does not exist!

The error shouldn't be thrown.

If not exists flag ignored for alter table add column statement

CREATE TABLE t0 (c0 INT);
ALTER TABLE t0 ADD COLUMN IF NOT EXISTS c0 INT;

Error: Catalog Error: Column with name c0 already exists!

The error shouldn't be thrown.

Generate series with NULL value (now fixed)

SELECT c0 FROM generate_series(NULL) t3(c0);
SELECT c0 FROM range(NULL) t3(c0);

Error: INTERNAL Error: Calling GetValue on a value that is NULL

A non-internal error should probably be thrown here. Scalar functions already handle NULL inputs separately; for table functions this handling may be missing, e.g. an optional check that throws an error on any NULL input.

Integer overflow while flattening dependent join

SELECT (SELECT c0 OFFSET 1) FROM (VALUES(1)) c0;

src/planner/subquery/flatten_dependent_join.cpp:397:90: runtime error: signed integer overflow: 1 + 9223372036854775807 cannot be represented in type 'long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/planner/subquery/flatten_dependent_join.cpp:397:90 in

The query should output a single NULL value.
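
The sanitizer report for this query comes down to adding 1 to the largest signed 64-bit value. As a minimal sketch of how such an addition can be guarded, the helper below refuses to add when the result would not fit. `try_add_i64` is a hypothetical name chosen for illustration; it is not a DuckDB function and not necessarily how the bug was fixed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of DuckDB): adds two signed 64-bit integers.
// Returns false and leaves `result` untouched if the sum would overflow.
bool try_add_i64(int64_t a, int64_t b, int64_t &result) {
    const int64_t max_value = std::numeric_limits<int64_t>::max();
    const int64_t min_value = std::numeric_limits<int64_t>::min();
    if (b > 0 && a > max_value - b) {
        return false; // a + b would exceed the maximum
    }
    if (b < 0 && a < min_value - b) {
        return false; // a + b would fall below the minimum
    }
    result = a + b;
    return true;
}
```

With this guard, `try_add_i64(1, std::numeric_limits<int64_t>::max(), r)` returns false instead of invoking undefined behavior, so the caller can decide how to handle the case.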

NOT SIMILAR TO at FILTER clause

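#include "duckdb.hpp" // the later C++ snippets assume this include and namespace
using namespace duckdb;
int main() {
    DuckDB db(nullptr); // in-memory database
    Connection con(db);
    con.EnableQueryVerification();
    con.SendQuery("SELECT count(*) FILTER (WHERE 0 NOT SIMILAR TO '2' = FALSE);");
}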

src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed

I can only reproduce it via the C-API. I think the internal ToString call doesn't match the parsed query.

The query should output a single 0 value.

Escaped trim function call?

DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT \"trim\"(1);");

src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed

Same issue as the previous one?

The query should output a single 1 value.

Partition by UTF-8 string

Because the query uses non-printable UTF-8 characters, I upload it in a zip file: window_issue.zip

/usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:9: runtime error: reference binding to null pointer of type 'duckdb::RowDataBlock'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:9 in

The query should output 11 11 values.

Floating-point validation

DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT 0E6");
con.SendQuery("SELECT .0E7382504816742;");

src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed

ToString method not matching again?

IS DISTINCT FROM validation

DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT (0 IS DISTINCT FROM 2) = 0;");
con.SendQuery("SELECT CASE 8 WHEN(0 IS DISTINCT FROM 0) THEN 2 END;");

src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed

Another validation issue. The first query should output a single false value and the second a single NULL value.

UndefinedBehaviorSanitizer on downcast from correlated subquery

CREATE TABLE t0(c0 INT);
SELECT count(*) OVER() = ANY(SELECT * FROM t0 t1 WHERE(c0 = t0.c0)) FROM t0;

src/optimizer/deliminator.cpp:38:19: runtime error: downcast of address 0x608000009120 which does not point to an object of type 'duckdb::BoundColumnRefExpression'
0x608000009120: note: object is of type 'duckdb::BoundCastExpression'
00 00 00 00 50 d0 ea 09 20 56 00 00 0c 1b be be be be be be 40 91 00 00 80 60 00 00 00 00 00 00
            ^~~~~~~~~~~~~~~~~~~~~~~
            vptr for 'duckdb::BoundCastExpression'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/optimizer/deliminator.cpp:38:19 in

An empty result set is expected.

Assertion error on correlated cube

CREATE TABLE t0(c0 INT);
INSERT INTO t0 VALUES(NULL);
SELECT (SELECT count(*) OVER() GROUP BY CUBE(c0)) FROM t0;

duckdb: src/execution/expression_executor.cpp:146: void duckdb::ExpressionExecutor::Execute(const duckdb::Expression &, duckdb::ExpressionState *, const duckdb::SelectionVector *, duckdb::idx_t, duckdb::Vector &): Assertion `FlatVector::Validity(result).CheckAllValid(count)' failed.

Because the cube returns more than one row, here the result may not be deterministic.

Limit 0% on ANY subquery

SELECT 1 WHERE 1 < ANY(SELECT 2 LIMIT 0%);

The query outputs 1. Because of LIMIT 0%, I expect the ANY subquery to be empty, so the predicate should be false and the query should return no rows.
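
The expectation follows from the SQL rule that a comparison against ANY over an empty set is false. A toy model of that rule, ignoring NULL handling, is sketched below. `less_than_any` is a hypothetical function for illustration only and does not mirror DuckDB's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model of `lhs < ANY(rhs)` that ignores NULLs (hypothetical, not DuckDB code).
// std::any_of returns false for an empty range, matching the SQL rule
// that ANY over an empty subquery result is false.
bool less_than_any(int lhs, const std::vector<int> &rhs) {
    return std::any_of(rhs.begin(), rhs.end(),
                       [lhs](int value) { return lhs < value; });
}
```

Here `less_than_any(1, {2})` is true, while `less_than_any(1, {})` is false, which corresponds to the subquery `SELECT 2 LIMIT 0%` producing no rows.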

Correlated offset subquery

SELECT (SELECT 1 OFFSET c0) FROM (VALUES(1)) c0;

src/execution/column_binding_resolver.cpp:74: virtual unique_ptr<duckdb::Expression> duckdb::ColumnBindingResolver::VisitReplace(duckdb::BoundColumnRefExpression &, unique_ptr<duckdb::Expression> *): Assertion `expr.depth == 0' failed.

The query should output NULL.

Missing error message at subquery

SELECT (WITH t2 AS (SELECT 3 WHERE count(*) FILTER (1)) SELECT 0 FROM t2);

src/execution/expression_executor.cpp:196: duckdb::idx_t duckdb::ExpressionExecutor::Select(const duckdb::Expression &, duckdb::ExpressionState *, const duckdb::SelectionVector *, duckdb::idx_t, duckdb::SelectionVector *, duckdb::SelectionVector *): Assertion `expr.return_type.id() == LogicalTypeId::BOOLEAN' failed.

This one is tricky: count(*) should bind in the inner query, and then the error about aggregates in the WHERE clause should be thrown.

NaN as LIMIT % (now fixed)

SELECT 1 LIMIT CAST('NaN' AS REAL)%;

/src/execution/operator/helper/physical_limit_percent.cpp:121:18: runtime error: nan is outside the range of representable values of type 'unsigned long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/execution/operator/helper/physical_limit_percent.cpp:121:18

An error should be thrown here instead.
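
Converting a NaN to an unsigned integer is undefined behavior in C++, which is what the sanitizer flags. A minimal sketch of validating the percentage before the conversion follows. The helper name `limit_percent_to_rows` and the accepted range of 0 to 100 are assumptions made for illustration, not DuckDB's actual code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not DuckDB code): computes how many of `total_rows`
// rows a `LIMIT percent%` keeps. Returns false for NaN or for values
// outside the assumed valid range [0, 100], so the caller can raise an error.
bool limit_percent_to_rows(double percent, uint64_t total_rows, uint64_t &rows) {
    if (std::isnan(percent) || percent < 0.0 || percent > 100.0) {
        return false;
    }
    const double scaled = static_cast<double>(total_rows) * percent / 100.0;
    // Clamp so the cast below never sees a value outside the uint64_t range.
    if (scaled >= static_cast<double>(total_rows)) {
        rows = total_rows;
    } else {
        rows = static_cast<uint64_t>(scaled);
    }
    return true;
}
```

The NaN check has to come before any cast, because comparisons with NaN are always false and would otherwise let the value slip through a plain range check written the other way around.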

Simple multiplication issue

DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT 1 * (1 < 1);");

src/main/client_context.cpp:931: std::string duckdb::ClientContext::VerifyQuery(duckdb::ClientContextLock &, const std::string &, unique_ptr<duckdb::SQLStatement>): Assertion `orig_expr_list[i]->Equals(verify_statements[v_idx].select_list[i].get())' failed

I didn't have time to look at the details of this yet.

Duplicate table name at CTE

CREATE TABLE t0(c0 INT);
WITH t0 AS (SELECT 2) INSERT INTO t0 (WITH t0 AS (SELECT 2) SELECT 2);

Error: INTERNAL Error: Duplicate CTE "t0" in query!

This one maybe is mine. If we consider both CTEs to be at the same level, then the error shouldn't be internal.

Order by ALL with UNION query

(SELECT 2 ORDER BY(SELECT 2)) UNION SELECT 1 ORDER BY ALL;

src/planner/binder/query_node/bind_select_node.cpp:241: void duckdb::Binder::BindModifierTypes(duckdb::BoundQueryNode &, const vector<duckdb::LogicalType> &, duckdb::idx_t): Assertion `bound_colref.binding.column_index < sql_types.size()' failed

The query should output the rows with values 1 and 2.

Date = int optimized vs non optimized

CREATE TABLE t0 (c0 INT);
PRAGMA enable_optimizer;
SELECT 1 FROM t0 WHERE DATE '2010-1-1' = 2;
-- Error: Conversion Error: Unimplemented type for cast (INTEGER -> DATE)
PRAGMA disable_optimizer;
SELECT 1 FROM t0 WHERE DATE '2010-1-1' = 2;
-- Empty result

This one is tricky. I would expect the error to be thrown without any optimization as well?

Another similar issue:

PRAGMA enable_optimizer;
VALUES (1),(INTERVAL '1' MICROSECONDS) LIMIT 0;
-- Empty result
PRAGMA disable_optimizer;
VALUES (1),(INTERVAL '1' MICROSECONDS) LIMIT 0;
-- Error: Conversion Error: Unimplemented type for cast (INTEGER -> INTERVAL)

Here the behavior is reversed, and it is what I expect: with the optimizer enabled, the VALUES list is pruned and no error is raised.

Another similar issue:

PRAGMA enable_optimizer;
SELECT 1 FROM (VALUES (1),(2),(NULL)) t0(c0) WHERE c0 BETWEEN 3 AND (CAST('inf' AS REAL) - 2);
-- Error: Out of Range Error: Overflow in subtraction of float!
PRAGMA disable_optimizer;
SELECT 1 FROM (VALUES (1),(2),(NULL)) t0(c0) WHERE c0 BETWEEN 3 AND (CAST('inf' AS REAL) - 2);
-- Empty result

This seems to be happening with runtime errors.

Select with forced parallelism

PRAGMA verify_parallelism;
CREATE TABLE t0 (c0 INT, c1 INT);
SELECT c1 FROM t0 WHERE (c0 + c1) = 2;

src/storage/data_table.cpp:333: bool duckdb::DataTable::NextParallelScan(duckdb::ClientContext &, duckdb::ParallelTableScanState &, duckdb::TableScanState &, const vector<duckdb::column_t> &): Assertion `vector_index * STANDARD_VECTOR_SIZE < state.current_row_group->count' failed

My test machine has 32 cores, and I am using 1024 as the vector size.

The documentation still mentions the pragma as force_parallelism. It needs to be updated.

OFFSET query with wrong results

CREATE TABLE t0 (c0 INT, c1 INT);
INSERT INTO t0 (VALUES (1, 1),(2, 2),(3, 3));
SELECT c0 FROM t0 WHERE ((c0 + c1) = 2) OFFSET 10;

The query outputs 1; it should return an empty result instead.

OFFSET query taking too long

CREATE TABLE t0 (c0 INT, c1 INT);
INSERT INTO t0 (VALUES (1, 1),(2, 2),(3, 3));
SELECT c0 FROM t0 OFFSET 42949672960;

The SELECT query runs for a very long time. I would expect it to end very shortly.

While running this on the shell and sending the interrupt signal to it, I get: src/common/allocator.cpp:144: duckdb::AllocatorDebugInfo::~AllocatorDebugInfo(): Assertion `allocation_count == 0' failed

Maybe this is another unrelated issue.

heap-buffer-overflow with ART index

CREATE TABLE t0 (c0 INT AS (1), c1 INT);
CREATE INDEX i0 ON t0 USING ART ((c0 + c1));

With the address sanitizer, this results in a heap-buffer-overflow at src/common/types.cpp:36:17.

NULL pointer on complex OFFSET clause

SELECT 6 OFFSET count(*) FILTER ((SELECT 2 UNION (SELECT 2) OFFSET (SELECT LAST))) OVER ();

src/planner/expression_binder/order_binder.cpp:39:61: runtime error: member call on null pointer of type 'std::vector<std::unique_ptr<duckdb::ParsedExpression>>'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/planner/expression_binder/order_binder.cpp:39:61

Analyze nonexistent column

CREATE TABLE t0(c0 TINYINT);
ANALYZE t0(c4);

src/planner/binder/statement/bind_vacuum.cpp:35: duckdb::BoundStatement duckdb::Binder::Bind(duckdb::VacuumStatement &): Assertion `!result.HasError()' failed

A proper error should be thrown.

Analyze rowid column

CREATE TABLE t0(c0 INT);
ANALYZE t0(rowid);

src/storage/data_table.cpp:1319: void duckdb::DataTable::SetStatistics(duckdb::column_t, const std::function<void (BaseStatistics &)> &): Assertion `column_id != COLUMN_IDENTIFIER_ROW_ID' failed

A proper error should be thrown.

Alter table statements with rowid column

CREATE TABLE t0(c0 INTEGER, c1 INTEGER);
ALTER TABLE t0 DROP COLUMN rowid;
ALTER TABLE t0 RENAME rowid TO ups;
ALTER TABLE t0 ALTER rowid TYPE VARCHAR;
ALTER TABLE t0 ALTER rowid SET DEFAULT 0;

/usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:34: runtime error: addition of unsigned offset to 0x611000049880 overflowed to 0x611000049818
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1046:34

A proper error should be thrown.

C-API invalid query

DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("INSERT INTO t0(VALUES(0 = (OVER = 2)));");

The process terminates while throwing the error.

A proper error should be thrown.

In clause optimization error

CREATE TABLE t0(c0 INT);
SELECT 0 FROM t0 WHERE $0 IN (1);

src/optimizer/in_clause_rewriter.cpp:37: virtual unique_ptr<duckdb::Expression> duckdb::InClauseRewriter::VisitReplace(duckdb::BoundOperatorExpression &, unique_ptr<duckdb::Expression> *): Assertion `expr.children[i]->return_type == in_type' failed

A proper error should be thrown.

Lag window function issue

SELECT lag(1) OVER (ORDER BY 0 RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING);

src/include/duckdb/common/types/vector.hpp:298: static bool duckdb::FlatVector::IsNull(const duckdb::Vector &, duckdb::idx_t): Assertion `vector.GetVectorType() == VectorType::FLAT_VECTOR' failed

A single NULL value is expected.

Statistics propagation mismatch

PRAGMA enable_verification;
SELECT CAST("sum"(1) AS BIGINT) FROM (SELECT CAST((SELECT EXISTS(SELECT 1)) AS BIGINT));

Error: INTERNAL Error: Statistics mismatch: value is smaller than min. Statistics: [Min: 42, Max: 42][Has Null: false, Has No Null: true][Approx Unique: 1] Vector: CONSTANT BIGINT: 1 = [ 1]

C-API IN query row index out chunk size

DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("SELECT(SELECT DATE '3764356-1-1' IN(1)) HAVING 0;");

src/common/types/column_data_collection.cpp:98: duckdb::Value duckdb::ColumnDataRow::GetValue(duckdb::idx_t) const: Assertion `row_index < chunk.size()' failed.

I would expect an unimplemented type for cast error.

Environment:

  • OS: Linux
  • DuckDB Version: Latest from the development master branch
  • DuckDB Client: Shell and C-API

Identity Disclosure:

  • Full Name: Pedro Ferreira
  • Affiliation: Huawei

PedroTadim, Jul 18 '22 08:07

I've made progress on this in #4474. You can assign me to this issue, @hannes.

Maxxen, Aug 23 '22 13:08

I am looking at "Database collation not reloaded?", but I am having trouble reproducing the issue. AF is not a recognized collation name, so the first step already fails to complete:

CREATE TABLE t0 (c0 VARCHAR(0) COLLATE AF);
================================================================================
Catalog Error: Collation with name af does not exist!
Did you mean "nfc"?

When using a recognized collation, no error occurs, even after a restart.

Tishj, Aug 25 '22 07:08

I think I have the ICU extension enabled somehow, because AF shows up in the list returned by PRAGMA collations;.

PedroTadim, Aug 25 '22 07:08

Regarding "C-API IN query row index out chunk size": this is not actually a C-API-specific issue. I can reproduce it with the unittester as well:

# name: test/fuzzer/pedro/force_no_cross_product.test
# group: [pedro]

statement ok
PRAGMA enable_verification

query I
SELECT(SELECT DATE '3764356-1-1' IN(1)) HAVING 0;
----

The assertion is no longer triggered, but the optimized and non-optimized results differ:

SQL Query
SELECT(SELECT DATE '3764356-1-1' IN(1)) HAVING 0;
================================================================================
Actual result:
Invalid Error: Unoptimized statement differs from original result!
Original Result:
(SELECT (CAST('3764356-1-1' AS DATE) IN (1)))
BOOLEAN
[ Rows: 0]

Unoptimized:
Conversion Error: Unimplemented type for cast (INTEGER -> DATE)

Tishj, Aug 25 '22 08:08

I am noticing something strange in the capi today. Attempting to run a comment gives an assertion error:

src/common/types/column_data_collection.cpp:79: void duckdb::ColumnDataCollection::Initialize(vector<duckdb::LogicalType>): Assertion `!types.empty()' failed

DuckDB db(nullptr);
Connection con(db);
con.EnableQueryVerification();
con.SendQuery("--ups");

PedroTadim, Aug 29 '22 08:08

The issue in "Floating-point validation" is that the result is parsed back as a decimal after the ToString round trip. It is not immediately apparent to me how to solve this elegantly. I generally think it doesn't make sense to validate the query like this, since ToString is not invertible, and we should probably remove this check:

https://github.com/duckdb/duckdb/blob/0d2d7930d2789405a0d07a15e37485fe70faee3e/src/verification/statement_verifier.cpp#L76-L81

The same issue is present with maps, which share their literal syntax with structs and are parsed back as structs:

statement ok
PRAGMA enable_verification;

query I
SELECT map([1, 5], ['a', 'e'])
----
{1=a, 5=e}
INTERNAL Error: Assertion triggered in file "[...]/src/verification/statement_verifier.cpp" on line 80: parsed_list[0]->Equals(select_list[i].get())

I suspect it will also break for any extension functions that return custom logical type aliases.
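
The floating-point case can be mimicked with a toy model. The sketch assumes, purely for illustration, that a literal containing an exponent is typed as DOUBLE, that a literal with only a decimal point is typed as DECIMAL, and that the printer emits fixed notation. All names below are hypothetical and none of them exist in DuckDB.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

enum class LiteralType { INTEGER, DECIMAL, DOUBLE };

// Toy literal classifier (assumed rules): exponent -> DOUBLE,
// decimal point without exponent -> DECIMAL, otherwise INTEGER.
LiteralType classify_literal(const std::string &text) {
    if (text.find_first_of("eE") != std::string::npos) {
        return LiteralType::DOUBLE;
    }
    if (text.find('.') != std::string::npos) {
        return LiteralType::DECIMAL;
    }
    return LiteralType::INTEGER;
}

// Toy printer for DOUBLE values: fixed notation with one fractional digit.
std::string double_to_string(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%.1f", value);
    return std::string(buffer);
}

// Parses a DOUBLE literal, prints it, and checks whether the printed
// text would still be classified as DOUBLE when parsed again.
bool double_literal_survives_round_trip(const std::string &literal) {
    const double value = std::stod(literal);
    return classify_literal(double_to_string(value)) == LiteralType::DOUBLE;
}
```

Under these assumptions the literal `0E6` is classified as DOUBLE, prints as `0.0`, and is classified as DECIMAL on the way back, so a comparison between the original and the re-parsed expression fails even though the value is unchanged.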

Maxxen, Sep 01 '22 14:09

To help with coordination, I'm splitting this into multiple issues and will track them on this project. Please file separate issues if you find any more fuzzing bugs instead of updating this thread.

We should probably close this as well, as soon as all the existing PRs referencing it are closed.

Maxxen, Sep 02 '22 11:09

Sure I can create new issues from now on. One thing I mentioned on Discord is this issue: SELECT e'\xF0\x9F\x98\x84' = '😄'; On PostgreSQL this gives true, while on DuckDB I get a parser error. I would expect both representations to match.

PedroTadim, Sep 02 '22 12:09

We currently don't support the e'' syntax, so this is expected behaviour. Inconsistency with Postgres is not always a bug.

hannes, Sep 02 '22 14:09

So what's the way to input UTF-8 strings using hex notation?

PedroTadim, Sep 02 '22 14:09

This works: SELECT decode('\xF0\x9F\x98\x84'::BLOB) = '😄';

hannes, Sep 02 '22 14:09

Ok, thanks

PedroTadim, Sep 02 '22 14:09

This has been converted to a bunch of different issues.

pdet, Sep 14 '22 09:09