sqllogictest-rs
sqllogictest-rs copied to clipboard
idea: add syntax for session
Use case: a user wants to setup a dataset (e.g., tpch), then tests different set of queries. What's more, they want to test it under different session configurations. e.g., https://github.com/singularity-data/risingwave/issues/3629#issuecomment-1174502552
Previously, they will:
# A.slt
include prepare.slt.part
set CONFIG=A;
include test1.slt.part
include test2.slt.part
# B.slt
include prepare.slt.part
set CONFIG=B;
include test1.slt.part
include test2.slt.part
The biggest problem is that the tests cannot be parallized.
I came up with another syntax:
include prepare.slt.part
session {
statement ok
set CONFIG=A;
include test1.slt.part
}
session {
statement ok
set CONFIG=A;
include test2.slt.part
}
session {
statement ok
set CONFIG=B;
include test1.slt.part
}
session {
statement ok
set CONFIG=B;
include test2.slt.part
}
In this way, we can have:
- parallism
- isolated session config. This also solves another problem: config side-effects in a subtest
.slt.part
https://github.com/risinglightdb/sqllogictest-rs/issues/55 - isolated failure
- reuse test cases
session
can be alternatived called as e.g., subtest
or run
.
I strongly recommend to keep things simple 😇 How about just use one session for one .slt
file? Users can split tests into multiple files if necessary.
I strongly recommend to make things simple... How about just use one session for one file?
They are different problems. #85
So let's do https://github.com/risinglightdb/sqllogictest-rs/issues/85 first?
Users can split tests into multiple files if necessary.
include
's semantic is just copy-paste.That sounds intuitive to me (and other users, I think). I don't think you need do anything else for it.
Yes, but when we want to split tests, we failed to do so because we also want to reuse tests...
And test cases should be as short as possible. I believe current RisingWave’s batch test is a bad example. Include should be used to include things in common, instead of including new test cases. This is where the confusion originates: each test case share a different session level setting. By the way, parallelism is exploited on file level, which means that sqllogictest cannot parallelize RisingWave’s batch test.
Originally posted by @skyzh in https://github.com/risinglightdb/sqllogictest-rs/issues/55#issuecomment-1172832193
Originally posted by @skyzh in https://github.com/singularity-data/risingwave/issues/3629#issuecomment-1173869684
The reason we include test cases is that we want to test both cases for local and distributed mode. I'm not aware of better ways to avoid much code duplication. Any suggestion is welcome.
Originally posted by @liurenjie1024 in https://github.com/singularity-data/risingwave/issues/3629#issuecomment-1174502552
So this issue proposes another thing (which describes subtests, instead of general copy-paste) other than include
for this use case. To be honest, I also think it's bit of complicated, but can't come up with another way 😇. Anyway it's just an idea.
Risingwave batch local_mode.slt
and distribution_mode.slt
includes ALL subtests. It seems that we can't have any improvements without additional mechanism 🤪
crdb's sqllogictest has a subtest
https://github.com/cockroachdb/cockroach/blob/82e2a30c5be8046dead49348722a2429d5c29290/pkg/sql/logictest/logic.go#L377-L380
It's like
subtest a
...
subtest end (optional)
subtest b
...
BTW they don't have include
Hey there! I'd like to share an idea that could potentially help you achieve side-effect isolation.
In Datafusion we implemented a runner and a Postgres client, that assume:
- there is one connection per test file - this is the same as in the
sqllogictest-rs
Postgres implementations. - each connection overrides the Postgres'
search_path
.search_path
tells in which schemas to find database objects. The schema name is based on the test file name.
Having a schema per unit of test execution (a file name, a session block, etc.) would at least prevent parallel tests from trying to perform DDL operations on the same db objects.