Create More Complete End-to-End Testing Framework
Currently our tests are mostly unit tests, and they touch so much of the API surface that when other parts change, a lot of the unit tests have to be updated, which makes them quite brittle. One example of this is the tests for metrics, such as the session metrics that measure bytes. These tests manually construct sessions, but those details aren't relevant to what we're testing; we'd get the same value from just sending a request and then looking at the Prometheus statistics after the fact.
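To illustrate the kind of test I mean, here's a rough sketch (not existing Quilkin code; `spawn_test_proxy`, its fields, and the metric name are hypothetical stand-ins for whatever helpers the framework would provide):

use tokio::net::UdpSocket;

#[tokio::test]
async fn session_bytes_metric_counts_forwarded_packets() {
    // Hypothetical helper: starts a proxy pointed at an echo endpoint and exposes
    // the proxy port plus the admin/metrics port.
    let proxy = spawn_test_proxy().await;

    // Drive the behaviour through the public surface only: send one packet.
    let socket = UdpSocket::bind("127.0.0.1:0").await.unwrap();
    socket.send_to(b"hello", proxy.packet_addr).await.unwrap();
    let mut buf = [0u8; 1024];
    let (len, _) = socket.recv_from(&mut buf).await.unwrap();
    assert_eq!(&buf[..len], b"hello");

    // Assert on the scraped Prometheus output instead of poking at sessions directly.
    // (Metric name is illustrative, not Quilkin's actual metric.)
    let metrics = reqwest::get(format!("http://{}/metrics", proxy.admin_addr))
        .await
        .unwrap()
        .text()
        .await
        .unwrap();
    assert!(metrics.contains("session_tx_bytes"));
}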
I want to expand beyond our current ad-hoc helper functions in `test-utils` and have an easier, more declarative way to write our network tests, one that requires less boilerplate and only touches the API surface being tested. I'm not sure yet what the output of this would be, but I think we should at least start with: what about Quilkin's behaviour do we want to test, and what do we want to ensure is correct?
Some starting examples:
- Metrics being sent and being correct.
- Packets completing a full RTT.
- Packets being dropped when conditions are met.
- A filter's specific behaviour.
100% agree. This has also been on my mind as of late.
I definitely also want to make sure our built container does not crash in its default setup (although it won't be able to process packets), and that it will also be able to accept a config file and process packets with that config file.
So for our end-to-end tests, another thing that came to mind is that people who write filters are also going to need a way to reliably test their filters and configs, and it would be nice to remove Rust as a requirement for those tests, since not all of our users will be Rust developers. So I was thinking: what if we add a new subcommand (`quilkin test`) that runs a single RTT of a configuration, so you can be confident that everything at least works? In addition we'd add a `test:` key to the root of the config, that provides some test-specific config data such as input.
Ooh, `quilkin test` is an interesting idea! I usually use `nc` and `ncat` to do this manually, but (a) not everyone is on *nix and (b) it can be annoying to set up.
I think you mention two users here, so I wanted to explore that further:
- The user that is writing a config, and wants to test the config against a release of Quilkin.
- The user that writes their own Filter and wants to test the Filter, while writing the least amount of Rust they have to (maybe they aren't comfortable with it, or are exploring a solution).
I feel like `quilkin test` would be extremely useful for No. 1, as well as No. 2 -- so we might want to think through both as separate end users, and make sure we can cover both scenarios.
I would posit that No. 2 will eventually feel more comfortable with Rust as time goes on, and our e2e tooling will probably work for them, but having a lower on-ramp for bringing people on is always a good thing in my book.
I feel like quilkin test would be extremely useful for No. 1, as well as No. 2 -- so we might want to think through both as separate end users, and make sure we can cover both scenarios.
Definitely.
I would posit that No. 2 will eventually feel more comfortable with Rust as time goes on, and our e2e tooling will probably work for them, but having a lower on-ramp for bringing people on is always a good thing in my book.
Yeah, also important to keep in mind that custom filter writers also have access to Rust for testing, so they can unit test their library's behaviour, and `quilkin test` would be more of an integration-style test.
Is it weird for a proxy binary to have a test command (are there any tools/systems that do something similar)? With `quilkin test` it feels that way to me, but maybe that's because I don't really know what this entails. I'm assuming whoever writes a custom filter in Rust can unit test their code, so it's definitely nice to have a reusable e2e test, but I would've thought that, similar to how Quilkin itself would do e2e tests, this would be done external to Quilkin itself.
Is it weird for a proxy binary to have a test command (are there any tools/systems that do something similar)?
Yes, `nginx` has the `-t` flag that tests the configuration.
-t : test configuration and exit
-T : test configuration, dump it and exit
I'm assuming whoever writes a custom filter in rust can unit test their code.
Yes, but this isn't just for people writing their own custom filters, it's for anyone who wants to check if a Quilkin configuration is valid, and those people can't or won't always be writing Rust code. You should be able to test that things work without code IMO. This makes our tests a lot easier, provides a nicer dev experience, and users could also use this in their CI to make sure their checked in configuration is always valid.
By having two separate paths, `quilkin-run` and `quilkin-test`, we can also provide different log levels by default, to show more debug information while you're testing a config than you would want to show when running in production, providing a nicer edit-test-debug cycle.
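As a rough sketch of what that could look like (this is not Quilkin's actual CLI code; the subcommands and logging setup are assumptions, using clap and tracing-subscriber):

use clap::{Parser, Subcommand};
use tracing_subscriber::EnvFilter;

#[derive(Parser)]
struct Cli {
    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Run the proxy with production defaults.
    Run,
    /// Run the tests in a configuration, with more verbose defaults.
    Test,
}

fn main() {
    let cli = Cli::parse();
    // Pick a noisier default log level when testing a config, but still let
    // RUST_LOG override it either way.
    let default_level = match cli.command {
        Command::Run => "info",
        Command::Test => "debug",
    };
    let filter = std::env::var("RUST_LOG").unwrap_or_else(|_| default_level.to_owned());
    tracing_subscriber::fmt()
        .with_env_filter(EnvFilter::new(filter))
        .init();
    // ... dispatch to the run/test implementations here.
}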
I see, but `nginx -t` isn't similar to what's being proposed with `quilkin test`? That only checks that the config is valid (which quilkin already always does).
My understanding is we rather want `quilkin test` to take a config, start the proxy and (potentially multiple?) UDP servers, run some input packets through them, then do some assertions? That sounds more like a test framework. If so, then that's what I think belongs outside of the quilkin binary (and I'm curious as to what other systems do this). I get that it's nice to have a ready-made tool for users to test with; my concern is how/why that belongs in the quilkin binary.
I think let's start with some use cases - what it is we would want to do - i.e. what do you need to manually test a filter.
This is what I usually have when I do this:
- A UDP server that will echo back packets (ideally, be able to print out packets that are received and sent back, so you can see what is happening)
- (Similar to `nc -u`) an interactive way to write the text of a packet that is going to be sent (may not work for all types of scenarios, but is probably a good start), and then see what packet is received.
I'm basically thinking of a platform-agnostic version of our ncat/netcat quickstart - which is currently tied to *nix.
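For reference, a minimal sketch of such an echo server (assuming tokio; the address just matches the examples in this thread):

use tokio::net::UdpSocket;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let socket = UdpSocket::bind("127.0.0.1:4321").await?;
    let mut buf = [0u8; 1500];
    loop {
        let (len, addr) = socket.recv_from(&mut buf).await?;
        // Print what came in so you can see what's happening, then echo it back.
        println!("received {:?} from {addr}", String::from_utf8_lossy(&buf[..len]));
        socket.send_to(&buf[..len], addr).await?;
        println!("echoed {len} bytes back to {addr}");
    }
}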
I'd suggest, let's start simple - how do we test that a packet goes to an endpoint, and gets returned to sender (let's not worry about Filters yet).
What would that look like?
I see two possible paths:
Everything happens in one process, something like:
$ quilkin -t -f config.yaml # starts quilkin in test mode with a standard config. Validates config, as per normal.
Starting Quilkin in test mode!
.... # standard quilkin output logging here
Opening interactive shell:
Send Packet: | # interactive shell
Send Packet: test # write "test", hit enter, and it sends a packet with "test" as bytes to the quilkin process that is already started.
Echo: Received the packet "test", sending back!
Received Packet! "test" ✔️
Requires two separate processes.
In this mode `quilkin -t` will fire up the echo server and the interactive terminal, but the user would need to set up a separate quilkin instance to connect to.
This would enable you to set up multiple layers of quilkin instances if you wanted (client/server for example) - but would be harder to get started with.
Have the option to choose
Initial version is a single process, but we provide a flag option to run Quilkin in a separate process if you so desire?
Opinion: I would probably start with a single process, and then we can expand from there depending on what people need?
Question: Regardless of which version we pick, I'm trying to work out how we specify where the UDP endpoint is going to be. Maybe that needs to also be an argument? (Maybe you can have multiple echo endpoints if you want them?)
Other thoughts: This is probably easier once we do #317; it might be good to output debug logging from the quilkin proxy (or provide an option for the logging level in either the command line or the config yaml).
Also, we should add emoji to the output 😄 🦔
How does that align with everyone's expectations?
Also as a note: I think this issue has merged two things that don't necessarily overlap:
- We need better e2e testing across our container builds and our binary builds. For example, I missed in review that #350 included a change, unrelated to the PR, from `-f/--filename` to `-c/--config`. This broke all our docs as well as our container images. This needs to be caught in CI in some way, shape, or form.
- How do we build good tooling for our end-user developers so they have a good onboarding experience (e.g. `quilkin test`), which may or may not aid us in our overall e2e testing experience in CI.
Should we split these things into separate tickets? Given that this ticket has gone deep into `quilkin test`, I recommend that we rename this to `quilkin test`, and I'll start a new ticket with a checklist of CI improvements.
Sound good?
I see, but nginx -t isn't similar to what's being proposed with quilkin test? That only checks that the config is valid (which quilkin already always does).
Yes, but `quilkin-run` (what I'm calling the default mode) isn't suitable in CI environments. I want to be able to add checks to our CI that ensure that all Quilkin configs are valid.
My understanding is we rather want quilkin test to take a config, start the proxy and (potentially multiple?) UDP servers, run some input packets through them, then do some assertions? That sounds more like a test framework. If so, then that's what I think belongs outside of the quilkin binary (and I'm curious as to what other systems do this). I get that it's nice to have a ready-made tool for users to test with; my concern is how/why that belongs in the quilkin binary.
Well, it sounds like a test framework because it is, but to be clear, I'm not interested in providing a general UDP testing framework (at least not to start); the focus is on being able to validate and test the behaviour of a config. That would mean testing things like packet contents, but it would also include checking metrics and other Quilkin-specific behaviour. The existing general UDP testing space is pretty bad; the most popular tool I know of is TTCN-3, and we don't want to use that.
The reason to include it in the binary is because we want to be able to test configurations outside the repository, and I feel that the only thing you should need to test Quilkin is Quilkin itself; if you make testing hard, people won't test. In this respect it's more like other CLI tools such as `mdbook` or `zola`, which have built-in testing capabilities for their projects.
How does that align with everyone's expectations?
Okay, well an interactive shell definitely wasn't what I was thinking of, not that I'm against it; in fact I think it would be a cool addition to this (`quilkin test -i/--interactive`?). But for this issue the `quilkin test` I had in mind is more like `cargo test`: it runs through a series of tests defined in your configuration and outputs the result, which you could run in your CI. For example, here is what the greet tutorial would look like with a basic test. You'd run this with `quilkin test -c quilkin.yaml`.
version: v1alpha1
proxy:
port: 7001
static:
filters:
- name: greet.v1
endpoints:
- address: 127.0.0.1:4321
tests:
hello:
input: Quilkin
output: Hello Quilkin
Running 1 Test:
hello … success ✅
All tests completed successfully.
there was a change, unrelated to the PR, from -f/--filename to -c/--config. This broke all our docs as well as our container images. This needs to be caught in CI in some way, shape, or form.
`assert-cli` is pretty useful for these kinds of tests. I'd probably use it in conjunction with `quilkin-test` for our tests so that everything could be run through `cargo test`, but it would be good to use it regardless of this.
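For example, a hedged sketch of that kind of check (using `assert_cmd`, the maintained successor to `assert-cli`; the flag is just the `-c/--config` one discussed above):

use assert_cmd::Command;

#[test]
fn help_lists_the_config_flag() {
    // Would have caught the -f/--filename -> -c/--config rename breaking docs and images.
    Command::cargo_bin("quilkin")
        .unwrap()
        .arg("--help")
        .assert()
        .success()
        .stdout(predicates::str::contains("--config"));
}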
The reason to include it in the binary is because we want to be able to test configurations outside the repository, and I feel that the only thing you should need to test Quilkin is Quilkin itself; if you make testing hard, people won't test. In this respect it's more like other CLI tools like mdbook or zola, which have builtin testing capabilities for their projects.
That makes sense! I also agree with making this suitable to run in CI. I think we should have a separate config for tests, however? So in the example, the part describing the test stuff would be in a different file:
# test.yaml
tests:
hello:
input: Quilkin
output: Hello Quilkin
e.g. `quilkin test -c quilkin.yaml --spec test.yaml`. The reason for this is that one goal of the static config is to be as simple as possible, and I feel the test stuff will only grow in complexity (at least it can evolve separately from the quilkin config). That way users who only, e.g., operate quilkin (which will be the majority) don't need to care about test-related things when using the config.
Hmm, I do agree that we probably want to have a way for the tests to exist separately, because you might have a complex configuration. I'm not sure what the best way to represent that is at the moment though, because I think there are a couple of interesting ways this could be structured, so I will need to think about that more. For example, we could have the test point to the configuration it's testing, e.g.
# test.yaml
config: ./quilkin.yaml # this could be set by default
tests:
hello:
input: Quilkin
output: Hello Quilkin
Another interesting property of YAML is that you can have multiple documents in the same file, so we could also have them in separate files but allow you to join them for simple configurations and tutorials, like so:
version: v1alpha1
proxy:
port: 7001
static:
filters:
- name: greet.v1
endpoints:
- address: 127.0.0.1:4321
---
tests:
hello:
input: Quilkin
output: Hello Quilkin
Oooh! I like where this is going 👍🏻
Okay, well an interactive shell definitely wasn't what I was thinking of, not that I'm against it; in fact I think it would be a cool addition to this (quilkin test -i/--interactive?)
Agreed. I'm now having interesting thoughts of being able to convert an interactive session into a test yaml.... 🤔 but that's long term for sure. 😄
I think I'm leaning more towards:
# test.yaml
config: ./quilkin.yaml # this could be set by default
tests:
hello:
input: Quilkin
output: Hello Quilkin
It's a bit more work, but it does two things:
- Enables you to point to configs that could be used for anything, like aiming at a config in an example folder, but keep the test in a separate place. This would be a nice way to ensure example configs stay functional in our CI.
- Reuse quilkin configs across some/all of the tests in a single test file ("suite"?), which is nice if you want to. You can do this with the multiple-document definition too, but I think the `config:` option provides more flexibility.
To continue on the multiple documents idea - one `yaml` file could essentially be a test suite that looks something like:
# test.yaml
config: ./quilkin.yaml
tests:
hello:
input: Quilkin
output: Hello Quilkin
---
config: ./quilkin.yaml
tests:
hello:
input: Lorem
output: Hello Lorem
---
config: ./quilkin-suffix.yaml
tests:
hello:
input: Lorem
output: Lorem Hello
Another question: input and output - should it be text, or base64 of bytes? Just thinking about how to test compression, for example. (Or maybe have both options: either use `input` or `inputBytes`, the first being a string, the second being bytes?)
Or, when working with a filter like compression, which is usually done in combination with another proxy, would you have a way to chain two running proxies to each other? Or maybe the way to test it is to have a compress with a decompress filter in the same Quilkin instance and see if the data going in is the same as what comes out?
Looking at this more, would I be correct that if you wanted multiple tests per config, you would actually do the following:
config: ./quilkin.yaml
tests:
hello:
input: Quilkin
output: Hello Quilkin
lorem:
input: Lorem
output: Hello Lorem
So maybe the multiple document + sharing configs might not be as useful?
So maybe the multiple document + sharing configs might not be as useful?
Well, I think we want to be able to support multiple documents each containing multiple tests, maybe just not through that syntax. But for our own use case I'd like to have the following layout, for example, where each file tests a filter and contains multiple tests for that filter.
tests
├── compress.yaml
├── concatenate_bytes.yaml
├── load_balancer.yaml
On the `config:` key, how would people feel about the schema being that `quilkin-test` accepts both of the proposed layouts? Either you provide a `config:` key in `TestConfig`, or you include a `Config` directly followed by a `TestConfig` in the YAML, and Quilkin will figure it out. I think this would provide the best user experience: for some things I want to create a configuration specifically for testing and I want to be able to have the tests in a single file, otherwise there's file sprawl. But I also see utility in being able to split the tests out, as you can have multiple sets of tests for the same configuration, which is useful if you have tests for different environments in production, for example, or for separating particularly intensive or long tests.
Another question: input and output - should it be text, or base64 of bytes? Just thinking about how to test compression, for example. (Or maybe have both options: either use input or inputBytes, the first being a string, the second being bytes?)
I think we always want to accept multiple options, and differentiate based on what type is provided, as I find that to be the most intuitive (GitHub Actions is a big inspiration for me for how to design configuration). So `input` could accept three options: a `string`, which is sent as plain text; a `Sequence<Integer>`, which is sent as bytes; and an `object`, where we can provide further encodings such as base64.
version: v1alpha1
proxy:
port: 7001
static:
filters:
- name: greet.v1
endpoints:
- address: 127.0.0.1:4321
---
tests:
text:
input: Quilkin
output: Hello Quilkin
binary:
input: [81, 117, 105, 108, 107, 105, 110]
output: [72, 101, 108, 108, 111, 32, 81, 117, 105, 108, 107, 105, 110]
base64:
input:
data: UXVpbGtpbg==
type: base64
output:
data: SGVsbG8gUXVpbGtpbg==
type: base64
mixed:
input:
data: Quilkin
type: text
output:
data: SGVsbG8gUXVpbGtpbg==
type: base64
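For what it's worth, here's a hypothetical sketch of how that "differentiate by the value's type" idea could be modelled on the Rust side with serde (the names are illustrative, not actual Quilkin types):

use serde::Deserialize;

#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum PacketData {
    /// `input: Quilkin` -- a plain string, sent as UTF-8 text.
    Text(String),
    /// `input: [81, 117, ...]` -- a sequence of integers, sent as raw bytes.
    Bytes(Vec<u8>),
    /// `input: { data: ..., type: base64 }` -- an object carrying an encoding.
    Encoded { data: String, r#type: Encoding },
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "lowercase")]
enum Encoding {
    Text,
    Base64,
}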
Or, when working with a filter like compression, which is usually done in combination with another proxy, would you have a way to chain two running proxies to each other? Or maybe the way to test it is to have a compress with a decompress filter in the same Quilkin instance and see if the data going in is the same as what comes out?
I don't know if we can/should chain two proxies, as we've designed the main Quilkin process to always sit in the middle of a connection, whereas here we're sitting at the end of the connection, so making two Quilkins behave nicely in a feedback loop in the test runner seems difficult without adding a bunch of specific code. I was thinking though that we would spawn just the `Upstream` component for the test runner's loopback server, so you could have `write` filters that handle the data and perform checks on the server end. This is also where #13 would be really useful, as you could write more complex filter tests inside those tests rather than us needing to add configuration options for every kind of test.
So I think I'd actually much rather we pass an explicit config file. One reason is that it's probably simpler to keep test config entirely separate from proxy config (it really only describes a test case), but mainly because, from firsthand experience writing bazel tooling (and I suspect most other build tools too?) for these types of things, it's much simpler to pass a config file via CLI arguments than to script your way into inserting its location in another file. The reason is that config files are almost always autogenerated in turn (e.g. by child bazel targets) and their filepath isn't known until runtime, so instead of writing logic to inject that final path into the test.yaml at runtime (which usually involves hacks), we simply pass the generated filepath to `quilkin test`. Also, this allows us to reuse the same test.yaml for multiple proxy configs rather than generating new files with the same content but a different filepath for each proxy config.
On the config: key, how would people feel about the schema being that quilkin-test accepts both of the proposed layouts?
That works for me.
I think we always want to accept multiple options, and differentiate it based on what type is provided, as I find that to be the most intuitive (GitHub Actions is a big inspiration for me for how to design configuration). So input could accept three options, a string which is sent as plain text, Sequence<Integer> which is sent as bytes, and an object where we can provide further encodings such as base64.
Love it. That sounds perfect.
I don't know if we can/should chain two proxies, as we've designed the main Quilkin process to always sit in the middle of a connection, whereas here we're sitting at the end of the connection, so making two Quilkins behave nicely in a feedback loop in the test runner seems difficult without adding a bunch of specific code
We kind of have this in our integration tests, but it may not be easy to pull out. I would also argue that it's better to start with a single Quilkin instance, and then expand as we find more use cases / hit pain points. 👍🏻 So I'm on board for a single instance of Quilkin in a test.
This is also where #13 would be really useful as you could write more complex filter tests inside those tests rather than us needing to add configuration options for every kind of test.
Oooh, that's a really interesting idea! I like it.
So I think I'd actually much rather we pass an explicit config file, one reason being it's probably simpler to keep test config entirely separate from proxy config
When you say this, do you mean pass it into the quilkin binary as a separate argument, so something like `quilkin test -c config.yaml --tests test.yaml`?
I'm hesitant on this just because it's so convenient to be able to go `quilkin test -f filter-tests.yaml` and it runs 27 tests. On top of that, if we wanted to get really fancy, we could steal from `kubectl` (docs) and be able to do a `quilkin test -c ./tests/` such that if you point at a directory it runs the tests in all the yaml files it finds in that directory, which would be extra convenient, especially for building out CI.
I think that would be hard to do with a format where we have separate `quilkin-test` arguments for the config and the tests.
reason being config files are almost always autogenerated in turn (e.g by child bazel targets) and their filepath isn't known until runtime as a result
Two thoughts on this:
- If you can include a config at the top of a test yaml, then there are no filepaths. So that should solve that problem in that format I think.
- In the alternative format where we have a `config:` key, as long as we support relative paths, this should be okay? Since the generator doesn't need to know the full path, just how the files exist in relation to one another?
it's much simpler to be able to pass a config file via CLI arguments vs scripting your way into inserting its location in another file - the reason being config files are almost always autogenerated in turn (e.g. by child bazel targets) and their filepath isn't known until runtime, so instead of writing logic to inject that final path into the test.yaml at runtime (which usually involves hacks), we simply pass the generated filepath to quilkin test. Also this allows us to reuse the same test.yaml for multiple proxy configs rather than generating new files with the same content but a different filepath for each proxy config.
@iffyio I'm not entirely sure what you're referring to, are you talking about how we'd handle decoding the config? It would look like this, it's not very hacky if you ask me.
use serde::Deserialize;

struct TestConfig {
    config: Config,
    test: TestOptions,
}

impl TestConfig {
    fn from_str(src: &str) -> Result<Self> {
        // Either the whole document is a `TestOptions` with a `config:` key pointing
        // at a separate file, or it's a `Config` document followed by a `TestOptions`
        // document in the same file.
        Ok(if let Ok(test) = serde_yaml::from_str::<TestOptions>(src) {
            let path = test.config.as_ref().ok_or(Error::MissingKey { key: "config" })?;
            let config = serde_yaml::from_str(&std::fs::read_to_string(path)?)?;
            Self { config, test }
        } else {
            let mut documents = serde_yaml::Deserializer::from_str(src).into_iter();
            let config = Config::deserialize(documents.next().ok_or(Error::MissingConfig)?)?;
            let test = TestOptions::deserialize(documents.next().ok_or(Error::MissingTestOptions)?)?;
            Self { config, test }
        })
    }
}
@markmandel I'll answer those questions after the weekend, I just wanted to post this because I had left it open but never hit send.
When you say this, do you mean pass it into the quilkin binary as a separate argument, so something like: quilkin test -c config.yaml --tests test.yaml? I'm hesitant on this just because it's so convenient to be able to go quilkin test -f filter-tests.yaml and it runs 27 tests. On top of that, if we wanted to get really fancy, we could steal from kubectl (docs) and be able to do a quilkin test -c ./tests/ such that if you point at a directory it runs the tests in all the yaml files it finds in that directory, which would be extra convenient, especially for building out CI.
Yes, I meant passing the config.yaml as a separate argument. That shouldn't change anything in terms of what's possible? We can still run as many tests or pass directories etc. in the same command. And since we already have default filepaths/env vars, for most cases the command to run will simply be e.g. `quilkin test -f filter-tests.yaml`, which will automatically use the quilkin.yaml it finds on disk, so tying the test spec to the config path seems less desirable.
I'm not entirely sure what you're referring to, are you talking about how we'd handle decoding the config? It would look like this, it's not very hacky if you ask me.
Ah no, not referring to anything on the quilkin impl side. Rather it's more: how do I automate the e2e tests? We'll have different proxy configs per cluster but want to reuse the same tests. The two issues I see with having a config field in test.yaml are:
- how do I reuse the test.yaml file? (since it can only point to one file at a time and I wouldn't want to keep an identical copy per cluster)
- proxy configs themselves aren't generated until runtime (i.e. when it's time to run the test), so I don't have a filepath to provide in my test.yaml on disk. That's where the hack comes in, because I'd have to write some bazel code that injects each cluster-specific config filepath into each cluster's test.yaml (and I'd also have to generate a test.yaml copy at runtime for each cluster, which adds more code).
By separating the test.yaml from the config path, I can reuse the same test file and avoid both issues
By separating the test.yaml from the config path, I can reuse the same test file and avoid both issues
Sure, I think that's valid, which is why I think it should accept both. I think the single-file use case is a bit separate from that; the thing I want to avoid is file sprawl, where too many chunks are scattered through the codebase. While there is a case for having the same tests and re-using them, there's also a need for test-specific configuration and configuration-specific tests, such as when you're testing a filter. One example that comes to mind is the compression filter. If we add a second compression algorithm (like zlib) we'll need separate tests from the snappy tests, because zlib-compressed data won't look like snappy-compressed data. So if we only have the multiple-files option, to test the compression filter you'd need four files: `snappy.yml`, `zlib.yml`, `snappy-test.yml`, `zlib-test.yml`, whereas allowing the tests to be embedded makes this two files.
In the alternative format where we have a config: key, as long as we support relative paths, this should be okay? Since the generator doesn't need to know the full path, just how the files exist in relation to another?
Yeah, we shouldn't need to touch or transform the input path, just read it, and print out what the path was if it failed.
Yeah, I think supporting both a filepath as a CLI arg and in the test.yaml sounds reasonable to me!
If you can include a config at the top of a test yaml, then there are no filepaths. So that should solve that problem in that format I think. In the alternative format where we have a config: key, as long as we support relative paths, this should be okay? Since the generator doesn't need to know the full path, just how the files exist in relation to another?
For 1) that makes it impossible to reuse the same test for different configs? I wouldn't want to write a new config for each test file I have. For 2) hardcoding relative paths like ../../... doesn't work too well because they're generally flaky; as you mention, they're tied to the current target that generates the file and where it puts the output, so things can break when that target changes, e.g. if someone wraps it with some logic or uses a different target entirely, the output path will be different and we'd need to revisit each test file, which could be painful. If we support both, at least each setup gets to choose what format to use.
- that makes it impossible to reuse the same test for different configs?
You can make multiple tests for the same config in the same file, but not across files.
This would be the same as if you were specifying the config on the CLI separately from the test.
Random thought - what if we reused `proxy.id` within an inline config to be able to reuse / have multiple configurations within a single file?
Something like:
version: v1alpha1
proxy:
port: 7001
id: greet
static:
filters:
- name: greet.v1
endpoints:
- address: 127.0.0.1:4321
---
version: v1alpha1
proxy:
port: 7001
id: farewell
static:
filters:
- name: farewell.v1
endpoints:
- address: 127.0.0.1:4321
---
tests:
config: greet
text:
input: Quilkin
output: Hello Quilkin
binary:
input: [81, 117, 105, 108, 107, 105, 110]
output: [72, 101, 108, 108, 111, 32, 81, 117, 105, 108, 107, 105, 110]
---
tests:
config: farewell
base64:
input:
data: UXVpbGtpbg==
type: base64
output:
data: SGVsbG8gUXVpbGtpbg==
type: base64
mixed:
input:
data: Quilkin
type: text
output:
data: SGVsbG8gUXVpbGtpbg==
type: base64
(This would be optional; the default `config` would be the one specified at the top.)
You still can't do sharing across test files, but it reduces sprawl if the various config options are related for a test, and you don't want to push a config to an external file. WDYT?