Testing CLI apps
(moderated summary by @epage)
Context
Common inputs to a CLI
- Files
- Command line flags
- Environment variables
- stdin
- signals
Common outputs of a CLI
- Files (sometimes unique, sometimes mutating the input)
- stdout
Plan of Attack
Testing crate(s) (see the sketch after this list)
- Make it easy to initialize a tempdir
  - Copy files over from the tests dir
  - "touch" files
  - dump a string to a file in the tempdir
- cmd assertions
  - stdout
  - exit code
- file system assertions
  - file exists
  - file content assertions
  - fixture dir is a subset of target dir, reporting the differences
  - fixture dir exactly matches target dir, reporting the differences
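As a rough sketch of how these building blocks might compose on top of what already exists (the tempfile crate plus std::process::Command), a test could look something like the following. The binary name myapp, the fixture paths, and the flags are placeholders; a real helper crate would wrap most of this boilerplate and locate the binary for you.

```rust
use std::fs;
use std::process::Command;

#[test]
fn produces_expected_output() {
    // Make it easy to initialize a tempdir (tempfile crate).
    let dir = tempfile::tempdir().expect("failed to create tempdir");

    // Copy a fixture file over from the tests dir.
    fs::copy("tests/fixtures/input.txt", dir.path().join("input.txt"))
        .expect("failed to copy fixture");

    // "touch" a file / dump a string to a file in the tempdir.
    fs::write(dir.path().join("config.toml"), "verbose = true\n")
        .expect("failed to write config");

    // cmd assertions: exit code and stdout.
    // The binary path is simplified here; a helper like assert_cli can take
    // care of locating the right binary.
    let output = Command::new("target/debug/myapp")
        .arg(dir.path().join("input.txt"))
        .arg("--output")
        .arg(dir.path().join("output.txt"))
        .output()
        .expect("failed to run the binary");
    assert!(output.status.success());
    assert!(String::from_utf8_lossy(&output.stdout).contains("done"));

    // File system assertions: file exists, file content.
    let out_file = dir.path().join("output.txt");
    assert!(out_file.exists());
    assert!(fs::read_to_string(&out_file)
        .expect("failed to read output")
        .contains("expected"));
}
```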
In-ecosystem resources
- tempfile
- dir-diff
- assert_cli
- cli_test_dir
  - Can serve as inspiration, but the scope for changing it based on input from this WG seems limited
  - From a deleted issue: "At this point, the API of cli_test_dir is unlikely to break backwards compatibility in any major ways. I'm happy to add new features to it that people need. And if anybody would like to re-export parts of it, I'm happy to do that, too."
Challenges
- Mocking
  - Should this be a priority?
  - For unit tests, we should probably encourage a higher-level abstraction over the file system that can instead be mocked
  - For integration and end-to-end tests, we should probably run against real resources
  - https://github.com/iredelmeier/filesystem-rs
  - https://github.com/twmb/rsfs
  - https://github.com/manuel-woelker/rust-vfs
- Symbolic links on Windows
- Testing colored output
- Testing signal handling
@killercup's original post
In the first meeting, we identified that testing CLI apps cross-platform is not trivial and that we want to improve the situation.
- Best practices, i.e. testing application code, and not duplicating tests from deps
- How to set up clean filesystem/etc. environments for testing (containers? chroot jails?)
  - tempfile helps
  - dir-diff helps with validation
- Mocking
Crates we can help:
- get assert_cli to 1.0 (@killercup and @epage)
- get dir-diff to 1.0 (@steveklabnik)
Along with dir-diff (for which I have a PR out to make it more customizable, with richer reporting), something I've been considering is a set of file system assertions, kind of like the expect_* methods on TestDir.
Interested in working on this :+1:
I will add a note here that testing your application with symbolic links (if relevant) is important, and doing it on Windows can be quite difficult. In one sense, it's hard because, unlike on Unix, you have two different methods for creating symbolic links (one for directories and another for files). The other part, and the one that has been a real pain point for me, is that Windows by default doesn't let you create symbolic links: you need to toggle a permission setting for it to work. I believe AppVeyor handles this for you, but if you're testing on a real Windows machine, you need to do it yourself.
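For illustration, this is roughly how the split looks in the standard library; a cross-platform test helper has to pick the right call per platform (a sketch, with link_dir as a made-up helper name):

```rust
use std::io;
use std::path::Path;

// Create a symlink to a directory. On Unix one function covers both files and
// directories; on Windows the caller must choose symlink_dir vs symlink_file,
// and the process needs permission to create symlinks at all.
#[cfg(unix)]
fn link_dir(src: &Path, dst: &Path) -> io::Result<()> {
    std::os::unix::fs::symlink(src, dst)
}

#[cfg(windows)]
fn link_dir(src: &Path, dst: &Path) -> io::Result<()> {
    std::os::windows::fs::symlink_dir(src, dst)
}
```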
We've been talking about integration testing for a bit on Gitter. I've created the assert-rs GitHub organization and moved assert_cli there.
Testing with FS access could be done with
- https://github.com/iredelmeier/filesystem-rs
- https://github.com/twmb/rsfs
- https://github.com/manuel-woelker/rust-vfs
So we have crates for emulating FS access in tests.
My current thoughts:
- Make it easy to initialize a tempdir
  - Copy files over from the tests dir
  - "touch" files
  - dump a string to a file in the tempdir
- file system assertions
  - file exists
  - file content assertions
  - fixture dir is a subset of target dir, reporting the differences
  - fixture dir exactly matches target dir, reporting the differences (see the dir-diff sketch below)
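For the last two assertions, dir-diff is the obvious starting point. Assuming its current is_different entry point, a thin assertion wrapper might look like this (assert_dirs_match is a hypothetical helper name):

```rust
use std::path::Path;

// Assert that the directory the program produced matches a checked-in fixture
// directory. dir_diff::is_different only reports *whether* they differ today;
// reporting *what* differs is the richer output discussed above.
fn assert_dirs_match(fixture: &Path, target: &Path) {
    let different = dir_diff::is_different(fixture, target)
        .expect("failed to compare directories");
    assert!(
        !different,
        "{} does not match fixture {}",
        target.display(),
        fixture.display()
    );
}
```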
File system mocking feels off to me. IMO, dealing with a file system means you are writing integration / end-to-end tests. Generally there is an application-specific abstraction that can sit above the file system and be mocked instead; this can work for both unit and integration tests. For end-to-end tests, you want to verify the actual file system interactions and shouldn't mock them.
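To make the "abstraction above the file system" idea concrete, here is one possible sketch; the trait and type names are hypothetical and would be application-specific in practice:

```rust
use std::collections::HashMap;
use std::io;
use std::path::PathBuf;

// Hypothetical application-level abstraction: the app asks for "the config"
// or "the cache" by name rather than touching arbitrary paths, so unit tests
// can swap in an in-memory implementation.
trait ConfigStore {
    fn load(&self, name: &str) -> io::Result<String>;
    fn save(&mut self, name: &str, contents: &str) -> io::Result<()>;
}

// Real implementation used in production and end-to-end tests.
struct FsStore {
    root: PathBuf,
}

impl ConfigStore for FsStore {
    fn load(&self, name: &str) -> io::Result<String> {
        std::fs::read_to_string(self.root.join(name))
    }
    fn save(&mut self, name: &str, contents: &str) -> io::Result<()> {
        std::fs::write(self.root.join(name), contents)
    }
}

// In-memory double for unit tests; no tempdirs needed.
#[derive(Default)]
struct MemStore {
    files: HashMap<String, String>,
}

impl ConfigStore for MemStore {
    fn load(&self, name: &str) -> io::Result<String> {
        self.files
            .get(name)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, name.to_string()))
    }
    fn save(&mut self, name: &str, contents: &str) -> io::Result<()> {
        self.files.insert(name.to_string(), contents.to_string());
        Ok(())
    }
}
```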
I've updated the original issue to try to summarize what we've come up with so far.
@killercup's tests for waltz are an experiment in a higher-level, more declarative way of writing tests for a program.
Our thought has been to create individual building blocks and then put them together into something kind of like what's in waltz's tests. The priority is on the building blocks.
A possibly controversial and further-down-the-road thing I've been considering is the need for various types of tempdirs: the program might be run within one, and you might need to pass others to the program as flags. Ideally, the test framework would track these tempdirs and close them at the end, ensuring they can be closed (a particular concern on Windows) and that they don't report errors.
This got me wondering if we'll want some form of string templating in this, so you can make a tempdir's path available as a variable that can then be referenced when constructing a flag.
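A minimal, hypothetical form of that templating could be plain placeholder substitution over the declared arguments; expand_args and the "{tempdir}" placeholder below are made up for illustration:

```rust
use std::path::Path;

// Hypothetical helper: replace "{tempdir}" in declared arguments with the
// real path of the tempdir the test framework created, so flags and
// assertions can refer to it without hard-coding paths.
fn expand_args(args: &[&str], tempdir: &Path) -> Vec<String> {
    args.iter()
        .map(|arg| arg.replace("{tempdir}", &tempdir.display().to_string()))
        .collect()
}

// Usage inside a test might look like:
// let dir = tempfile::tempdir()?;
// let args = expand_args(&["--output", "{tempdir}/report.json"], dir.path());
```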
Another issue with testing CLIs is correctly calculating code coverage. If you use std::process::Command to fork and execute a CLI, all existing code coverage tools (tarpaulin, gcov, kcov, etc.) will fail to calculate the statistics. There is currently no good way to properly calculate line coverage across both unit tests and integration tests.
@mssun That is a good point, but I'm expecting it won't be too much of an issue. End-to-end testing using assert_cli should, ideally, be reserved for the parts of the application that cannot be tested at a lower level. Coverage will be lower, and a developer will need to be aware of what is or isn't being tested because of these end-to-end tests.
If there is an enterprising person who has an idea of how to solve this, great! Otherwise, we should call it out as a known issue in assert_cli's documentation.
Ideally, we could use gcov to solve this issue, as described in https://jbp.io/2017/07/19/measuring-test-coverage-of-rust-programs.html. However, it is not as easy as I thought.
It is impossible to build both the binary and the tests with only the -Copt-level=1 -Clink-dead-code -Ccodegen-units=1 -Zno-landing-pads -Cpasses=insert-gcov-profiling -L/Library/Developer/CommandLineTools/usr/lib/clang/9.1.0/lib/darwin/ -lclang_rt.profile_osx flags, because a command like cargo rustc --test tests -vv -- -Copt-level=1 -Clink-dead-code -Ccodegen-units=1 -Zno-landing-pads -Cpasses=insert-gcov-profiling -L/Library/Developer/CommandLineTools/usr/lib/clang/9.1.0/lib/darwin/ -lclang_rt.profile_osx only applies those flags to the test binary.
Using the RUSTFLAGS environment variable won't solve this problem either: it adds the compilation flags to all crates, which generates gcda and gcno files for unrelated crates in the sysroot. This may also trigger bugs in gcov when merging the gcno files (I still don't know why).
Overall, I think the test coverage issue involves many parties: cargo, rustc, coverage tools, etc.