Testing CLI apps
(moderated summary by @epage)
Context
Common inputs to a CLI
- Files
- Command line flags
- Environment variables
- stdin
- signals
Common outputs of a CLI
- Files (sometimes unique, sometimes mutating the input)
- stdout
Plan of Attack
Testing crate(s) (see the sketch after this list)
- Make it easy to initialize a tempdir
  - Copy files over from the tests dir
  - "touch" files
  - dump a string to a file in the tempdir
- cmd assertions
  - stdout
  - exit code
- file system assertions
  - file exists
  - file content assertions
  - fixture dir is a subset of target dir, reporting the differences
  - fixture dir exactly matches target dir, reporting the differences
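As a rough sketch of how these building blocks might compose on top of what already exists (the tempfile crate plus std::process::Command), a test could look something like the following. The binary name myapp, the fixture paths, and the flags are placeholders; a real helper crate would wrap most of this boilerplate and locate the binary for you.

```rust
use std::fs;
use std::process::Command;

#[test]
fn produces_expected_output() {
    // Make it easy to initialize a tempdir (tempfile crate).
    let dir = tempfile::tempdir().expect("failed to create tempdir");

    // Copy a fixture file over from the tests dir.
    fs::copy("tests/fixtures/input.txt", dir.path().join("input.txt"))
        .expect("failed to copy fixture");

    // "touch" a file / dump a string to a file in the tempdir.
    fs::write(dir.path().join("config.toml"), "verbose = true\n")
        .expect("failed to write config");

    // cmd assertions: exit code and stdout.
    // The binary path is simplified here; a helper like assert_cli can take
    // care of locating the right binary.
    let output = Command::new("target/debug/myapp")
        .arg(dir.path().join("input.txt"))
        .arg("--output")
        .arg(dir.path().join("output.txt"))
        .output()
        .expect("failed to run the binary");
    assert!(output.status.success());
    assert!(String::from_utf8_lossy(&output.stdout).contains("done"));

    // File system assertions: file exists, file content.
    let out_file = dir.path().join("output.txt");
    assert!(out_file.exists());
    assert!(fs::read_to_string(&out_file)
        .expect("failed to read output")
        .contains("expected"));
}
```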
In-ecosystem resources
- tempfile
- dir-diff
- assert_cli
- cli_test_dir
  - Can serve as inspiration, but the scope for changing it based on input from this WG seems limited
  - From a deleted issue: "At this point, the API of cli_test_dir is unlikely to break backwards compatibility in any major ways. I'm happy to add new features to it that people need. And if anybody would like to re-export parts of it, I'm happy to do that, too."
Challenges
- Mocking
  - Should this be a priority?
  - For unit tests, we should probably encourage a higher-level abstraction over the file system that can instead be mocked
  - For integration and end-to-end tests, we should probably run against real resources
  - https://github.com/iredelmeier/filesystem-rs
  - https://github.com/twmb/rsfs
  - https://github.com/manuel-woelker/rust-vfs
- Symbolic links on Windows
- Testing colored output
- Testing signal handling
@killercup's original post
In the first meeting, we identified that testing CLI apps cross-platform is not trivial and that we want to improve the situation.
- Best practices, i.e. testing application code, and not duplicating tests from deps
- How to set up clean filesystem/etc. environments for testing (containers? chroot jails?)
  - tempfile helps
  - dir-diff helps with validation
- Mocking
Crates we can help:
- get assert_cli to 1.0 (@killercup and @epage)
- get dir-diff to 1.0 (@steveklabnik)
Along with dir-diff (for which I have a PR out to make it more customizable, with richer reporting), something I've been considering is a set of file system assertions, kind of like the expect_* methods on TestDir.
Interested in working on this :+1:
I will add a note here that testing your application with symbolic links (if relevant) is important, and doing it on Windows can be quite difficult. In one sense, it's hard because, unlike on Unix, you have two different methods for creating symbolic links (one for directories and another for files). The other part, and the one that has been a real pain point for me, is that Windows by default doesn't let you create symbolic links: you need to toggle a permission setting for it to work. I believe AppVeyor handles this for you, but if you're testing on a real Windows machine, you need to do it yourself.
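For illustration, this is roughly how the split looks in the standard library; a cross-platform test helper has to pick the right call per platform (a sketch, with link_dir as a made-up helper name):

```rust
use std::io;
use std::path::Path;

// Create a symlink to a directory. On Unix one function covers both files and
// directories; on Windows the caller must choose symlink_dir vs symlink_file,
// and the process needs permission to create symlinks at all.
#[cfg(unix)]
fn link_dir(src: &Path, dst: &Path) -> io::Result<()> {
    std::os::unix::fs::symlink(src, dst)
}

#[cfg(windows)]
fn link_dir(src: &Path, dst: &Path) -> io::Result<()> {
    std::os::windows::fs::symlink_dir(src, dst)
}
```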
We've been talking about integration testing for a bit on Gitter. I've created the assert-rs GitHub organization and moved assert_cli there.
Testing with FS access could be done with
- https://github.com/iredelmeier/filesystem-rs
- https://github.com/twmb/rsfs
- https://github.com/manuel-woelker/rust-vfs
So we have crates for emulating FS access in tests.
My current thoughts:
- Make it easy to initialize a tempdir
  - Copy files over from the tests dir
  - "touch" files
  - dump a string to a file in the tempdir
- file system assertions
  - file exists
  - file content assertions
  - fixture dir is a subset of target dir, reporting the differences
  - fixture dir exactly matches target dir, reporting the differences (see the dir-diff sketch below)
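For the last two assertions, dir-diff is the obvious starting point. Assuming its current is_different entry point, a thin assertion wrapper might look like this (assert_dirs_match is a hypothetical helper name):

```rust
use std::path::Path;

// Assert that the directory the program produced matches a checked-in fixture
// directory. dir_diff::is_different only reports *whether* they differ today;
// reporting *what* differs is the richer output discussed above.
fn assert_dirs_match(fixture: &Path, target: &Path) {
    let different = dir_diff::is_different(fixture, target)
        .expect("failed to compare directories");
    assert!(
        !different,
        "{} does not match fixture {}",
        target.display(),
        fixture.display()
    );
}
```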
File system mocking feels off to me. IMO, dealing with a file system means you are writing integration / end-to-end tests. Generally there is an application-specific abstraction that can sit above the file system and be mocked instead; this can work for both unit and integration tests. For end-to-end tests, you want to verify the actual file system interactions and shouldn't mock them.
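To make the "abstraction above the file system" idea concrete, here is one possible sketch; the trait and type names are hypothetical and would be application-specific in practice:

```rust
use std::collections::HashMap;
use std::io;
use std::path::PathBuf;

// Hypothetical application-level abstraction: the app asks for "the config"
// or "the cache" by name rather than touching arbitrary paths, so unit tests
// can swap in an in-memory implementation.
trait ConfigStore {
    fn load(&self, name: &str) -> io::Result<String>;
    fn save(&mut self, name: &str, contents: &str) -> io::Result<()>;
}

// Real implementation used in production and end-to-end tests.
struct FsStore {
    root: PathBuf,
}

impl ConfigStore for FsStore {
    fn load(&self, name: &str) -> io::Result<String> {
        std::fs::read_to_string(self.root.join(name))
    }
    fn save(&mut self, name: &str, contents: &str) -> io::Result<()> {
        std::fs::write(self.root.join(name), contents)
    }
}

// In-memory double for unit tests; no tempdirs needed.
#[derive(Default)]
struct MemStore {
    files: HashMap<String, String>,
}

impl ConfigStore for MemStore {
    fn load(&self, name: &str) -> io::Result<String> {
        self.files
            .get(name)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, name.to_string()))
    }
    fn save(&mut self, name: &str, contents: &str) -> io::Result<()> {
        self.files.insert(name.to_string(), contents.to_string());
        Ok(())
    }
}
```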
I've updated the original issue to try to summarize what we've come up with so far.
@killercup's tests for waltz are an experiment in a higher-level, more declarative way of writing tests for a program.
Our thought has been to create individual building blocks and then put them together into something kind of like what's in waltz's tests. The priority is on the building blocks.
A possibly controversial and further-down-the-road thing I've been considering is the need for various types of tempdirs: the program might be run within one, and you might need to pass others to the program as flags. Ideally, the test framework would track these tempdirs and close them at the end, ensuring they can be closed (a particular concern on Windows) and that they don't report errors.
This got me wondering if we'll want some form of string templating in this, so you can make a tempdir's path available as a variable that can then be referenced when constructing a flag.
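A minimal, hypothetical form of that templating could be plain placeholder substitution over the declared arguments; expand_args and the "{tempdir}" placeholder below are made up for illustration:

```rust
use std::path::Path;

// Hypothetical helper: replace "{tempdir}" in declared arguments with the
// real path of the tempdir the test framework created, so flags and
// assertions can refer to it without hard-coding paths.
fn expand_args(args: &[&str], tempdir: &Path) -> Vec<String> {
    args.iter()
        .map(|arg| arg.replace("{tempdir}", &tempdir.display().to_string()))
        .collect()
}

// Usage inside a test might look like:
// let dir = tempfile::tempdir()?;
// let args = expand_args(&["--output", "{tempdir}/report.json"], dir.path());
```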
Another issue with testing CLIs is correctly calculating code coverage. If you use std::process::Command to fork and execute a CLI, all existing code coverage tools (tarpaulin, gcov, kcov, etc.) will fail to calculate the statistics. There is currently no good way to properly calculate line coverage across both unit tests and integration tests.
@mssun That is a good point, but I'm expecting it won't be too much of an issue. End-to-end testing using assert_cli should, ideally, be reserved for the parts of the application that cannot be tested at a lower level. Coverage will be lower, and a developer will need to be aware of what is or isn't being tested because of these end-to-end tests.
If there is an enterprising person who has an idea of how to solve this, great! Otherwise, we should call it out as a known issue in assert_cli's documentation.
Ideally, we could use gcov to solve this issue, as described in https://jbp.io/2017/07/19/measuring-test-coverage-of-rust-programs.html. However, it is not as easy as I thought.
It is impossible to build both the binary and the tests with only the -Copt-level=1 -Clink-dead-code -Ccodegen-units=1 -Zno-landing-pads -Cpasses=insert-gcov-profiling -L/Library/Developer/CommandLineTools/usr/lib/clang/9.1.0/lib/darwin/ -lclang_rt.profile_osx flags, because a command like cargo rustc --test tests -vv -- -Copt-level=1 -Clink-dead-code -Ccodegen-units=1 -Zno-landing-pads -Cpasses=insert-gcov-profiling -L/Library/Developer/CommandLineTools/usr/lib/clang/9.1.0/lib/darwin/ -lclang_rt.profile_osx only applies those flags to the test binary.
Using the RUSTFLAGS environment variable won't solve this problem either: it adds the compilation flags to all crates, which generates gcda and gcno files for unrelated crates in the sysroot. This may also trigger bugs in gcov when merging the gcno files (I still don't know why).
Overall, I think the test coverage issue involves many parties: cargo, rustc, coverage tools, etc.