RFC: New organization of tests
We have a lot of test stuff scattered across the repo. We'd like to improve things and know exactly where a particular type of test belongs. We'd like to be able to more easily enumerate and automate running of tests. So we had some discussion on IRC and in person yesterday and I want to document a proposal based on the results, but we still welcome feedback and have open questions.
Current state
Path | Description |
---|---|
samples/tests/ | Contains mostly informational tests written by developers |
scripts/trlite | Our sanity/CI test runner; mostly has its tests embedded within it |
scripts/ | A couple of source check tests are here too |
src/zjs_unit_tests.[ch] | Some C-only tests that get built into the linux executable |
src/zjs_test_promise.[ch] | A module to expose promise creation to JS for testing |
tests/ | Automated (some manual) tests, mostly from QA, some from the dev team |
tests/stress | Contains one test that runs indefinitely |
Considerations
QA vs. Developer Tests
We have tests for QA and development purposes; should these be distinguished and how?
Automated vs. Manual
We have both automated and manual tests. How should they be distinguished?
JavaScript vs. C vs. Hybrid tests
Some tests are pure JavaScript, some unit tests are pure C, and some need extra help from C, e.g. to isolate a particular module for good unit testing. Where should these go?
Board-specific tests
Some tests will only work on certain platforms. We can put tests in board-specific directories, but then what if a test works on a few different platforms? We wouldn't want to duplicate the test N times. So it might be better to have metadata specifying compatibility (a hypothetical sketch follows the list below)? Or just run the tests and mark "allowed failures" per platform somewhere. Also, why do the tests only work on certain platforms?
- Platforms lack required hardware (e.g. BLE on k64f)
- Platforms lack required Zephyr driver (e.g. AIO on k64f)
- Other reasons? Do these reasons affect where the test should be and how it's designated?
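To make the metadata idea concrete, here's a purely hypothetical sketch of compatibility metadata kept in a header comment of each test and parsed by a runner; the `// Boards:` convention and the helpers are invented for illustration and don't exist in the repo:

```javascript
// Hypothetical sketch: per-test compatibility metadata kept in a header
// comment and parsed by a runner. The "// Boards:" convention and these
// helpers are invented for illustration; nothing in the repo defines them yet.
var fs = require('fs');

function supportedBoards(testFile) {
  var src = fs.readFileSync(testFile, 'utf8');
  var match = src.match(/^\/\/ Boards:\s*(.+)$/m);
  // No metadata line means "runs anywhere"
  return match ? match[1].split(',').map(function (s) { return s.trim(); }) : null;
}

function runsOn(testFile, board) {
  var boards = supportedBoards(testFile);
  return boards === null || boards.indexOf(board) !== -1;
}

// A test would start with something like:
//   // Boards: arduino_101, frdm_k64f
// and a runner would then skip it (rather than fail it) on other boards.
```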
Hardware-dependent tests
Some tests are automated once you have a certain hardware setup, such as a wire from IO2 to IO3. Should these be considered "automated" or "manual", or something else?
Enhancements
There is plenty more to be done than just rearranging things. We need to be clearer about how tests are run. I recently added `make check`, but it just calls `trlite -l`. What is really expected of it? We should maybe have a smart test runner that can run all the automated tests or named groups of them, help you discover tests, or lead you through interactive testing by building and flashing for you with intervention as required.
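As a rough sketch of what such a runner could look like, assuming the automated JS tests can be executed with the linux `jslinux` build (the binary path and the output-grepping convention below are assumptions):

```javascript
// Rough sketch only (no such runner exists yet): enumerate automated JS tests
// and run a named group of them under the linux build. The jslinux path and
// the output-grepping convention are assumptions for illustration.
var fs = require('fs');
var path = require('path');
var execFileSync = require('child_process').execFileSync;

var JSLINUX = './outdir/linux/release/jslinux';  // assumed location of the linux build

function runGroup(dir) {
  var files = fs.readdirSync(dir).filter(function (f) {
    return f.endsWith('.js');
  });
  files.forEach(function (f) {
    var file = path.join(dir, f);
    try {
      var out = execFileSync(JSLINUX, [file], { encoding: 'utf8' });
      console.log((/fail/i.test(out) ? 'FAIL ' : 'PASS ') + file);
    } catch (e) {
      console.log('FAIL (crashed) ' + file);
    }
  });
}

// e.g. `node run-tests.js tests` or `node run-tests.js tests/k64f`
runGroup(process.argv[2] || 'tests');
```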
We would also like to experiment with mocking some Zephyr APIs in order to be able to regression test our API binding code automatically. Those would probably show up alongside or within modules, like `zjs_gpio_mock.c`.
Proposal
Path | Description |
---|---|
samples/tests/ | Move these into `tests/manual/`. |
scripts/trlite | Leave this as the test runner, but move the embedded tests out into `tests/build`. |
scripts/ | Move the source code check scripts to a yet-to-be-named place for bash script tests that check the build/source. Leave this directory for other non-test scripts. |
src/zjs_unit_tests.[ch] | Move to `unittests/` with its own Makefile and executable rather than piggybacking on jslinux with `--unittests`. |
src/zjs_test_promise.[ch] | It is legitimate for a hybrid test's C portion to live here as a module. It will only be used by JS that requires the special test module, so it is automatically left out of any other build. The tests that use it don't need to be specially identified; they will just be in `tests/` or `tests/manual/`. Such code could also be added under `#ifdef` in the module it relates to; it's unclear which is better, but a separate module keeps the source less cluttered. Maybe `_test` should move to after the module name so the files show up consecutively in `ls`. |
tests/ | Core automated tests using the Assert.js module so their output can be checked for success with a simple grep (a minimal sketch follows below the table). Drop the `test-` prefix for tests, as it is redundant with the `tests/` path. Move out manual tests. |
tests/<board>/ | Board-specific tests, where <board> is a101, k64f, etc. |
tests/manual/ | Tests that require manual intervention or (for now, at least) a specific hardware configuration such as external wiring. |
tests/scripts/ | (Not sure of the best name for this dir: bash, build, check, internal, sanity, source, trlite?) It would contain bash-script tests pulled out of trlite and `scripts/` that check the source code or make sure the code still builds in different configs. Maybe check, since this is the stuff `make check` would run. |
tests/stress | Move to `tests/manual/`, maybe with `-stress` in the filename. |
unittests/ | New directory for C unit tests and a standalone executable that runs them. |
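For reference, here's a minimal sketch of the grep-checkable style proposed for `tests/`; the exact Assert.js API shape used here (an `assert(condition, description)` helper plus `assert.result()`) is an assumption, so adjust to whatever the module actually exports:

```javascript
// Minimal sketch of a grep-checkable automated test under tests/.
// The Assert.js API shape used here (assert(cond, desc) plus assert.result())
// is an assumption; adjust to whatever the real module exports.
var assert = require('Assert.js');

assert(1 + 1 === 2, 'arithmetic sanity');
assert(typeof setTimeout === 'function', 'timers API is present');

// Emits the summary line that trlite or a future runner can check for
// with a simple grep.
assert.result();
```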
I'd appreciate feedback! @poussa, @haoxli, @pfalcon, @martijnthe, @cuiyanx
Thanks for coming forward with this (I'm amazed by the amount of progress in just the last couple of weeks, great work, guys!). So, please let me respond to the points above and record my own thoughts on the matter. In the hope of making it easier to follow (and easier for me to write), I'm going to add each idea/subtopic in a separate comment; apologies in advance for notification spam ;-).
So, first I thought about the automated vs. manual tests dilemma, as already discussed on IRC. Here's my take: automated and manual tests are pretty different kinds of tests in the way they're used. However, the following principle should guide the partitioning: "Anything which can be tested automatically should not be tested manually" (because there will never be too little to test manually; actually, there's always overwhelmingly too much, so automation should really cut the load, not just duplicate it). So, if it's fair to say that the automated and manual test sets are disjoint, then my suggestion would be to have 2 separate top-level dirs for them:
- `tests` - automated tests
- `tests-manual` - manual tests
> `unittests/` - New directory for C unit tests and a standalone executable that runs them.
+1. Except that you may want to make them sort together with other tests dir(s), so:
- `tests-unit` - C-based unit tests
> QA vs. Developer Tests - We have tests for QA and development purposes; should these be distinguished and how?
Here's my 2 cents on this. You're lucky to have a dedicated QA team. But that should not lead to partitioning of the testsuite based on whether tests are QA's or developers'. Actually, it's in QA's best interest to have a solid, wide-coverage automated testsuite, because I imagine the biggest problem of the trade is needing to repeat trivial tests all the time, which can only lead to a lack of time for testing the really high-level, integrated features that truly require human supervision.
So, IMHO, automated vs. manual testsuite partitioning is enough, with the "automated" part being truly everyone's. To make it so, there just needs to be a good, uncomplicated, intuitive organization of the automated testsuite and some documentation that backs it, to make automated testing encouraging for everyone: core developers, the QA team, external contributors, etc.
Beyond that, I guess manual tests will be "owned" by the QA team, in the sense that they will set up guidelines and conventions for writing them (but it would be nice to have some consistency, e.g. if the automated test dir has per-board subdirs, the manual dir should follow that too).
> Some tests will only work on certain platforms. We can put tests in board-specific directories, but then what if a test works on a few different platforms; we wouldn't want to duplicate the test N times. So it might be better to have metadata specifying compatibility? Or just run the tests and mark "allowed failures" per platform somewhere. Also, why do the tests only work on certain platforms?
That's a great summary and set of questions. Here's how these issues are often handled in testsuites: using "skipped" tests. I.e., a test can have 3 statuses: beyond passed/failed, it can be skipped. There are 2 ways to skip tests: a test itself may decide that it can't test something and raise a special "skipped" status, or the test runner may decide to skip a test (or group of tests) based on pre-known metadata, as you write.
Note that one of the most obvious places to store metadata is directory and (parts of) file names. For example, if all BLE tests live in a `tests/ble/` dir, there's no head-scratching about how to skip them. Or (as an abstract example), if there are dispersed tests which require floating point, but it's optional, then suffixing such tests with `_float` does the trick.
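A rough sketch of that kind of name-based skipping (the board capability table and the conventions are just the examples from above, not anything that exists yet):

```javascript
// Rough sketch (not an existing tool) of skipping via path/filename metadata.
// The board capability table, the tests/ble/ directory and the "_float"
// suffix are just the examples from the text.
var path = require('path');

var boardCaps = {
  arduino_101: { ble: true, float: true },
  frdm_k64f:   { ble: false, float: true }
};

function shouldSkip(testFile, board) {
  var caps = boardCaps[board] || {};
  var dir = path.dirname(testFile).split(path.sep).pop();
  if (dir === 'ble' && !caps.ble) return true;                    // whole dir needs BLE
  if (/_float\.js$/.test(testFile) && !caps.float) return true;   // suffix metadata
  return false;
}

console.log(shouldSkip('tests/ble/advertise.js', 'frdm_k64f'));   // true  -> skip
console.log(shouldSkip('tests/timers_float.js', 'arduino_101'));  // false -> run
```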
> We need to be clearer about how tests are run. I recently added `make check` but it just calls `trlite -l`. What is really expected of it?
As I preached before, one of the baseline requirements should be that there's an "every half an hour" subset of the testsuite, easily accessible (like `make check` already offers) and quick to run (say, capped at 2 minutes). That encourages a process where a developer starts a new session by running it to make sure the tree state is good, makes a change, runs the testsuite to confirm nothing's broken, and writes a test for the change, because it's all so easy and pleasant (swap the steps for TDD).
Note that the criterion should be not just "we run whatever fits in 2 minutes", but "we try to optimize the test infra to fit as much as possible in 2 minutes".
Beyond that, it should easily scale to running a more complete/thorough testsuite (i.e. every developer should know how to do that, and the curve toward it should be smooth; no magic required like git operations or capturing unfinished changes, with bugs which may throw them away, etc. ;-) )
> Hardware-dependent tests - Some tests are automated once you have a certain hardware setup, such as a wire from IO2 to IO3. Should these be considered "automated" or "manual", or something else?
Per the previous comment, these should be clearly separated from the "everyday" automated testsuite, so `make check` doesn't rely on them. And yet these are indeed a separate group of tests, and it would be nice to capture the difference. Overall, I guess the following criteria might work:
- Tests which require just a raw standard board, and thus would work for everyone.
- Manual tests which require human supervision to run, and likely human to decide whether a test passed or failed.
- "Assisted setup" tests, which require a human to setup a testing environment, but afterwards may run in automated manner, repeatedly without further assistance.
Group 3 should include all such tests, e.g. both those which require connecting IO2 to IO3 and those which require a DS18B20 attached to some pin; thus someone who wants to run this subset completely should have the complete hardware setup. Otherwise, it just gets too detailed and unmanageable. This should still be OK, because the test runner should allow running any individual test(s) explicitly.
> tests/stress - Move to `tests/manual`, maybe with `-stress` in the filename.
A manual stress test sounds like an oxymoron to me. What exactly is being stressed, and how would that work? A test which prints a million lines, requiring a human to press a button after each one? ;-)
IMHO, all stress tests are by definition automated (maybe requiring assisted setup per the previous comment).
So, tests/stress sounds like a good location; any tests there should just be skipped for `make check`.
> So it might be better to have metadata specifying compatibility?
I like this better compared to board dirs, because sooner or later you'll have boards with overlapping capabilities and things become messy.
> Move to `unittests/` with its own Makefile and executable rather than piggybacking on jslinux with `--unittests`.
Makes a lot of sense to me. FWIW, for C unit tests, we are looking at using Criterion (http://criterion.readthedocs.io/en/master/).
I think it also makes sense to split off pure JS unit tests that can also be run on the final HW (probably only possible if they don't rely on HW/mocks/...). You could specify in the test metadata that the test can be run by a unit test runner on a development machine, as well as by a runner on the final HW.
Hybrid Tests
I'm guessing this is for tests where you control a mock from the C side while testing JS APIs. In my mind, it would be great to have a mocking tool to be able to generate mocks that can be controlled from JS code, but under the hood do whatever you'd currently do in C. The benefit is that the whole test can be written in JS and run in other places (i.e. a browser simulator).
Automated vs. Manual
I think it makes sense to try to come up with a "format" for manual tests too, i.e. always output the test results in the same way to the serial console (e.g. `[PASS]` or `[FAIL: 'error msg']`) and print an instruction for each step that the manual tester is supposed to take in the same format. This would get you one step closer to automating the manual tests.
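A tiny sketch of what that could look like; the tags and helper names here are assumptions, not an existing convention in the repo:

```javascript
// Tiny sketch of the suggested manual-test output format; the exact tags and
// helper names are assumptions, not an existing convention in the repo.
function instruction(step, text) {
  console.log('[STEP ' + step + '] ' + text);
}

function report(name, ok, msg) {
  console.log(ok ? '[PASS] ' + name : "[FAIL: '" + msg + "'] " + name);
}

instruction(1, 'Press the button wired to IO4 within 5 seconds');

// In a real manual test this flag would come from a GPIO callback; it is
// hardcoded here just to show the output format.
var buttonPressed = true;
report('button press detected', buttonPressed, 'no press seen within 5s');
```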
@pfalcon, heh, re: "manual stress" test. I guess what I was thinking was: i) those aren't quick tests that you want to run frequently in an automated suite, and ii) the ones we have are basically looking for memory leaks, so they don't PASS/FAIL; they either keep working forever or they wedge at some point. But yeah, I guess if they were set to run for a fixed number of loops, and the test runner could give up on them after some timeout, they could be automated.
> QA vs. Developer Tests - We have tests for QA and development purposes; should these be distinguished and how?
Agree with @pfalcon. I think everyone can contribute tests to the project, not only QA and developers; QA is just one contributor, mainly responsible for this area, and will cover execution of all tests, whether automated or manual.
> Automated vs. Manual - We have both automated and manual tests. How should they be distinguished?
Could we consider keeping auto and manual tests together for the same feature? This would make it more convenient to find and run the tests when checking a feature, especially for someone who is not familiar with the test layout. How to distinguish them? Add a `-manual` postfix for manual tests (W3C web-platform-tests do the same), so that we can filter auto and manual easily.
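For example, a small sketch of how that split could be computed, assuming the `-manual` postfix convention:

```javascript
// Small sketch: split a directory listing into automated and manual tests
// based on the suggested "-manual" filename postfix (assumed convention).
var fs = require('fs');

var files = fs.readdirSync('tests').filter(function (f) {
  return f.endsWith('.js');
});
var manual = files.filter(function (f) { return f.endsWith('-manual.js'); });
var automated = files.filter(function (f) { return !f.endsWith('-manual.js'); });

console.log('automated:', automated.length, 'manual:', manual.length);
```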
> Proposal, `tests/` - However, drop the `test-` prefix for tests as it is redundant with the `tests/` path.
Agree.