Developer can run unit tests

Open ghost opened this issue 5 years ago • 15 comments

Big question: where do we run the tests? In a local replica? Do we require wasmtime?

(incl. assertion library, BigTest framework)

ghost avatar Aug 18 '20 15:08 ghost

What constitutes a Unit?

  • Function/Class
  • Canister

It turns out that for our purposes functions and classes are equivalent. What they share is that they don't require asynchrony and thus no replica. While we can test functions or classes through canister tests, any test that could just test a function or class directly should do so. Why? Because function/class tests run a lot faster, and a test's value increases immensely with how often you can run it to get feedback. I think we should advise developers to run these tests through wasmtime.

We still absolutely need to be able to write tests that treat canisters as the unit under test, as that's a very natural way to define a unit boundary. A canister test is also the only way to observe traps, as traps in synchronous situations just take down the test runner with them.

What constitutes a test?

A useful framing is to split a test into three steps:

  • Given

    The given part describes the state of the world before you begin the behavior you're specifying in this scenario. You can think of it as the pre-conditions to the test.

  • When

    The when section is the behavior that you're specifying.

  • Then

    Finally the then section describes the changes you expect due to the specified behavior.

When testing a function this means:

  1. Given: Prepare the function arguments
  2. When: Call the function
  3. Then: Assert the function result
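For illustration, here is a minimal Motoko sketch of a function test in this shape. `double` is a made-up function, not something from this thread, and a bare `assert` stands in for a proper assertion library such as matchers.

```motoko
import Debug "mo:base/Debug";

// Hypothetical function under test (an assumption for this example).
func double(n : Nat) : Nat { n * 2 };

// Given: prepare the function arguments.
let input : Nat = 21;

// When: call the function.
let result = double(input);

// Then: assert the function result. A failed assert traps,
// which stops the whole script rather than reporting a failure.
assert (result == 42);
Debug.print("double: ok");
```

Nothing here is asynchronous, so a script like this should not need a replica to run.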

When testing a class this means:

  1. Given: Create an object of the class under test. (Call its constructor)
  2. When: Call a method on the object
  3. Then: Assert the method's result, or a property of the object
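The same shape for a class, again as a sketch only. `Counter` is a hypothetical class invented for this example, and plain `assert` is used instead of an assertion library.

```motoko
import Debug "mo:base/Debug";

// Hypothetical class under test (an assumption for this example).
class Counter(start : Nat) {
  var count : Nat = start;

  // Increments the counter and returns the new value.
  public func increment() : Nat {
    count += 1;
    count
  };

  public func get() : Nat { count };
};

// Given: create an object of the class under test.
let counter = Counter(5);

// When: call a method on the object.
let returned = counter.increment();

// Then: assert the method's result and the state it left behind.
assert (returned == 6);
assert (counter.get() == 6);
Debug.print("Counter.increment: ok");
```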

When testing a canister this means:

  1. Given: We have to assume the canister under test is installed on the network.
  2. When: Call a sequence of methods on the canister. Why a sequence? Because we want the Then section here to be synchronous.
  3. Then: Assert the method's result.

Why testing a canister is more complex

  • Involves the replica, and having installed the canister under test as well as the test canister.

  • We want to be able to observe traps, so we need to run the tests from a different canister than the canister under test.

  • A test canister can't just output to the console; it has to return the test results as Candid data (TAP?)

  • The test canister could run out of gas during a test run, so we need to be able to run batches of tests and have the test canister be able to pick up where it stopped the last time around. I don't think we can make dfx roundtrip for every single test, because that would slow execution to a crawl.

  • Canister testing like this means we can't test query methods (via queries), because the test canister will call them as cross-canister calls (as an update call).

Proposal for the Canister Testing Protocol

  • In: Configuration (size of batches?)

  • Out:

    {
      #start : { testNum : Nat };
      #step : { completed : List<TestResult>; }; // Maybe add a nonce here?
      #done : { done : List<TestResult>; summary : Summary };
    }
    

    Where #start or #step means that dfx should call the test again for it to continue.

Requiring a #start response at the start makes it so every test run involves at least 2 canister calls, but at the same time it means we'll always get a quick header we can display before the first test starts running. Happy to discuss that one.
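To make the call-again flow concrete, here is a rough simulation of the protocol in plain Motoko, with no actors or replica involved. Everything beyond the three variant tags is an assumption: the fields of `TestResult` and `Summary`, the `Runner` class, and `runNext` are invented for this sketch; arrays are used instead of `List`; and `#done` is assumed to carry only the final batch. The `while` loop plays the role dfx would have.

```motoko
import Debug "mo:base/Debug";
import Array "mo:base/Array";

// Assumed shapes; the proposal leaves TestResult and Summary undefined.
type TestResult = { name : Text; passed : Bool };
type Summary = { passed : Nat; failed : Nat };

type Response = {
  #start : { testNum : Nat };
  #step : { completed : [TestResult] };
  #done : { done : [TestResult]; summary : Summary };
};

// Hypothetical stand-in for the test canister: runs at most
// `batchSize` tests per call and remembers where it stopped.
class Runner(tests : [(Text, () -> Bool)], batchSize : Nat) {
  var started = false;
  var next = 0;
  var passed = 0;

  public func runNext() : Response {
    if (not started) {
      started := true;
      return #start({ testNum = tests.size() });
    };
    let first = next;
    let stop = if (first + batchSize < tests.size()) first + batchSize else tests.size();
    let batch = Array.tabulate<TestResult>(
      (stop - first : Nat),
      func(i : Nat) : TestResult {
        let (name, body) = tests[first + i];
        { name = name; passed = body() }
      }
    );
    for (r in batch.vals()) { if (r.passed) { passed += 1 } };
    next := stop;
    if (next < tests.size()) {
      #step({ completed = batch })
    } else {
      #done({
        done = batch;
        summary = { passed = passed; failed = (tests.size() - passed : Nat) }
      })
    }
  };
};

func report(results : [TestResult]) {
  for (r in results.vals()) {
    Debug.print((if (r.passed) "ok " else "not ok ") # r.name);
  };
};

let runner = Runner(
  [
    ("addition", func() : Bool { 1 + 1 == 2 }),
    ("concat", func() : Bool { "a" # "b" == "ab" }),
    ("failing", func() : Bool { 2 * 2 == 5 })
  ],
  2
);

// Driver: keep calling until the runner answers #done.
var calls = 0;
var finished = false;
while (not finished) {
  calls += 1;
  switch (runner.runNext()) {
    case (#start({ testNum })) {
      Debug.print("running " # debug_show (testNum) # " tests");
    };
    case (#step({ completed })) { report(completed) };
    case (#done({ done; summary })) {
      report(done);
      Debug.print(debug_show (summary));
      finished := true;
    };
  };
};
```

With three tests and a batch size of 2, the driver makes three calls: one `#start`, one `#step`, and one `#done`. In a real setup each call would be a separate message, so the state in `Runner` would have to live in the test canister between calls.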

Action items

  • [ ] Figure out the Canister Testing Protocol @stanleygjones Who on the SDK team should I bother with this?
  • [x] matchers covers the function/class testing area well enough for Sodium. It could use some utilities to make canister testing easier though. I'm thinking of batching and producing whatever format we end up using to communicate test results here.
  • [x] Document some of the more general testing methodology in here with a few Motoko examples?

kritzcreek avatar Aug 31 '20 14:08 kritzcreek

I've added a module to support testing canisters to the matchers library: https://kritzcreek.github.io/motoko-matchers/Canister.html

An example setup with a bit of documentation on the different pieces can be found here: https://github.com/kritzcreek/ic101

The only thing missing at this point is the automation for repeatedly calling the test function on the canister containing the unit tests. Is there a "scriptable" version of the candid tooling I could use for this? Otherwise I'd skip this for now and just make the batch size large enough that the unit tests complete in a single message. /cc @chenyan-dfinity @matthewhammer

kritzcreek avatar Sep 02 '20 14:09 kritzcreek

Not sure why you need a candid tool, but there is a CLI: https://github.com/dfinity/candid/tree/master/tools/didc. There is also a CLI for IC: https://github.com/dfinity-lab/agent-rust/tree/next/icx

chenyan-dfinity avatar Sep 02 '20 17:09 chenyan-dfinity

> Not sure why you need a candid tool, but there is a CLI

That decodes the binary format into the textual one, but dfx already does that. I need to be able to tell whether I'm done calling dfx canister call ... or if I need to keep going based on which one of these I get (probably in bash? shudders):

{
  #start : Nat;
  #step : [TestResult]; 
  #done : [TestResult]
}

(something like that)

kritzcreek avatar Sep 02 '20 17:09 kritzcreek

So you want a progress bar somehow? Can you make a query call to the canister to check the status?

chenyan-dfinity avatar Sep 02 '20 17:09 chenyan-dfinity

I basically need to trampoline off of whatever calls the method on the canister, because at some point we won't have enough gas to run all the unit tests in a single message. The canister returns candid values when called which say whether it's done running all the tests or if it needs to be called again to continue. I need this "if not done call again and collect all the results" logic somewhere.

If this was JSON I'd hack something with bash and jq until we got proper support in dfx but with candid I don't think I have the tools to do this, so I'm just going to punt on this for now and just tell people to run dfx canister call a hardcoded number of times in their test suites...

kritzcreek avatar Sep 02 '20 17:09 kritzcreek

Maybe candiff can help, e.g. didc diff (variant { done = vec { result } }) $(dfx call result)

chenyan-dfinity avatar Sep 02 '20 17:09 chenyan-dfinity

I've made the tests complete in a single message for now and detect whether they succeeded by matching with a regex. That's good enough to run CI for a Motoko canister project.

@stanleygjones think we're good with these tests and this way of running them for Sodium?

kritzcreek avatar Sep 03 '20 05:09 kritzcreek

> The only thing missing at this point is the automation for repeatedly calling the test function on the canister containing the unit tests. Is there a "scriptable" version of the candid tooling I could use for this?

Depends on your “scripting” language, I guess… :-)

nomeata avatar Sep 03 '20 06:09 nomeata

> I basically need to trampoline off of whatever calls the method on the canister, because at some point we won't have enough gas to run all the unit tests in a single message. The canister returns candid values when called which say whether it's done running all the tests or if it needs to be called again to continue. I need this "if not done call again and collect all the results" logic somewhere.

I'm a little confused here. Are you re-describing the problem also solved by BigTest, or something different here?

Perhaps take a look at what's there? Happy to chat further about it.

If there's some path forward where we combine these efforts (naturally, not forcing it), that'd be nice.

matthewhammer avatar Sep 16 '20 22:09 matthewhammer

> Depends on your “scripting” language, I guess… :-)

BigTest has a scripting language built as a DSL within Motoko, as it should be. : )

matthewhammer avatar Sep 16 '20 22:09 matthewhammer

(BTW: Sorry to be late on this; it seems like the most active parts of the conversation were during my PTO week earlier this month.)

matthewhammer avatar Sep 16 '20 22:09 matthewhammer

Another test we want is the upgrade test. Specifically, we want to test the pre/post-upgrade functions. If there is a bug there, we break user data.

chenyan-dfinity avatar Sep 22 '20 15:09 chenyan-dfinity

> Another test we want is the upgrade test

Good point. That has to be (necessarily) a multi-canister test, where one acts as the controller of the other, right?

Could a canister serve this test from itself, e.g., if it wanted to test itself in response to a special message selfTest?

matthewhammer avatar Sep 29 '20 11:09 matthewhammer

I think you can set the controller to be yourself and call the management canister for upgrade.

chenyan-dfinity avatar Sep 30 '20 23:09 chenyan-dfinity