jqwik icon indicating copy to clipboard operation
jqwik copied to clipboard

Stateful example and model checking

Open gtrefs opened this issue 6 years ago • 11 comments

Testing Problem

I think the stateful stack example is not model checking. Or at least the model that is used to verify the implementation is the implementaion itself. Why? Let's revisit the code of Pop again:

	@Override
	public boolean precondition(MyStringStack model) {
		return !model.isEmpty();
	}

	@Override
	public MyStringStack run(MyStringStack model) {
		int sizeBefore = model.size();
		String topBefore = model.top();

		String popped = model.pop();
		Assertions.assertThat(popped).isEqualTo(topBefore);
		Assertions.assertThat(model.size()).isEqualTo(sizeBefore - 1);
		return model;
	}

	@Override
	public String toString() {
		return "pop";
	}

Parameter model in method run is not a model of MyStringStack but an instance of the class. That is, in the postconditions the implementation of MyStringStack is validated against itself. For example, Assertions.assertThat(popped).isEqualTo(topBefore) verifies that the popped element was on top of the stack. Why is this suboptimal?

By ensuring that the real system behaves like the model does, we show that our programs are most likely correct. We have a very simple abstract representation of what the system should do (taken from Property-Based Testing with PropEr, Erlang, Elixir). I think this means, we should trust our model. For example, a List could be a model for MyStringStack:

	public Pair<MyStringStack, List<String>> run(MyStringStack toBeChecked, List<String> model) {
		String popped = toBeChecked.pop();
                String removed = model.remove(model.size() - 1);
		Assertions.assertThat(popped).isEqualTo(removed);
		Assertions.assertThat(toBeChecked.size()).isEqualTo(model.size());
		return Pair.of(toBeChecked, model);
	}

Of course, you have to make sure that model contains the same elements.

Suggested Solution / Discussion

I think the stateful API needs to adapted in way to support this. For example, how to apply preconditions to a model etc.? Following the PropEr book, the model can be taken to generate a list of actions before executing them against an implementation. Maybe implementation and model could share a common interface? Also, I think postcondition should be factored out as a concept. Running an action and checking if the action is valid are two different kind of responsibilities. Seperating them would make composing postconditions easier.

gtrefs avatar Dec 16 '19 13:12 gtrefs

@gtrefs Thanks for starting a discussion I've been trying to avoid for some time ;-)

The example you refer to is definitely not a model-based property but a simple state-based property. Sometimes I can think of sufficiently strong post-conditions and invariants from the implementation itself without having to come up with an explicit model of the intended behaviour. I think it's worth to have an easy and non-overhead way to formulate state-based tests like that because they require less effort than full-blown model-based tests.

Should postconditions have their own place in state-based properties? I'm also not sure about that because then there'd be two different places in which one can do postcondition checking. Having two options to do the same feels like a bit of a design smell. On the other hand it might make pc checking more obvious. How would you make post-conditions explicit?

Real Model-based Properties

You can use the existing interface for real model-based properties. Have a look at the CircularBuffer example here: https://github.com/jlink/pbt-workshop/tree/master/src/test/java/pbt/solution4 It's still not using the explicit model in a strict "compare model to implementation behaviour" way but it would be possible.

That said there is one big difference between the way model-based testing is done in PropEr and the way it currently works in jqwik: An explicit model allows to use its preconditions in order to generate the sequence of actions; PropEr and PropCheck have a dedicated phase in which they use the model for sequence creation. jqwik, however, creates the sequence without considering pre-conditions and throws non-fitting actions away at runtime. Most of the time results are similar but there are cases where a dedicated sequence creation phase will result in more powerful properties. The downside is that shrinking sequences will get even more complicated.

Optimal Solution

I would love to somehow merge the two things: simple state-based properties and real model-based properties. Users should still be able to formulate the former without overhead and the latter should be as obvious as in your Pair-based example. I'm still not sure if state-based props should be build using model-based ones or the other way round. Would you be interested in working collaboratively on an API that considers that?

jlink avatar Dec 17 '19 07:12 jlink

Not replying to all of your comments, because this is a lot to digest.

I don't disagree that a model sometimes seems a bit excessive and can be avoided. However, I think there are different mind models in play. Let me explain this. Testing as you did in your tests requires that you trust the top method of your stack works properly. That is you have to trust part of the implementation which you are currently testing. When you have a model you can put all this trust on the model. For sure, this can also go wrong but at least there is now a clear distinction between something I can trust and something I want to test. I agree sometimes finding a model can be hard but in the stack example a list seems to be alright. Hypothetical, how would you test when the stack would not have a top method?

gtrefs avatar Dec 18 '19 15:12 gtrefs

"Testing as you did in your tests requires that you trust the top method of your stack works properly": Not really. Here's the thought experiment: Which broken implementation of top() would not be detected by the properties as they are? I'm having difficulties to come up with something.

I think the properties for MyStringStack as they currently are have the same strengths and weaknesses as design by contract as a general approach has: All you have is local information before and after invoking a function. In many cases that's plenty in others we'll be missing out on important aspects.

"Hypothetical, how would you test when the stack would not have a top method?" That's a good exercise, which will probably need a "model" of some kind, and which may also reveal the weaknesses of the current Action and Sequence abstractions. I'll work something out (hopefully) in the days to come. Hold your breath!

BTW, I've just renamed model to state in all state-based code and examples to make it more obvious that it's not really model-based property testing.

jlink avatar Dec 19 '19 07:12 jlink

"Should postconditions have their own place in state-based properties? I'm also not sure about that because then there'd be two different places in which one can do postcondition checking. Having two options to do the same feels like a bit of a design smell. On the other hand it might make pc checking more obvious. How would you make post-conditions explicit?"

A first step I can think of is to create a postcondition method, like you did with the preconditions method? I could also think of having explicit Precondition and Postcondition types like you did with Invariant.

gtrefs avatar Dec 19 '19 08:12 gtrefs

"Most of the time results are similar but there are cases where a dedicated sequence creation phase will result in more powerful properties. The downside is that shrinking sequences will get even more complicated."

Can you explain this a bit more?

gtrefs avatar Dec 19 '19 08:12 gtrefs

The difference between preconditions and postconditions is that preconditions must be explicit because they are used to determine if an action in the sequence is valid at all. Postconditions are "only" checks that can happen anywhere. That also holds in principle for invariants, which are only special because they span actions. So many design options. Need time to consider. Happy about concrete design examples from your side.

jlink avatar Dec 19 '19 19:12 jlink

As for generating sequences without considering preconditions: In an extreme case most actions from a randomly generated sequence might not be applicable at runtime when considering preconditions. Since the current approach will notice that only when running the sequence the effective length of sequences - after filtering out actions that don't fit preconditions - might become very short and thus worthless.

Most of the time one can work around such a problem by having slightly different types of actions. Using preconditions already during sequence generation can mitigate this problem but will render efficient shrinking implementation considerably more complicated. At least it looked to me like that while trying to implement a prototype a few months ago.

jlink avatar Dec 19 '19 19:12 jlink

Here's a first take...

Imagine class StringStack with the following public interface:

public class StringStack {
	public void push(String element) { ... }
	public String pop() { ... }
	public void clear() { ... }
	public boolean isEmpty() { ... }
	public int size() { ... }
}

Given that we want to use a List<String> as a model implementation to compare it against, a Model usable in model-based property testing could have the following generic interface:

public interface Model<S, M> {

	void assertState(S state, M modelState);

	Arbitrary<ModelAction<S, M>> actions();

	interface ModelAction<S, M> {
		default boolean precondition(M modelState) { return true; }
		S runOnState(S state);
		M runOnModel(M modelState);
		default void assertPostcondition() {}
	}
}

Using only model state in precondition would allow to use it during sequence generation. The concrete model for StringStack could look like that:

public class StringStackModel implements Model<StringStack, List<String>> {

	@Override
	public void assertState(StringStack stack, List<String> list) {
		Assertions.assertThat(stack.size()).isEqualTo(list.size());
		Assertions.assertThat(stack.isEmpty()).isEqualTo(list.isEmpty());
	}

	@Override
	public Arbitrary<ModelAction<StringStack, List<String>>> actions() {
		Arbitrary<ModelAction<StringStack, List<String>>> push =
				Arbitraries.strings()
						   .alpha().ofLength(5)
						   .map(PushAction::new);
		Arbitrary<ModelAction<StringStack, List<String>>> clear = Arbitraries.constant(new ClearAction());
		Arbitrary<ModelAction<StringStack, List<String>>> pop = Arbitraries.constant(new PopAction());
		return Arbitraries.oneOf(push, clear, pop);
	}

	private static class PushAction implements ModelAction<StringStack, List<String>> {

		private final String element;

		private PushAction(String element) {
			this.element = element;
		}

		@Override
		public String toString() {
			return String.format("push(%s)", element);
		}

		@Override
		public StringStack runOnState(StringStack stack) {
			stack.push(element);
			return stack;
		}

		@Override
		public List<String> runOnModel(List<String> list) {
			list.add(element);
			return list;
		}
	}

	private static class ClearAction implements ModelAction<StringStack, List<String>> {

		@Override
		public StringStack runOnState(StringStack stack) {
			stack.clear();
			return stack;
		}

		@Override
		public List<String> runOnModel(List<String> list) {
			list.clear();
			return list;
		}

		@Override
		public String toString() {
			return "clear";
		}
	}

	private static class PopAction implements ModelAction<StringStack, List<String>> {

		private String poppedFromStack;
		private String poppedFromModel;

		@Override
		public boolean precondition(List<String> list) {
			return !list.isEmpty();
		}

		@Override
		public StringStack runOnState(StringStack stack) {
			poppedFromStack = stack.pop();
			return stack;
		}

		@Override
		public List<String> runOnModel(List<String> list) {
			poppedFromModel = list.remove(list.size() - 1);
			return list;
		}

		@Override
		public void assertPostcondition() {
			Assertions.assertThat(poppedFromStack).isEqualTo(poppedFromModel);
		}

		@Override
		public String toString() {
			return "pop";
		}
	}
}

I haven't actually implemented the mechanism to use this model on top of the existing state-based support of jqwik. I think that is straight-forward - unless I'm missing something.

Interface Model could also look differently when combining the two run-methods into a single one. Which would also allow to get rid of assertPostcondition() since that could be done in run();

public interface Model<S, M> {

	void assertState(S state, M modelState);

	Arbitrary<ModelAction<S, M>> actions();

	interface ModelAction<S, M> {
		default boolean precondition(M modelState) { return true; }
		Tuple2<S,M> run(S state, M modelState);
	}
}

Any thoughts?

jlink avatar Dec 20 '19 11:12 jlink

I made the example runnable. You find it here: https://github.com/jlink/property-based-testing/tree/master/pbt-java/src/test/java/pbt/stateful/stackWithModel

jlink avatar Dec 21 '19 12:12 jlink

The property looks like that:

class StringStackProperties {
	@Property(tries = 100)
	void checkMyStackMachine(@ForAll("sequences") ActionSequence<Tuple2<StringStack, List<String>>> sequence) {
		Tuple2<StringStack, List<String>> initial = Tuple.of(new StringStack(), new ArrayList<>());
		sequence.run(initial);
	}

	@Provide
	Arbitrary<ActionSequence<Tuple2<StringStack, List<String>>>> sequences() {
		StringStackModel model = new StringStackModel();
		ModelRunner<StringStack, List<String>> modelModelRunner = new ModelRunner<>(model);
		return modelModelRunner.sequences();
	}
}

jlink avatar Dec 21 '19 12:12 jlink

I changed the ModelAction interface to allow for precondition equivalence checking between model and state:

interface ModelAction<S, M> {
	default boolean preconditionOnModel(M modelState) {
		return true;
	}

	default boolean preconditionOnState(S state) {
		return true;
	}

	S runOnState(S state);

	M runOnModel(M modelState);

	default void assertPostcondition() {
	}
}

jlink avatar Dec 23 '19 11:12 jlink