gopter Stateful testing: Clarification around InititalStateGen and PostConditions

Stateful testing: Clarification around InititalStateGen and PostConditions

Open ebuchman opened this issue 3 years ago • 4 comments

Hello - thanks for a great testing library!

I'm trying to use gopter to test a simple stateful object. The object is initialized with no existing state, and then methods are called. After each method is called, I'd like to execute a check on the internal state of the object. Note this means I don't want to maintain a separate State object, I'd basically like my SystemUnderTest to also act as the State object and I don't want to have to do any kind of updates in NextState. Ideally the Run method will be called, executing one of the object's methods, and then PostCondition will check some property of the object.

I believe I managed to get this working, but a few things that aren't clear:

Why does InitialStateGen run so many times? I would expect it to run once for each set of commands being executed but it seems to run much more than that.
Is it wrong to try and use the commands.SystemUnderTest as the commands.State? If I try to do so, it doesn't seem to work, since the commands.State and the commands.SystemUnderTests objects seem to come from different calls to InitialStateGen, so the commands.State doesn't see the changes effected to the commands.SystemUnderTest during calls to Run. I can work around this by returning the SystemUnderState as the result in Run and then checking the result in PostCondition while just ignoring the State. Is that the right thing to do? Related https://github.com/leanovate/gopter/issues/25

I've included a minimum viable example below.

Thanks!

package keeper

import (
	"fmt"
	"testing"

	"github.com/leanovate/gopter"
	"github.com/leanovate/gopter/commands"
	"github.com/leanovate/gopter/gen"
)

type sys struct {
	height  int
	heights map[int]uint8
}

// store i under the latest height value and incremenet the height value
func (s *sys) New(i uint8) {
	h := s.height
	s.height += 1

	// bug when height is 5
	if s.height == 5 {
		return
	}

	s.heights[h] = i
}

type newCommand uint8

func (value newCommand) Run(q commands.SystemUnderTest) commands.Result {
	s := q.(*sys)
	s.New(uint8(value))
	return s
}

// expect there to be an entry in the `sys.heights` map for each height up to the `sys.height`
func (newCommand) PostCondition(state commands.State, result commands.Result) *gopter.PropResult {
	// XXX: if I try to use the state instead of the result here, it doesn't work
	// (state doesn't seem to see the effects from Run)
	s := result.(*sys)
	for i := 0; i < s.height; i++ {
		_, ok := s.heights[i]
		if !ok {
			return &gopter.PropResult{Status: gopter.PropFalse}
		}
	}
	return &gopter.PropResult{Status: gopter.PropTrue}
}

func (value newCommand) NextState(state commands.State) commands.State {
	s := state.(*sys)
	return s
}

func (newCommand) PreCondition(state commands.State) bool {
	return true
}

func (value newCommand) String() string {
	return fmt.Sprintf("New(%d)", value)
}

var genNewCommand = gen.UInt8().Map(func(value uint8) commands.Command {
	return newCommand(value)
})

func cbCommands(t *testing.T) *commands.ProtoCommands {
	return &commands.ProtoCommands{
		NewSystemUnderTestFunc: func(initState commands.State) commands.SystemUnderTest {
			return initState
		},
		InitialStateGen: func(p *gopter.GenParameters) *gopter.GenResult {
			// XXX: this function seems to execute many more times than Run
			result := &sys{
				height:  0,
				heights: make(map[int]uint8),
			}
			return gopter.NewGenResult(result, gopter.NoShrinker)
		},
		GenCommandFunc: func(state commands.State) gopter.Gen {
			return gen.OneGenOf(genNewCommand)
		},
	}
}

func TestSys(t *testing.T) {
	parameters := gopter.DefaultTestParametersWithSeed(1235)
	properties := gopter.NewProperties(parameters)
	properties.Property("circular buffer", commands.Prop(cbCommands(t)))
	properties.TestingRun(t)
}

Nov 06 '20 07:11 ebuchman

I think there is a slight misconception between SystemUnderTest and State:

The SystemUnderTest is supposed to be ... well any kind of stateful system that should be tested. In the most simple case it is just some kind of struct with a lot of methods modifying its fields, in this case a Command is just a call to one of these methods. In a more complex scenario it might be some sort of state machine processing events, in this case Command is sending one of these events.

The State is purely meant for the test itself to model the expected state the SystemUnderTest is supposed to be in after a sequence of commands has been executed. I.e. it a simplified "outside-view" state of the SystemUnderTest that could be somehow verified.

Mixing these two together basically means that you are testing that the expected state is equal to the expected state ... which is kind of pointless.

Nov 06 '20 08:11 untoldwind

Thanks for the quick response! I can see how that would make sense for testing simple properties or simple aspects of the state, where I can build a simple model of it in Go, but suppose the system builds some more complex state, properties of which I'd like to check. It seems wrong to rebuild all that in the State when it's already available in the System. In the example above, my State would basically have to replicate everything already in the system, incrementing a height value every time New is called, and adding to the map. I might have a system with multiple maps and a bunch of inter dependent lookups and after a series of method calls want to check that certain invariants hold over the maps. It seems weird to try and replicate all that logic again in the State. Is that still the wrong way to think about using this?

For instance this seemed relatively straightforward when I implemented it without the Commands. Obviously the Commands should make this much more convenient when dealing with more methods:

type Event struct {
	Type uint8
	Name uint8
}

func executeEvents(s *sys, events []Event) {
	for _, ev := range events {
		method := ev.Type % 2
		switch method {
		case 0:
			s.New(ev.Name)
		case 1:
			return // no op
		}
	}
}

func postCondition(s *sys) bool {
	for i := 0; i < s.height; i++ {
		_, ok := s.heights[i]
		if !ok {
			return false
		}
	}
	return true
}

func TestSys2(t *testing.T) {
	f := func(events []Event) bool {
		s := &sys{
			height:  0,
			heights: make(map[int]uint8),
		}
		executeEvents(s, events)
		return postCondition(s)
	}

	properties := gopter.NewProperties(nil)

	// generater for events. event is just two uint8s
	eventGen := gopter.DeriveGen(
		func(typ uint8, name uint8) Event {
			return Event{typ, name}
		},
		func(o Event) (uint8, uint8) {
			return o.Type, o.Name
		},
		gen.UInt8(),
		gen.UInt8(),
	)
	eventGen = gen.SliceOf(eventGen)

	properties.Property("map state is correct", prop.ForAll(f, eventGen))

	properties.TestingRun(t)
}

Nov 06 '20 14:11 ebuchman

Ok it looks like I'm mixing concerns here. It seems like the goal of State is to build a higher-level model of the system state and its transitions useful for testing certain properties. This is great. But if the SystemUnderTest has much more internal state, we may want to test conditions about its internal consistency as well, often to details beyond what we capture in the model. So one approach might be to consider the SystemUnderTest and State (maybe should be called ModelState) more symmetrically, and for both define a New (or Init), Run (or Next), PreCondition, and PostCondition.

If there might be interest in this, I can try and summarize in a new issue and close this one, otherwise we can just close this one. Thanks!

Oh, InitialStateGen - I would expect it to run once for each set of commands being executed but it seems to run much more than that?

Nov 06 '20 19:11 ebuchman

You are right, with a complex SystemUnderTest the State (and the corresponding NextState function of the commands) might become very complex as well. The general idea is the State should focus on a specific aspect that can be verified from the "outside". E.e. if SystemUnderTest is some sort of queuing system, then State might only focus on the length of the queue rather than its contents. Or if SystemUnderTest is a statemachine, then State might only focus on the current "state-id" rather than the complex context information that might be associated with each state ... and so on ...

In short: If you find a good programming model for more complex scenarios I would be certainly interested in it.

As for InitialStateGen: This depends if the commands have preconditions and if there is shrinking. If non of the commands has a precondition and there is no shrinking, then there should be only on initial state per command sequence.

Nov 09 '20 09:11 untoldwind

gopter gopter copied to clipboard

Stateful testing: Clarification around InititalStateGen and PostConditions

gopter
gopter copied to clipboard