fsm
fsm copied to clipboard
Why does the following test produce a data race?
Hi, the below test:
func TestEventAndCanInGoroutines(t *testing.T) {
fsm := NewFSM(
"closed",
Events{
{Name: "open", Src: []string{"closed"}, Dst: "open"},
{Name: "close", Src: []string{"open"}, Dst: "closed"},
},
Callbacks{},
)
wg := sync.WaitGroup{}
for i := 0; i < 10; i++ {
wg.Add(2)
go func(n int) {
defer wg.Done()
var err error
// different event per loop
if n%2 == 0 {
err = fsm.Event(context.Background(), "open")
} else {
err = fsm.Event(context.Background(), "close")
}
fmt.Println(fsm.Current())
if err != nil {
_ = err
}
}(i)
go func() {
defer wg.Done()
t.Logf("Can: %t", fsm.Can("close"))
}()
}
wg.Wait()
}
produces the following output when ran with -race enabled:
==================
WARNING: DATA RACE
Write at 0x00c0001280a0 by goroutine 13:
github.com/looplab/fsm.(*FSM).Event()
/Users/gtsiat/repos/fsm/fsm.go:372 +0x618
github.com/looplab/fsm.TestEventAndCanInGoroutines.func1()
/Users/gtsiat/repos/fsm/fsm_test.go:1062 +0xe0
github.com/looplab/fsm.TestEventAndCanInGoroutines.gowrap1()
/Users/gtsiat/repos/fsm/fsm_test.go:1069 +0x44
Previous read at 0x00c0001280a0 by goroutine 18:
github.com/looplab/fsm.(*FSM).Can()
/Users/gtsiat/repos/fsm/fsm.go:234 +0x10c
github.com/looplab/fsm.TestEventAndCanInGoroutines.func2()
/Users/gtsiat/repos/fsm/fsm_test.go:1073 +0x8c
Goroutine 13 (running) created at:
github.com/looplab/fsm.TestEventAndCanInGoroutines()
/Users/gtsiat/repos/fsm/fsm_test.go:1055 +0x348
testing.tRunner()
/opt/homebrew/Cellar/go/1.22.1/libexec/src/testing/testing.go:1689 +0x180
testing.(*T).Run.gowrap1()
/opt/homebrew/Cellar/go/1.22.1/libexec/src/testing/testing.go:1742 +0x40
Goroutine 18 (running) created at:
github.com/looplab/fsm.TestEventAndCanInGoroutines()
/Users/gtsiat/repos/fsm/fsm_test.go:1071 +0x214
testing.tRunner()
/opt/homebrew/Cellar/go/1.22.1/libexec/src/testing/testing.go:1689 +0x180
testing.(*T).Run.gowrap1()
/opt/homebrew/Cellar/go/1.22.1/libexec/src/testing/testing.go:1742 +0x40
==================
The exact command used to run the test was:
go clean -testcache && go test -v -race -run TestEventAndCanInGoroutines
By replacing the stateMu.RLock and stateMu.RUnlock with stateMu.Lock and stateMu.Unlock in the func (f *FSM) Can(event string) bool method, the issue seems to get fixed, but I would lie if I knew that I understand why. Can someone that has experience with this repo explain why it happens?
The problem is that Event() tries to assign the f.transition to transitionFunc() while Can() is using it. Locking the event mutex should fix the issue.
Fixed by #107.