fsm icon indicating copy to clipboard operation
fsm copied to clipboard

Why does the following test produce a data race?

Open Gtsiat opened this issue 1 year ago • 1 comments

Hi, the below test:

func TestEventAndCanInGoroutines(t *testing.T) {
	fsm := NewFSM(
		"closed",
		Events{
			{Name: "open", Src: []string{"closed"}, Dst: "open"},
			{Name: "close", Src: []string{"open"}, Dst: "closed"},
		},
		Callbacks{},
	)

	wg := sync.WaitGroup{}
	for i := 0; i < 10; i++ {
		wg.Add(2)
		go func(n int) {
			defer wg.Done()
			var err error
			// different event per loop
			if n%2 == 0 {
				err = fsm.Event(context.Background(), "open")
			} else {
				err = fsm.Event(context.Background(), "close")

			}
			fmt.Println(fsm.Current())
			if err != nil {
				_ = err
			}
		}(i)

		go func() {
			defer wg.Done()
			t.Logf("Can: %t", fsm.Can("close"))
		}()
	}

	wg.Wait()
}

produces the following output when ran with -race enabled:

==================
WARNING: DATA RACE
Write at 0x00c0001280a0 by goroutine 13:
  github.com/looplab/fsm.(*FSM).Event()
      /Users/gtsiat/repos/fsm/fsm.go:372 +0x618
  github.com/looplab/fsm.TestEventAndCanInGoroutines.func1()
      /Users/gtsiat/repos/fsm/fsm_test.go:1062 +0xe0
  github.com/looplab/fsm.TestEventAndCanInGoroutines.gowrap1()
      /Users/gtsiat/repos/fsm/fsm_test.go:1069 +0x44

Previous read at 0x00c0001280a0 by goroutine 18:
  github.com/looplab/fsm.(*FSM).Can()
      /Users/gtsiat/repos/fsm/fsm.go:234 +0x10c
  github.com/looplab/fsm.TestEventAndCanInGoroutines.func2()
      /Users/gtsiat/repos/fsm/fsm_test.go:1073 +0x8c

Goroutine 13 (running) created at:
  github.com/looplab/fsm.TestEventAndCanInGoroutines()
      /Users/gtsiat/repos/fsm/fsm_test.go:1055 +0x348
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.22.1/libexec/src/testing/testing.go:1689 +0x180
  testing.(*T).Run.gowrap1()
      /opt/homebrew/Cellar/go/1.22.1/libexec/src/testing/testing.go:1742 +0x40

Goroutine 18 (running) created at:
  github.com/looplab/fsm.TestEventAndCanInGoroutines()
      /Users/gtsiat/repos/fsm/fsm_test.go:1071 +0x214
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.22.1/libexec/src/testing/testing.go:1689 +0x180
  testing.(*T).Run.gowrap1()
      /opt/homebrew/Cellar/go/1.22.1/libexec/src/testing/testing.go:1742 +0x40
==================

The exact command used to run the test was:

go clean -testcache && go test -v -race -run TestEventAndCanInGoroutines

By replacing the stateMu.RLock and stateMu.RUnlock with stateMu.Lock and stateMu.Unlock in the func (f *FSM) Can(event string) bool method, the issue seems to get fixed, but I would lie if I knew that I understand why. Can someone that has experience with this repo explain why it happens?

Gtsiat avatar Apr 18 '24 17:04 Gtsiat

The problem is that Event() tries to assign the f.transition to transitionFunc() while Can() is using it. Locking the event mutex should fix the issue.

denisvaloha avatar Jun 07 '24 02:06 denisvaloha

Fixed by #107.

maxekman avatar Aug 09 '24 11:08 maxekman