go-libp2p icon indicating copy to clipboard operation
go-libp2p copied to clipboard

identify: flaky TestLargeIdentifyMessage

Open marten-seemann opened this issue 4 years ago • 5 comments

  === RUN   TestLargeIdentifyMessage
      id_test.go:867: test peer1 has peer2 addrs correctly
      id_test.go:882: test peer2 has peer1 addrs correctly
      id_test.go:895: testing addrs just after disconnect
      id_test.go:904: testing addrs after TTL expiration
      id_test.go:906: expected: []
      id_test.go:906: actual: [/ip4/127.0.0.1/tcp/49414 /ip4/127.0.0.1/udp/49945/quic]
      id_test.go:906: dont have the same addresse

marten-seemann avatar Aug 18 '21 15:08 marten-seemann

And it's racy as well:

  === RUN   TestLargeIdentifyMessage
  ==================
  WARNING: DATA RACE
  Read at 0x000002004798 by goroutine 143:
    github.com/libp2p/go-libp2p/p2p/protocol/identify.(*idService).consumeMessage()
        /home/runner/work/go-libp2p/go-libp2p/p2p/protocol/identify/id.go:621 +0x2d6
    github.com/libp2p/go-libp2p/p2p/protocol/identify.(*idService).handleIdentifyResponse()
        /home/runner/work/go-libp2p/go-libp2p/p2p/protocol/identify/id.go:455 +0x8a9
    github.com/libp2p/go-libp2p/p2p/protocol/identify.(*idService).pushHandler()
        /home/runner/work/go-libp2p/go-libp2p/p2p/protocol/identify/id_push.go:16 +0x4d
    github.com/libp2p/go-libp2p/p2p/protocol/identify.(*idService).pushHandler-fm()
        /home/runner/work/go-libp2p/go-libp2p/p2p/protocol/identify/id_push.go:15 +0x27
    github.com/libp2p/go-libp2p-blankhost.(*BlankHost).SetStreamHandler.func1()
        /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/blank.go:191 +0x86
    github.com/libp2p/go-libp2p-blankhost.(*BlankHost).newStreamHandler·dwrap·1()
        /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/blank.go:222 +0x74
  
  Previous write at 0x000002004798 by goroutine 76:
    github.com/libp2p/go-libp2p/p2p/protocol/identify_test.TestLargeIdentifyMessage()
        /home/runner/work/go-libp2p/go-libp2p/p2p/protocol/identify/id_test.go:753 +0x64
    testing.tRunner()
        /opt/hostedtoolcache/go/1.17.8/x64/src/testing/testing.go:1259 +0x22f
    testing.(*T).Run·dwrap·21()
        /opt/hostedtoolcache/go/1.17.8/x64/src/testing/testing.go:1306 +0x47
  
  Goroutine 143 (running) created at:
    github.com/libp2p/go-libp2p-blankhost.(*BlankHost).newStreamHandler()
        /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/blank.go:222 +0x304
    github.com/libp2p/go-libp2p-blankhost.(*BlankHost).newStreamHandler-fm()
        /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/blank.go:212 +0x4d
    github.com/libp2p/go-libp2p-swarm.(*Conn).start.func1.1()
        /home/runner/go/pkg/mod/github.com/libp2p/[email protected]/swarm_conn.go:133 +0x102
  
  Goroutine 76 (running) created at:
    testing.(*T).Run()
        /opt/hostedtoolcache/go/1.17.8/x64/src/testing/testing.go:1306 +0x726
    testing.runTests.func1()
        /opt/hostedtoolcache/go/1.17.8/x64/src/testing/testing.go:1598 +0x99
    testing.tRunner()
        /opt/hostedtoolcache/go/1.17.8/x64/src/testing/testing.go:1259 +0x22f
    testing.runTests()
        /opt/hostedtoolcache/go/1.17.8/x64/src/testing/testing.go:1596 +0x7ca
    testing.(*M).Run()
        /opt/hostedtoolcache/go/1.17.8/x64/src/testing/testing.go:1504 +0x9d1
    main.main()
        _testmain.go:87 +0x22b
  ==================
      id_test.go:804: test peer1 has peer2 addrs correctly
      id_test.go:819: test peer2 has peer1 addrs correctly
      id_test.go:832: testing addrs just after disconnect
      id_test.go:841: testing addrs after TTL expiration
      testing.go:1152: race detected during execution of test

marten-seemann avatar Apr 10 '22 16:04 marten-seemann

Also see some flakiness here: https://app.circleci.com/pipelines/github/libp2p/go-libp2p/34/workflows/aabd41b3-1759-405e-be53-b9da9703c1f2/jobs/111?invite=true#step-112-937

Which happens here: https://github.com/libp2p/go-libp2p/blob/master/p2p/protocol/identify/id_test.go#L832

That bit seems very prone to timing flakiness since the only difference between the two assertions is a sleep of one second

MarcoPolo avatar May 18 '22 16:05 MarcoPolo

Assigning myself.

schomatis avatar Jun 23 '22 23:06 schomatis

This error seems the same as the one reported in https://github.com/libp2p/go-libp2p/pull/1555#issuecomment-1144673639.

@MarcoPolo Could you give me a bit more context here on the waitForDisconnectNotification function? Should it now be replaced with https://github.com/libp2p/go-libp2p/pull/1562 as mentioned in the cited PR? (I'm new here so the more details the better, thanks.)

schomatis avatar Jun 24 '22 14:06 schomatis

Is this just a duplicate of https://github.com/libp2p/go-libp2p/issues/1523? (I might be missing the difference though.)

schomatis avatar Jun 24 '22 15:06 schomatis