go-textile
go-textile copied to clipboard
Flaky dials
Describe the bug Dialing a remote peer often fails at relatively steady intervals. The remote peer stops accepting connections for ~1-2 min, after which everything starts working again. At least I can confirm that the remote is actually not responding.
During this downtime,
curl -v telnet://104.210.43.77:4001
will hang. Once it starts responding again (with /multistream/1.0.0
), our service layer calls succeed.
To Reproduce Run a desktop peer in debug mode and register a remote cafe. After awhile, you should notice that dials start to fail:
16:31:04.149 DEBUG tex-servic: sending CAFE_CHECK_MESSAGES to 12D3KooWSsM117bNw6yu1auMfNqeu59578Bct5V4S9fWxavogrsw main.go:102
16:31:04.149 INFO tex-servic: error writing message, trying again: stream reset sender.go:165
16:31:21.089 DEBUG tex-servic: received pubsub CAFE_PUBSUB_QUERY from 12D3KooWRWXWWWuUKP6u8Fdq9FvyHKz6ksWys5cPoGVwhMMGUp1f main.go:502
16:31:34.151 ERROR tex-core: error checking messages: context deadline exceeded main.go:713
16:31:34.151 DEBUG tex-core: flushing downloads block_downloads.go:53
16:31:34.151 DEBUG tex-core: handling 0 downloads block_downloads.go:61
16:31:34.582 DEBUG tex-servic: received pubsub CAFE_PUBSUB_QUERY from 12D3KooWGN8VAsPHsHeJtoTbbzsGjs2LTmQZ6wFKvuPich1TYmYY main.go:502
16:32:04.150 DEBUG tex-core: flushing cafe outbox cafe_outbox.go:172
16:32:04.151 DEBUG tex-core: handling 0 cafe requests cafe_service.go:1454
16:32:04.152 DEBUG tex-servic: sending CAFE_CHECK_MESSAGES to 12D3KooWSsM117bNw6yu1auMfNqeu59578Bct5V4S9fWxavogrsw main.go:102
16:32:09.156 ERROR tex-core: error checking messages: failed to dial : all dials failed
* [/ip4/127.0.0.1/tcp/8081/ws] dial tcp 127.0.0.1:8081: connect: connection refused
* [/ip4/127.0.0.1/tcp/4001] dial tcp4 127.0.0.1:4001: connect: connection refused
* [/ip4/104.210.43.77/tcp/4001] dial tcp4 0.0.0.0:9113->104.210.43.77:4001: i/o timeout main.go:713
16:32:09.156 DEBUG tex-core: flushing downloads block_downloads.go:53
16:32:09.156 DEBUG tex-core: handling 0 downloads block_downloads.go:61
16:32:21.117 DEBUG tex-servic: received pubsub CAFE_PUBSUB_QUERY from 12D3KooWRWXWWWuUKP6u8Fdq9FvyHKz6ksWys5cPoGVwhMMGUp1f main.go:502
16:32:32.606 DEBUG tex-servic: received pubsub CAFE_PUBSUB_QUERY from 12D3KooWLnUv9MWuRM6uHirRPBM4NwRj54n4gNNnBtiFiwPiv3Up main.go:502
16:32:48.116 DEBUG tex-servic: received pubsub CAFE_PUBSUB_QUERY from 12D3KooWNjRYvJza7dFx4YTf9kUYBUcbefBDexhBJ3KYHGrC1dtG main.go:502
16:33:04.152 DEBUG tex-core: flushing cafe outbox cafe_outbox.go:172
16:33:04.152 DEBUG tex-core: handling 0 cafe requests cafe_service.go:1454
16:33:04.154 DEBUG tex-servic: sending CAFE_CHECK_MESSAGES to 12D3KooWSsM117bNw6yu1auMfNqeu59578Bct5V4S9fWxavogrsw main.go:102
16:33:04.704 DEBUG tex-servic: received CAFE_MESSAGES response from 12D3KooWSsM117bNw6yu1auMfNqeu59578Bct5V4S9fWxavogrsw main.go:130
Expected behavior All dials should succeed.