Provider stops seeing new deployments after running for a while
Early on after launching mainnet we seemed to have a problem where the providers would stop seeing deployments and thus not bid. This more or less went away.
On the current edgenet I'm seeing this problem again. At first the provider bids just fine. After it has been running for a while, nothing happens when I make a new deployment. The deployment is on chain, but the provider doesn't even attempt to bid.
The RPC nodes are running in one datacenter and the provider is running in another. The primary difference between this edgenet and mainnet is that mainnet obviously has many more transactions per block, which results in more events being sent.
While this may be just one problem of many, what I'm seeing here is that as long as messages continue to come in at a regular rate, the event stream to the provider keeps working. If there is a long idle period, the stream breaks, and it breaks silently. This leads me to conclude:
- We need to enable TCP keepalive on that connection, or send WebSocket PING/PONG messages if we aren't already
- There is a bug in tendermint/cosmos that causes it to stop sending messages to an event stream entirely after an idle period, without closing the connection