Providers stop seeing new deployments after running for a while

Open tidrolpolelsef opened this issue 2 years ago • 0 comments

Early on after launching mainnet, we seemed to have a problem where providers would stop seeing deployments and thus not bid. This more or less went away.

On the current edgenet I'm seeing this problem again. At first the provider bids just fine. After it has been running for a while, nothing happens when I create a new deployment. The deployment is on chain, but the provider doesn't even attempt to bid.

The RPC nodes are running in one datacenter and the provider is running in another. The primary difference between this edgenet and mainnet is that mainnet has many more transactions per block, which results in more events being sent.

While this may be just one problem of many, what I'm seeing here is that as long as messages continue to come in at a regular rate, the event stream to the provider keeps working. If there is a long idle period, the stream breaks, and it breaks silently: no error is reported and the connection is not closed. This leads me to conclude one or both of the following:

  1. We need to enable TCP keepalive on that connection, or send WebSocket PING/PONG messages if we aren't already
  2. There is a bug in tendermint/cosmos that causes it to stop sending messages to an event stream entirely after an idle period, without closing the connection
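
To make the two conclusions above concrete, here is a minimal sketch in Python (standard library only) of what a client-side mitigation could look like. It is not the provider's actual code and does not use any tendermint API: `enable_tcp_keepalive` and `watch_events` are hypothetical names, and an `asyncio.Queue` stands in for the event subscription. The first helper turns on kernel-level keepalive probes for a socket; the second treats a long silence as a failure instead of waiting forever, which would also cover the case where the server stops sending without closing the connection.

```python
import asyncio
import socket


def enable_tcp_keepalive(sock: socket.socket) -> None:
    """Ask the kernel to send keepalive probes on an otherwise idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


async def watch_events(queue: asyncio.Queue, idle_timeout: float, handle) -> int:
    """Pass events to `handle` until the stream is silent for `idle_timeout` seconds.

    `queue` is a stand-in (assumption) for the real event subscription.
    Returns the number of events delivered before the stall was detected,
    so the caller can decide to tear down the connection and resubscribe.
    """
    delivered = 0
    while True:
        try:
            # Bound every wait: silence longer than idle_timeout is treated
            # as a broken stream rather than "no news".
            event = await asyncio.wait_for(queue.get(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            return delivered
        handle(event)
        delivered += 1
```

On a quiet network a watchdog like this would fire during legitimately idle periods too, so the timeout would need to be paired with something that produces regular traffic (for example WebSocket PING/PONG, or subscribing to new-block events), and the caller would reconnect and resubscribe whenever `watch_events` returns.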

tidrolpolelsef avatar Jul 12 '22 17:07 tidrolpolelsef