Recurring Error messages `Unable to process flush` / `Unable to write WS message` still exist

Open dankostiuk opened this issue 2 years ago • 4 comments

Description

The issue I originally opened in https://github.com/0xPolygon/polygon-edge/issues/486 still exists.

We are seeing the following errors when a client connected by WS closes prematurely:

[ERROR] polygon.filter: Unable to process flush, writev tcp 127.0.0.1:5000->127.0.0.1:52646: use of closed network connection
[ERROR] polygon.jsonrpc: Unable to write WS message, writev tcp 127.0.0.1:5000->127.0.0.1:52646: use of closed network connection

These errors only go away on node restart.


The fix for my original issue seems to address only the `websocket: close sent` messages, as mentioned in the PR below: https://github.com/0xPolygon/polygon-edge/pull/487

Your environment

  • OS and version Ubuntu 20
  • version of the Polygon Edge 0.4.1
  • branch that causes this issue develop

Steps to reproduce

  1. Connect a client directly to a node and subscribe over a WS connection (e.g. Blockscout explorer)
  2. Terminate the client process
  3. Observe recurring `Unable to process flush` errors on the host node

Expected behaviour

I would only expect an initial error indicating an issue with the WS connection, rather than seeing the same error persist requiring a node restart.

Actual behaviour

The same error persists requiring a node restart.

dankostiuk avatar Jul 13 '22 14:07 dankostiuk

Hey @dankostiuk, thank you for opening this issue. This is a known problem which luckily has already been resolved. Here is the PR with fix (https://github.com/0xPolygon/polygon-edge/pull/570), it's only waiting on reviews :)

0xAleksaOpacic avatar Jul 13 '22 18:07 0xAleksaOpacic

Hey @Aleksao998 ,

We are still seeing loads of these types of errors on nodes targeted by our testnet Blockscout instance:

[ERROR] polygon.jsonrpc: Unable to write WS message, writev tcp 172.31.4.12:10002->54.234.132.166:56094: use of closed network connection"}

and

[ERROR] polygon.filter: Unable to process flush, writev tcp 172.31.4.12:10002->54.234.132.166:38924: use of closed network connection"}

These errors continue indefinitely until we restart the edge service. Is there anything we can provide to help re-investigate? These errors do not break anything; they just add noise to our on-call alerts, and I feel it's better to have the issue resolved than to filter them out of our alerts.

Thanks!

dankostiuk avatar Aug 22 '22 19:08 dankostiuk

Hey @dankostiuk, thank you for raising the issue. I agree we should fix this. We probably have some other place where we do not close the connection. I will investigate this as soon as possible.

0xAleksaOpacic avatar Aug 22 '22 20:08 0xAleksaOpacic

Appreciate that @Aleksao998 , do you mind re-opening this issue just so it doesn't get lost? Otherwise I could re-create and link to this one - let me know.

dankostiuk avatar Aug 23 '22 17:08 dankostiuk

Found the same error on our local RPC node.

hhq365 avatar Sep 07 '22 08:09 hhq365

Hi guys, any luck with this? The spam is quite relentless, and we have to add strange error-filtering logic to ignore these while still catching other errors with our monitoring services.

mrwillis avatar Sep 20 '22 14:09 mrwillis

Hello everyone. Just a short update: this is on our to-do list, and we will try to provide you with an update by the beginning of next week.

ivanbozic21 avatar Sep 21 '22 07:09 ivanbozic21

I've created a new PR. I confirmed that no more of these logs appear after a WebSocket connection is suddenly terminated.

Kourin1996 avatar Sep 26 '22 13:09 Kourin1996

Hi @dankostiuk, we wish to confirm that this issue is resolved, and we will now close this request.

Thank you!

ivanbozic21 avatar Oct 03 '22 12:10 ivanbozic21