keripy

kli local watch hangs instead of timing out

Open dhh1128 opened this issue 2 years ago • 2 comments

Something is currently wrong with Provenant's witness4. When I attempt to curl it, it doesn't respond. This has created a condition that exposes some undesirable behaviors in kli watch.

When I run kli local watch, the command gets to the place in the first alias's witness list where it's attempting to talk to Provenant's witness4, and hangs. It never times out. The only way to continue is to press CTRL+C, which aborts the rest of the command. This means I can't get any watching behavior for any other aliases, or on the fifth witness for the first alias, because of the hang on the 4th witness of the first alias.

This is distinct from the behavior that happens if I have an alias that is damaged. I have one of those. When the watch command runs for that alias, it contacts each witness in its list, waits for a while, fails to receive any events, and writes a red error message, then continues on to the next witness.

I suggest that the command needs to be altered in two ways:

1. It should be possible to watch a specific alias rather than just all of them.
2. We need to use a timeout to prevent hangs. (I'm not sure why the timeout works for damaged aliases but not for unresponsive witnesses.)
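To illustrate the timeout suggestion, here is a minimal sketch of a per-witness deadline. It is not keripy's actual code: `watch_witnesses` and the `poll` callback are hypothetical names, and the sketch assumes each witness can be checked with a non-blocking call that returns True once a response has arrived.

```python
import time


def watch_witnesses(witnesses, poll, timeout=5.0, interval=0.01):
    """Check each witness in turn, giving up on any one after `timeout` seconds.

    `poll` is a hypothetical non-blocking callback: poll(witness) returns True
    once that witness has responded. A silent witness is marked "timeout" and
    the loop moves on instead of hanging.
    """
    results = {}
    for wit in witnesses:
        deadline = time.monotonic() + timeout
        results[wit] = "timeout"  # assume failure until a response is seen
        while time.monotonic() < deadline:
            if poll(wit):
                results[wit] = "ok"
                break
            time.sleep(interval)  # yield briefly before polling again
    return results
```

With a deadline per witness, an unresponsive fourth witness costs at most `timeout` seconds, and the fifth witness and the remaining aliases are still processed.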

dhh1128, Jan 26 '23 12:01

Update: what was wrong with Provenant's witness was that it was out of disk space; this caused several problems including making the docker daemon unresponsive. We've now fixed that issue, but I wanted to note its cause so we can use it to simulate similar problems when testing the fix for this bug.
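One possible way to reproduce a similar condition in a test, without filling a disk, is a local server that accepts TCP connections but never answers. This is only an assumed stand-in for the real failure; `start_silent_server` and `probe` are hypothetical helpers written for this sketch, using only the Python standard library.

```python
import socket


def start_silent_server():
    """Listen on a free local port but never accept or reply.

    The kernel completes the TCP handshake from the listen backlog, so a
    client connects successfully and then waits forever for data.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    srv.listen(1)
    return srv


def probe(host, port, timeout):
    """Send a request and wait at most `timeout` seconds for any reply."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"GET / HTTP/1.0\r\n\r\n")
            conn.recv(1024)  # blocks until data arrives or the timeout fires
        return "responded"
    except socket.timeout:
        return "timed out"
```

A client without a timeout would block indefinitely on such a server, which is the hang described above; with a timeout, `probe` returns after the configured delay.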

dhh1128, Jan 26 '23 13:01

OK, I see where in the code a witness could hold up the command and we won't time out. I'll add a timeout there as well, which should close this issue.

pfeairheller, Jan 30 '23 21:01