strfry

Relay eventually times out all events when using a writePolicy plugin

Open jb55 opened this issue 1 year ago • 3 comments

at first I thought this was some resource exhaustion in an inefficient script I was using, so I rewrote my plugins in Rust, but after a while the relay still eventually just times out all incoming notes. I am looking through the plugin code and don't see anything obvious where it could go wrong 🤔

for now I have to pkill strfry every hour to fix it, but it's definitely not ideal

jb55, Jul 11 '24 18:07

The most likely explanation I can think of is that the plugin isn't responding to a particular event, or its response is malformed somehow. Are there any possible failure conditions in your plugin that would prevent a response from being delivered?

Around the time it locks up, do you see any logs like "Got unparseable line from write policy plugin" or maybe "id mismatch"? I'm going to try to think of any extra logging I can add. Having a timeout at the plugin level is possible too, but it's slightly annoying to code up because I'm using a blocking fgets() call right now.
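To illustrate why a timeout is awkward with a blocking line read, here is a minimal sketch of the general technique in Python: wait for readiness with select() before each read and keep a private buffer. This is not strfry's code (strfry is C++ and uses fgets()); the class name `TimedLineReader` and the 4096-byte chunk size are assumptions made for the example, and it relies on Unix pipe semantics.

```python
import os
import select
import time


class TimedLineReader:
    """Read newline-terminated lines from a file descriptor with a deadline."""

    def __init__(self, fd):
        self.fd = fd
        self.buf = b""  # bytes received but not yet returned as a line

    def readline(self, timeout_s):
        """Return one line without its newline, or None on timeout or EOF."""
        deadline = time.monotonic() + timeout_s
        while b"\n" not in self.buf:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            # Block until data is available or the deadline passes.
            ready, _, _ = select.select([self.fd], [], [], remaining)
            if not ready:
                return None
            chunk = os.read(self.fd, 4096)
            if not chunk:
                return None  # writer closed the pipe
            self.buf += chunk
        line, _, self.buf = self.buf.partition(b"\n")
        return line
```

A caller that gets None back can treat the plugin as hung and restart it, rather than blocking forever on a response that never arrives.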

noteguard looks pretty sweet BTW!

hoytech, Jul 19 '24 03:07

On Thu, Jul 18, 2024 at 08:25:52PM GMT, Doug Hoyte wrote:

> The most likely explanation I can think of is that the plugin isn't responding to a particular event, or its response is malformed somehow. Are there any possible failure conditions in your plugin that would prevent a response from being delivered?

Only if the input doesn't parse, because then it has no id to put into the output message for serialization.

I tried to simplify the loop so we'll see if that helps. Feel free to check the code. I can't see anything wrong with it:

https://github.com/damus-io/noteguard/blob/8e1bb0363f1c7729195a560ba1cd91f5b1c657c8/src/main.rs#L140-L170

As you can see, noteguard.run() is guaranteed to return and write an output message for each valid input, and all of the filters are pretty simple and would never time out.
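For readers unfamiliar with the plugin contract being discussed, here is a minimal Python sketch of a write-policy loop that emits exactly one response line per parseable request and flushes after each one. It is not noteguard's code. The field names (`event`, `id`, `action`, `msg`) follow the strfry plugin documentation as I understand it and should be treated as assumptions; `decide` is a hypothetical placeholder policy.

```python
import json
import sys


def decide(event):
    """Hypothetical policy: accept every event."""
    return "accept", ""


def handle_line(line, policy=decide):
    """Return the JSON response for one request line, or None if it has no usable id."""
    try:
        req = json.loads(line)
        event_id = req["event"]["id"]
    except (ValueError, KeyError, TypeError):
        return None  # nothing to echo back, so no response can be built
    try:
        action, msg = policy(req["event"])
    except Exception as exc:
        # A failing filter still produces a response, so the relay never waits forever.
        action, msg = "reject", f"error: policy failed: {exc}"
    return json.dumps({"id": event_id, "action": action, "msg": msg})


def main():
    for line in sys.stdin:
        out = handle_line(line)
        if out is None:
            print("unparseable request", file=sys.stderr)
            continue
        sys.stdout.write(out + "\n")
        sys.stdout.flush()  # without this, the response may sit in a buffer

# To use this as a plugin, call main() at the bottom of the script.
```

The two places such a loop can stall the relay are the ones discussed in this thread: a request that never gets a response, and a response that is written but never flushed.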

jb55, Jul 19 '24 18:07

I'm not sure TBH. I set up noteguard and it's passing all my basic testing. The buffering is correct and I can't see any obvious error cases that would de-sync the stream.

So, I made a debugging script: https://github.com/hoytech/strfry/blob/master/scripts/plugin-debug.pl

Can you please try this? You need the JSON::XS Perl package (libjson-xs-perl on Debian-like distros, usually packaged elsewhere too). After it's installed, set this as the relay.writePolicy.plugin param in strfry.conf (paths are probably different):

plugin = "cd /home/user/noteguard && /home/user/strfry/scripts/plugin-debug.pl ./target/debug/noteguard"

It detects some obvious problems, like a mismatched ID, undecodable JSON, etc. It also times out if it doesn't get a response, by default after 5 seconds. This will restart the plugin, so hopefully you won't need to pkill every hour anymore!
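The kinds of checks described above can be sketched in a few lines of Python. This is not a port of plugin-debug.pl: the function name `check_response`, the problem strings, and the extra check on `action` are all assumptions made for illustration. The set of valid actions reflects my reading of the strfry plugin documentation.

```python
import json

# Assumed set of actions a write-policy plugin may return.
VALID_ACTIONS = {"accept", "reject", "shadowReject"}


def check_response(expected_id, response_line):
    """Return a description of the problem with a plugin response, or None if it looks fine.

    response_line is None when the plugin did not answer before the timeout.
    """
    if response_line is None:
        return "timeout waiting for plugin response"
    try:
        resp = json.loads(response_line)
    except ValueError:
        return "undecodable JSON from plugin"
    if not isinstance(resp, dict):
        return "response is not a JSON object"
    if resp.get("id") != expected_id:
        return "id mismatch"
    if resp.get("action") not in VALID_ACTIONS:
        return "unknown action"
    return None
```

A wrapper sitting between the relay and the plugin could log any non-None result and restart the plugin on a timeout, which is the behavior described for the debugging script.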

After it's running a while, check the strfry logs for lines starting with PLUGIN-DEBUG. Hopefully this will reveal something!

Thank you!

hoytech, Jul 21 '24 15:07