Tom Tucker
This was all found on Raspberry Pi, correct? I would like to ensure that this is an issue on Stria, etc... before accepting this.
@narategithub I get it but this will disable --gdb for all platforms except x86_64. So I'd just like to understand what this does and why not just let it fail...
I believe this is an issue with the MLX5/OFA install on STRIA. I don't think it has anything to do with LDMS. On Wed, Sep 14, 2022 at 1:52 PM...
@eric-roman this looks like an error-path bug. I expect that we're leaking these fds when we attempt to reconnect to a node that is down. So basically 1 fd every...
Hi @eric-roman. Could you please pull master and see if the fix works for you? Synchronous connect errors on both the sock and ugni transports were leaking fds. If that's...
The fix is now on the OVIS-4 branch instead of master.
Ok, I see from your log of the error, it's actually not the synchronous fail path. Don't bother testing, it won't fix it. Stay tuned.
Could you ls -l /proc//fd and send me the output? I'm not able to reproduce this issue with either a host that does not respond (i.e. a bad IP address),...
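For anyone collecting the same information programmatically, here is a minimal sketch, assuming a Linux host with procfs and assuming the listing requested above refers to the per-process directory `/proc/<pid>/fd`. The helper name `list_open_fds` is hypothetical and not part of LDMS; the example inspects its own process rather than a real daemon.

```python
import os

def list_open_fds(pid):
    """Return {fd: target} for a process by reading /proc/<pid>/fd (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    fds = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to the file, socket, or pipe behind the fd.
            fds[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return fds

if __name__ == "__main__":
    # Assumption: inspect this process; substitute the PID under investigation.
    for fd, target in sorted(list_open_fds(os.getpid()).items()):
        print(fd, "->", target)
```

Running this periodically and comparing the number of entries would show whether the descriptor count grows over time, which is what a leak on a reconnect path would look like.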
Hi @eric-roman, have you had a chance to try the patch I sent? Also, when might we be able to set up a Zoom call to debug this live?
Hi @eric-roman, has this been resolved? If so, could you close this issue?