nym
nym copied to clipboard
Handling of error case during outbound connection step in `nym_network_requester::core` crashes process
On Nym version nym-binaries-v2023.5-rolo
, the top-level process of a gateway with an embedded network requester crashes if the outwards TCP connection upon request by a Nym client results in an error:
https://github.com/nymtech/nym/blob/00600ddeeba4ed453c27bd7ea3b144d3d8560141/service-providers/network-requester/src/core.rs#L460:L477
In my case this happens when the destination domain (e.g., analytics.faw.sky.com
) doesn't resolve to an IP address (no A
record can be found). However, instead of signalling the requesting Nym client via the mixnet and gracefully returning from the function, the entire gateway process is crashed. I believe this is due to a missing shutdown.mark_as_success();
, but I might be wrong, of course.
Relevant log lines of the gateway with embedded network requester:
ERROR nym_network_requester::core > error while connecting to analytics.faw.sky.com:443: failed to lookup address information: No address associated with hostname
DEBUG TaskClient-NetworkRequester-child > the task client is getting dropped
TRACE TaskClient-NetworkRequester-child > Notifying stop on unexpected drop
As a test, I added the following lines after line 474
in above referenced code part:
if err.to_string() == "failed to lookup address information: No address associated with hostname" {
shutdown.mark_as_success();
}
which is of course overly specific to my error case, but does seem to at least prevent the gateway process from crashing. (shutdown
needs to become mut
in the function parameters).
I got this inspiration from another place in the core.rs
source file, handling a similar situation:
https://github.com/nymtech/nym/blob/00600ddeeba4ed453c27bd7ea3b144d3d8560141/service-providers/network-requester/src/core.rs#L552:L568
In particular, line 566
:
https://github.com/nymtech/nym/blob/00600ddeeba4ed453c27bd7ea3b144d3d8560141/service-providers/network-requester/src/core.rs#L566
After adding this line, the relevant log parts of the gateway with embedded network requester change to:
ERROR nym_network_requester::core > error while connecting to analytics.faw.sky.com:443: failed to lookup address information: No address associated with hostname
TRACE TaskClient-NetworkRequester-child > the task client is getting dropped (but instructed to not signal)
Could it be the case that the effort to gracefully shut down or drop tasks hadn't reached this part of the code just yet and marking shutdown
as success is all that is needed here to fix it? Or are more substantial changes needed here?