HTTP request fails instead of being retried when a pooled connection has been reset
Hi,
Over time I keep observing errors like the following in the routinator logs, which prevent an RRDP repo from being updated in that run:
RRDP https://rpki-rrdp.us-east-2.amazonaws.com/rrdp/2410520f-a6ff-46b5-91e9-abd919dd2d6e/notification.xml: error sending request for url (https://rpki-rrdp.us-east-2.amazonaws.com/rrdp/2410520f-a6ff-46b5-91e9-abd919dd2d6e/notification.xml): connection error: connection reset
RRDP https://magellan.ipxo.com/rrdp/notification.xml: error sending request for url (https://magellan.ipxo.com/rrdp/notification.xml): connection error: connection reset
RRDP https://rpki.luys.cloud/rrdp/notification.xml: error sending request for url (https://rpki.luys.cloud/rrdp/notification.xml): error trying to connect: Connection reset by peer (os error 104)
RRDP https://rrdp.apnic.net/notification.xml: error sending request for url (https://rrdp.apnic.net/notification.xml): connection error: connection reset
RRDP https://rrdp.ripe.net/notification.xml: error sending request for url (https://rrdp.ripe.net/notification.xml): connection error: connection reset
These are accompanied by routinator_rrdp_status for the given URI having -1 as the response code. The value then stays at -1 until the next request is performed.
I think it would be an improvement if routinator handled this differently. As far as I can tell, there is a pool of HTTP connections. Reusing connections is good, but I would not expect that to result in connection resets being reported as failures. In any case, I would not expect a repo to be skipped for a whole run just because an old connection was reused; the request could be retried instead.
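To illustrate the behaviour I have in mind, here is a minimal sketch using only the Rust standard library. It is not routinator's or reqwest's actual code: the names `is_stale_connection` and `retry_on_reset` are hypothetical, and I am assuming the HTTP error can be mapped to a `std::io::Error` kind, which a real client would have to do from its own error type.

```rust
use std::io::{self, ErrorKind};

/// Returns true for error kinds that typically mean a reused connection
/// was already closed by the peer (assumption: these are the relevant kinds).
fn is_stale_connection(err: &io::Error) -> bool {
    matches!(
        err.kind(),
        ErrorKind::ConnectionReset | ErrorKind::ConnectionAborted | ErrorKind::BrokenPipe
    )
}

/// Calls `attempt` at least once and at most `max_attempts` times.
/// Only stale-connection errors trigger a retry; any other error,
/// or the last stale-connection error, is returned to the caller.
/// The closure receives the 1-based attempt number.
fn retry_on_reset<T, F>(max_attempts: u32, mut attempt: F) -> io::Result<T>
where
    F: FnMut(u32) -> io::Result<T>,
{
    let mut tries = 0;
    loop {
        tries += 1;
        match attempt(tries) {
            // Retry only while attempts remain and the error looks like a stale connection.
            Err(err) if tries < max_attempts && is_stale_connection(&err) => continue,
            // Success, a different error, or attempts exhausted: hand the result back.
            other => return other,
        }
    }
}
```

With something like `retry_on_reset(2, |_| fetch_notification())`, where `fetch_notification` stands in for whatever performs the request, a single reset on a reused connection would cost one extra request instead of skipping the repo for the run. A retry like this is only safe because fetching notification.xml is an idempotent GET.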
This may be caused by keepalive not being enabled; see https://github.com/seanmonstar/reqwest/issues/1018. If I run the commands shown there, I indeed do not see the keepalive in the output:
$ docker run -it --network=container:routinator_unstable_1 fedora:36 /bin/bash
# dnf install iproute
# ss -iepn
...
# ss -iepn | grep 443
...
# or nsenter instead
sudo docker inspect rpki-client-web-tals_routinator_unstable_1 -f '{{.State.Pid}}'
# -n[etwork], -a[ll]
sudo nsenter -n -t [pid] /bin/bash
ss -iepn | grep 443