
DaemonManager: prevent GUI from hanging on hanging `monerod` launch

Open hinto-janai opened this issue 7 months ago • 3 comments

Fixes https://github.com/monero-project/monero-gui/issues/4240

Problem 1

monerod hangs on this line when creating an HTTP connection (for local JSON-RPC): https://github.com/monero-project/monero/blob/ac02af92867590ca80b2779a7bbeafa99ff94dcb/src/common/rpc_client.h#L125

The connection uses the generic timeout of 3 minutes and 30 seconds: https://github.com/monero-project/monero/blob/ac02af92867590ca80b2779a7bbeafa99ff94dcb/src/common/http_connection.h#L42-L45

This causes a direct JSON-RPC invocation (./monerod sync_info) to hang for 3m30s before exiting if monerod can't bind.

Problem 2

Monero GUI relies on direct json-rpc invocation to tell if monerod is "running": https://github.com/monero-project/monero-gui/blob/e9cd4588aef3f0808ce153957a504f43dcdbeb26/src/daemon/DaemonManager.cpp#L239-L245

This causes the GUI to hang. The Watcher is not quite in sync with these invocations, so the GUI keeps hanging (in 3m30s chunks) until the two happen to line up.

Change

When launching monerod, wait at most 5 seconds before assuming something has gone wrong and returning false, meaning the launch failed.
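The general idea of bounding a child-process invocation can be sketched outside of Qt. The following is a minimal standalone POSIX example, not the code in this PR: `runWithTimeout` is a hypothetical helper name, and the 20 ms polling step is an arbitrary choice for illustration.

```cpp
#include <chrono>
#include <signal.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <thread>
#include <unistd.h>
#include <vector>

// Hypothetical helper (not part of monero-gui): runs the given command and
// waits at most `timeout` for it to exit. Returns true only if the child
// exited with status 0 in time. On timeout the child is killed and the
// function returns false, i.e. "we failed to launch".
bool runWithTimeout(const std::vector<std::string>& args,
                    std::chrono::milliseconds timeout)
{
    if (args.empty()) return false;

    // Build a NULL-terminated argv array for execvp before forking.
    std::vector<char*> argv;
    for (const auto& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    const pid_t pid = fork();
    if (pid < 0) return false;          // fork failed
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);                     // only reached if exec failed
    }

    const auto deadline = std::chrono::steady_clock::now() + timeout;
    int status = 0;
    for (;;) {
        // Non-blocking check: has the child exited yet?
        const pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) return WIFEXITED(status) && WEXITSTATUS(status) == 0;
        if (r < 0) return false;        // waitpid error
        if (std::chrono::steady_clock::now() >= deadline) break;
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
    }

    // Timed out: stop the child and reap it so no zombie is left behind.
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return false;
}
```

With a helper like this, a call such as `runWithTimeout({"./monerod", "sync_info"}, std::chrono::seconds(5))` would return after roughly 5 seconds at worst, instead of blocking for the full 3m30s connection timeout.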

This shortens the "is monerod running?" polling interval to 5 seconds, which makes the error screen surface around the 120s mark as intended.
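To illustrate why the length of each check matters, here is a small worked model. It is a simplification and not taken from the monero-gui source: it assumes every "is monerod running?" check blocks for its full timeout, and that the 120s deadline is only examined once a check returns. The function name is hypothetical.

```cpp
#include <cstdint>

// Assumed model (not the actual GUI logic): each probe blocks for
// `probeSeconds`, and the start deadline is only checked between probes.
// Returns the elapsed time, in seconds, at which the loop first notices
// that `deadlineSeconds` has passed.
std::uint64_t secondsUntilFailureSurfaces(std::uint64_t probeSeconds,
                                          std::uint64_t deadlineSeconds)
{
    if (probeSeconds == 0) return deadlineSeconds;  // probes cost nothing
    std::uint64_t elapsed = 0;
    while (elapsed < deadlineSeconds) {
        elapsed += probeSeconds;  // one blocking probe
    }
    return elapsed;
}

// secondsUntilFailureSurfaces(210, 120) == 210  (one 3m30s probe overshoots)
// secondsUntilFailureSurfaces(5, 120)   == 120  (24 probes of 5s each)
```

Under these assumptions, a 3m30s probe cannot report the failure before 210 seconds have passed, while 5-second probes let it surface right at the 120-second deadline.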

hinto-janai avatar Nov 14 '23 01:11 hinto-janai

Could you help me reproduce the issue? I first tried binding a monero-wallet-rpc client to the same port that the monerod launched by monero-gui would attempt to use for its --zmq-port, but I noticed the 120 second timeout.

plowsof avatar Nov 17 '23 21:11 plowsof

  1. Block port 18081/18083
  2. Start monerod from Monero GUI with no special arguments
  3. See no error after 120 second timeout

https://github.com/monero-project/monero-gui/assets/101352116/fe073827-4bb2-4f62-9704-0a26b6258546

hinto-janai avatar Nov 19 '23 14:11 hinto-janai

$ nc -lp 18081 > /dev/null

I still see the "failed to start" error after 120 seconds, but I'm half certain I experienced a loop longer than 120 seconds. Strange (would this be me encountering the 3 min 30 second timeout?)

Screencast from 15-01-24 08:16:14.webm

plowsof avatar Jan 15 '24 08:01 plowsof