Yagna consumes 100% CPU when stuck in an endless session-closing loop
On my provider I noticed the following behavior: the provider was working on a task and finished it, but afterwards yagna kept consuming a lot of CPU even though no vmrt process was running.
Checking the logs revealed a repeated line related to session closure:
[2024-01-02T23:43:55.082+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:43:57.486+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:43:59.890+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:00.343+0100 ERROR ya_market::protocol::discovery] Error broadcasting offers: GSB failure: Broadcast failed: Unable to query neighbors: Request failed: Request timed out after 3000 ms
[2024-01-02T23:44:02.300+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:04.709+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:05.990+0100 ERROR ya_market::protocol::discovery] Error broadcasting offers: GSB failure: Broadcast failed: Unable to query neighbors: Request failed: Request timed out after 3000 ms
[2024-01-02T23:44:07.112+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:09.519+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:11.924+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:14.328+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:16.734+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:19.144+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:21.549+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:22.205+0100 ERROR ya_market::protocol::discovery] Error broadcasting offers: GSB failure: Broadcast failed: Unable to query neighbors: Request failed: Request timed out after 3000 ms
[2024-01-02T23:44:23.955+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:26.358+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:28.762+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
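To confirm that the same session is being closed over and over, the log can be summarized per session. The following is a minimal Python sketch, assuming the line format shown in the excerpt above; the function name `summarize_closures` and the `SAMPLE` list are made up for illustration and are not part of yagna.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the "Closing session ... not responding to ping." lines from the excerpt.
LINE_RE = re.compile(
    r"^\[(?P<ts>\S+) INFO ya_relay_client::session::expire\] "
    r"Closing session (?P<sid>[0-9a-f]+) \((?P<addr>[^)]+)\) "
    r"not responding to ping\.$"
)


def summarize_closures(lines):
    """Return {(session_id, addr): (count, mean_gap_seconds or None)}."""
    stamps = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S.%f%z")
            stamps[(m["sid"], m["addr"])].append(ts)

    summary = {}
    for key, times in stamps.items():
        # Gaps between consecutive closure messages for the same session.
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        mean_gap = sum(gaps) / len(gaps) if gaps else None
        summary[key] = (len(times), mean_gap)
    return summary


# First three lines of the excerpt, used as sample input.
SAMPLE = [
    "[2024-01-02T23:43:55.082+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.",
    "[2024-01-02T23:43:57.486+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.",
    "[2024-01-02T23:43:59.890+0100 INFO ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.",
]

if __name__ == "__main__":
    for (sid, addr), (count, gap) in summarize_closures(SAMPLE).items():
        print(sid, addr, count, gap)
```

A session that is closed normally should show up once. In the excerpt, the same session ID and address repeat roughly every 2.4 seconds.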
The list of sessions looks like this:
The IP address matches that of the relay.
Expected behavior
If yagna decides to close a session because its liveness checks are failing, it should be able to do so without any noticeable CPU overhead.
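One way to read this expectation is that closing an unresponsive session should happen once: after the close, the session is forgotten, so later liveness checks have nothing left to close. The Python sketch below only illustrates that idea. It is not yagna's or ya-relay-client's implementation, it makes no claim about the actual cause of the loop, and every name in it (`SessionTable`, `touch`, `expire`) is hypothetical.

```python
class SessionTable:
    """Toy session registry with a ping-based liveness check."""

    def __init__(self, ping_timeout):
        self.ping_timeout = ping_timeout  # seconds without a ping reply
        self._last_seen = {}              # session_id -> time of last reply
        self.closed = []                  # record of closed sessions

    def touch(self, session_id, now):
        """Record a ping reply from the session at time `now`."""
        self._last_seen[session_id] = now

    def expire(self, now):
        """Close every session that has not replied within the timeout."""
        expired = [
            sid for sid, seen in self._last_seen.items()
            if now - seen > self.ping_timeout
        ]
        for sid in expired:
            # Forget the session so the next check cannot close it again.
            del self._last_seen[sid]
            self.closed.append(sid)
        return expired


if __name__ == "__main__":
    table = SessionTable(ping_timeout=5.0)
    table.touch("relay", now=0.0)
    print(table.expire(now=10.0))  # first check closes the session
    print(table.expire(now=20.0))  # later checks find nothing to close
```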
Actual behavior
As described, yagna keeps consuming CPU on the provider, which in turn degrades the provider's performance.