Yagna consumes 100% CPU when in an endless closing session loop

Open grisha87 opened this issue 1 year ago • 0 comments

On my provider I noticed the following behavior: the provider worked on a task and finished it, but yagna kept consuming a lot of CPU afterwards, even though no vmrt process was running.

Checking the logs revealed a repetitive line related to session closure:

[2024-01-02T23:43:55.082+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:43:57.486+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:43:59.890+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:00.343+0100 ERROR ya_market::protocol::discovery] Error broadcasting offers: GSB failure: Broadcast failed: Unable to query neighbors: Request failed: Request timed out after 3000 ms
[2024-01-02T23:44:02.300+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:04.709+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:05.990+0100 ERROR ya_market::protocol::discovery] Error broadcasting offers: GSB failure: Broadcast failed: Unable to query neighbors: Request failed: Request timed out after 3000 ms
[2024-01-02T23:44:07.112+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:09.519+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:11.924+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:14.328+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:16.734+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:19.144+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:21.549+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:22.205+0100 ERROR ya_market::protocol::discovery] Error broadcasting offers: GSB failure: Broadcast failed: Unable to query neighbors: Request failed: Request timed out after 3000 ms
[2024-01-02T23:44:23.955+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:26.358+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
[2024-01-02T23:44:28.762+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.
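
To quantify the loop, the lines above can be grouped by session ID and the gaps between repeats measured. This is a minimal sketch that assumes the log format shown above. The names `parse_closures`, `repeated_closures` and `gaps_seconds` are illustrative helpers, not part of yagna.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the "Closing session" lines in the format shown in the log above.
CLOSE_RE = re.compile(
    r"^\[(?P<ts>\S+)\s+INFO\s+ya_relay_client::session::expire\] "
    r"Closing session (?P<sid>[0-9a-f]+) \((?P<addr>[^)]+)\)"
)


def parse_closures(lines):
    """Return (timestamp, session_id, address) for each session-closing line."""
    result = []
    for line in lines:
        m = CLOSE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f%z")
            result.append((ts, m.group("sid"), m.group("addr")))
    return result


def repeated_closures(lines):
    """Return {session_id: count} for sessions reported as closing more than once."""
    counts = Counter(sid for _, sid, _ in parse_closures(lines))
    return {sid: n for sid, n in counts.items() if n > 1}


def gaps_seconds(lines, session_id):
    """Return the time gaps (in seconds) between consecutive closures of one session."""
    stamps = [ts for ts, sid, _ in parse_closures(lines) if sid == session_id]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


# First three closing lines and one unrelated line from the excerpt above.
SAMPLE = [
    "[2024-01-02T23:43:55.082+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.",
    "[2024-01-02T23:43:57.486+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.",
    "[2024-01-02T23:43:59.890+0100 INFO  ya_relay_client::session::expire] Closing session d59cac629ce5657ee1cee4fae45e4708 (52.48.158.112:7477) not responding to ping.",
    "[2024-01-02T23:44:00.343+0100 ERROR ya_market::protocol::discovery] Error broadcasting offers: GSB failure: Broadcast failed: Unable to query neighbors: Request failed: Request timed out after 3000 ms",
]

print(repeated_closures(SAMPLE))
print(gaps_seconds(SAMPLE, "d59cac629ce5657ee1cee4fae45e4708"))
```

On the sample lines, the same session is reported as closing three times, roughly 2.4 seconds apart. Running the helpers over a full log file (for example `repeated_closures(open(path))`, with `path` pointing at the yagna log) gives the total per session.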

The list of sessions looks like this:

[image: screenshot of the session list]

The IP address matches that of the relay.

Expected behavior

If yagna is about to close a session because its liveness checks are not passing, it should be able to do so without any CPU overhead.
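
The expected property can be stated as "a session that fails its liveness check is closed once, and later expiry passes no longer see it". Below is a purely illustrative sketch of that property. `SessionTable`, its methods and the shortened session ID are hypothetical and do not reflect how ya-relay-client is implemented; the cause of the loop has not been identified in this report.

```python
class SessionTable:
    """Hypothetical stand-in for a relay client's session bookkeeping."""

    def __init__(self, ping_timeout):
        self.ping_timeout = ping_timeout  # seconds allowed without a ping reply
        self.last_seen = {}               # session id -> time of last ping reply
        self.closed_log = []              # ids closed so far, in order

    def on_ping_reply(self, session_id, now):
        """Record that the session answered a ping at time `now`."""
        self.last_seen[session_id] = now

    def expire(self, now):
        """Close sessions whose last reply is older than the timeout."""
        stale = [sid for sid, seen in self.last_seen.items()
                 if now - seen > self.ping_timeout]
        for sid in stale:
            # Dropping the entry makes closing idempotent: the next expiry
            # pass has nothing left to close for this session.
            del self.last_seen[sid]
            self.closed_log.append(sid)
        return stale


table = SessionTable(ping_timeout=5.0)
table.on_ping_reply("d59cac62", now=0.0)  # shortened id, for illustration only

first = table.expire(now=10.0)   # session is stale and gets closed
second = table.expire(now=12.4)  # a later pass finds nothing to close
print(first, second)
```

In this sketch the second expiry pass returns an empty list, whereas the log above shows the same session being closed again every few seconds.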

Actual behavior

As described, yagna keeps consuming CPU on the provider after the task has finished, which in turn will impact the provider's performance.

grisha87, Jan 02 '24 22:01