aleo-setup
Contributor fails heartbeat when attempting to join the queue
When using the Dockerized version of the contributor, the contributor will sometimes be unable to join the queue. Meanwhile, the coordinator prints the following error:
Aug 15 16:33:24.735 ERROR aleo_setup_coordinator::api::contributor::heartbeat: Error performing heartbeat for contributor: ParticipantNotFound(aleo1fhy50w7gcqzqf8jdmptlg9mu5ctkcpqtk0ywxvchn9ey7rrg4qqqm57lm9.contributor)
...despite the fact that the contributor is clearly online. If the contributor is not online, or can't connect to the coordinator, the contributor should print an error message. If the contributor is online, the coordinator should accept them into the queue.
@ibaryshnikov I think you and @kellpossible worked on the heartbeat logic back in June, mind taking a look?
Looking back through the logs, the contributor does receive a 401 "Unauthorized User" error, so maybe it has something to do with the authentication logic.
So I think what is happening is that the heartbeat is only sent every 30 seconds or so, and we drop the participant after 60 seconds. If the first heartbeat fails at 30 seconds and the second does not arrive at exactly 60 seconds, we drop the participant, and that is what starts the problem.
I think we should either increase the heartbeat frequency to like every 10 seconds or increase the drop timeout to over 60, perhaps 120 or 180.
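To make the timing argument concrete, here is a rough sketch of how many consecutive failed heartbeats each setting would survive. This is not the coordinator's code: `tolerated_misses` is a hypothetical helper, the `jitter` parameter (network or scheduling delay on top of the interval) is an assumption, and so is the rule that a participant is dropped once the time since the last successful heartbeat reaches the timeout.

```rust
use std::time::Duration;

/// Hypothetical helper, not part of aleo-setup.
///
/// Returns how many consecutive failed heartbeats a participant survives,
/// or `None` if it would be dropped even when no heartbeat fails.
///
/// Assumption: the coordinator drops a participant as soon as the time
/// since its last successful heartbeat is no longer below `timeout`.
fn tolerated_misses(interval: Duration, timeout: Duration, jitter: Duration) -> Option<u32> {
    // A zero interval would never advance the arrival time below.
    if interval.is_zero() {
        return None;
    }

    let mut tolerated = None;
    let mut misses: u32 = 0;

    // After `misses` consecutive failures, the next successful heartbeat
    // lands (misses + 1) * interval + jitter after the last successful one.
    while interval * (misses + 1) + jitter < timeout {
        tolerated = Some(misses);
        misses += 1;
    }

    tolerated
}
```

Under these assumptions and one second of jitter, a 30 second interval with a 60 second timeout gives `Some(0)`, so a single failed heartbeat is enough to get dropped. A 10 second interval with the same timeout gives `Some(4)`, while keeping the 30 second interval and raising the timeout to 120 or 180 seconds gives `Some(2)` or `Some(4)`.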
Aug 15 16:50:56.539 ERROR aleo_setup_coordinator::api::queue::contributor: Error inserting confirmation key for contributor with address aleo1sqm5us597r4qyrzhkpxdd2g68ftngysle8jpw7pwjkn6tt79jy8qa2s74t: UNIQUE constraint failed: participation.confirmation_key
We usually see this line in the log after the timeout :point_up:
@zosorock is this still relevant?