Fix race causing agents to fail attestation if communication is interrupted

Open evan2645 opened this issue 3 years ago • 3 comments

When an agent comes up for the first time, it generates a key locally, performs attestation, receives a cert for that key, and then persists the cert. If communication is interrupted between the time the server successfully attests the agent and the time the agent persists the new cert, the agent can enter a state in which it can never successfully authenticate. This happens when the node attestor in use is a TOFU (trust on first use) attestor, because the server has already recorded a successful attestation.

This race also existed in agent SVID rotation, but it was fixed there by #1128, which introduced a two-step process for committing the success of an agent SVID rotation (also a do-once operation).

Fix the race, possibly by taking the same approach we took for agent SVID rotation (which will involve a migration).
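
To make the failure mode and the proposed direction concrete, here is a minimal in-memory sketch. It is not SPIRE code: `Server`, `Attest`, `Commit`, and `RetryAfterLostResponse` are hypothetical names invented for illustration, and the "cert" is just a string. It assumes a TOFU attestor rejects any attestation for a node it has already recorded, and that a two-step variant would only enforce that once the agent confirms it has persisted the cert.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAlreadyAttested models a TOFU attestor refusing a second attestation.
var ErrAlreadyAttested = errors.New("node already attested")

type nodeState int

const (
	stateUnknown   nodeState = iota
	statePending             // cert issued, agent has not confirmed persistence
	stateCommitted           // attestation recorded; TOFU is enforced from here on
)

// Server is a hypothetical stand-in for the server's attested-node records.
type Server struct {
	twoStep bool
	nodes   map[string]nodeState
}

func NewServer(twoStep bool) *Server {
	return &Server{twoStep: twoStep, nodes: make(map[string]nodeState)}
}

// Attest issues a cert unless the node's attestation is already committed.
// In one-step mode the attestation is committed immediately (today's race).
func (s *Server) Attest(nodeID string) (string, error) {
	if s.nodes[nodeID] == stateCommitted {
		return "", ErrAlreadyAttested
	}
	if s.twoStep {
		s.nodes[nodeID] = statePending
	} else {
		s.nodes[nodeID] = stateCommitted
	}
	return "cert-for-" + nodeID, nil
}

// Commit is the second step: the agent confirms it has persisted the cert.
func (s *Server) Commit(nodeID string) error {
	if s.nodes[nodeID] != statePending {
		return fmt.Errorf("no pending attestation for %q", nodeID)
	}
	s.nodes[nodeID] = stateCommitted
	return nil
}

// RetryAfterLostResponse simulates the interruption described above: the
// server attests the agent, the response is lost before the agent persists
// anything, and the agent then tries to attest again from scratch.
func RetryAfterLostResponse(s *Server, nodeID string) error {
	if _, err := s.Attest(nodeID); err != nil { // response never reaches the agent
		return err
	}
	if _, err := s.Attest(nodeID); err != nil { // agent retries with no cert on disk
		return err
	}
	if s.twoStep {
		return s.Commit(nodeID) // agent persisted the cert, so it confirms
	}
	return nil
}

func main() {
	fmt.Println("one-step:", RetryAfterLostResponse(NewServer(false), "node-1"))
	fmt.Println("two-step:", RetryAfterLostResponse(NewServer(true), "node-1"))
}
```

In this sketch the one-step server leaves the agent locked out after a lost response, while the two-step server lets the retry through because nothing was committed yet. A real fix would also have to decide who may re-attest while an attestation is pending and how existing records are migrated. The sketch leaves both questions open.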

evan2645 avatar Mar 31 '22 19:03 evan2645

This issue is stale because it has been open for 365 days with no activity.

github-actions[bot] avatar Jun 04 '24 22:06 github-actions[bot]

Still relevant.

azdagron avatar Jun 04 '24 22:06 azdagron

This issue is stale because it has been open for 365 days with no activity.

github-actions[bot] avatar Jun 05 '25 22:06 github-actions[bot]