UUIDNIE

Results: 10 comments by UUIDNIE

We seem to be experiencing a similar issue. We have dead-server-last-contact-threshold set to 3 minutes. There have been a few times now when we restart a node and all the...

Sorry, by dead vault nodes I meant a node that should have been reaped automatically by the autopilot service. I should probably have said nodes that I expected to...

That was helpful, thank you for the response! Since the node that was unable to join the cluster was still showing up in list-peers and autopilot state hours after we...

![image](https://user-images.githubusercontent.com/6767498/195378306-54061461-aa74-4916-9c70-c607a33d475c.png) ![image](https://user-images.githubusercontent.com/6767498/195378443-657507c2-4f84-43c7-a21c-c9b3fdf13c88.png) I took these screenshots while we were experiencing the issue. I had to censor a few things (full node ID and hostname/port)....

Systemd was restarting the vault process on node 02 in a loop. Although the vault process was not able to unseal, I'm thinking it might have reached a state that...

Looking at the definition of last-contact-threshold according to the docs: Limit on the amount of time a server can go without leader contact before being considered unhealthy. Node 02 was...
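To make the quoted definition concrete, here is a minimal sketch of the comparison it describes. It is not Vault's autopilot implementation: the function name `is_unhealthy`, the strict `>` comparison, and the 10-second threshold are all assumptions chosen for illustration.

```python
from datetime import timedelta


def is_unhealthy(time_since_leader_contact: timedelta,
                 last_contact_threshold: timedelta) -> bool:
    """Return True when a server has gone longer without leader contact
    than the threshold allows (hypothetical helper, strict comparison assumed)."""
    return time_since_leader_contact > last_contact_threshold


# Illustrative threshold only; use whatever your cluster is configured with.
threshold = timedelta(seconds=10)

print(is_unhealthy(timedelta(seconds=4), threshold))   # recent contact
print(is_unhealthy(timedelta(minutes=5), threshold))   # long silence
```

Under this reading, a node that never re-establishes contact with the leader stays over the threshold for as long as it remains silent.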

During the time node 02 was experiencing issues it was never unsealed to my knowledge. We have thousands of messages in our central logging system that should confirm that unless...

Looking at init.go, if we had unsealed, should it have logged "unsealed with stored key" at the info log level? Tyvm. * `c.Logger().Info("unsealed with stored key")` https://github.com/hashicorp/vault/blob/main/vault/init.go#L487

Can confirm, saw the same issue using Grafana 9.0.3 with Thanos 0.27.0 + Prometheus 2.32.1.

Today we noticed this happening on several production hosts: TCP reset packets leaking due to invalid state. A few of the hosts experiencing the issue on a regular basis...