paasta dar-1636: handle envoy admin timeout

Open BeryJu opened this issue 2 years ago • 4 comments

Jul 13 '22 11:07 BeryJu

Hmmm, not sure if handling it that way is ideal

We should probably show the user that a particular host is not behaving nicely, instead of logging internally and just returning a potentially incomplete backends list

Jul 13 '22 11:07 TheNavigat

> Hmmm, not sure if handling it that way is ideal
>
> We should probably show the user that a particular host is not behaving nicely, instead of logging internally and just returning a potentially incomplete backends list

I don't think knowing that a particular host is bad is really helpful for an end user, so I'm just logging the error and moving on there. The error should be reported to us either by an active healthcheck checking envoy, or by a splunk query that checks for this.

Jul 13 '22 11:07 BeryJu

Wouldn't that impact the information reported by paasta itself though, potentially skipping pods/instances that should be shown?

Jul 13 '22 12:07 TheNavigat

Well, it wouldn't skip anything; it would just not crash if an envoy instance is timing out. I don't think it makes sense to show that error to users, as it's not actionable for them. The only thing they can do is reach out to us, and we should have direct monitoring for this instead.

Jul 13 '22 12:07 BeryJu

Cleaning up and closing some very old PRs. Please re-open or nudge me if you’re still planning to work on this.

Feb 21 '24 13:02 mattmb
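
For readers following the thread, here is a minimal sketch of the trade-off being discussed. It is not the code from this PR: `collect_backends`, `fake_fetch`, and the host names are hypothetical, and the per-host fetch is assumed to raise `TimeoutError` when an envoy admin endpoint is slow. It logs the timeout and keeps going, but also returns the list of hosts that timed out, so a caller could decide whether to surface them or leave that to monitoring.

```python
import logging
from typing import Callable, Dict, List, Tuple

log = logging.getLogger(__name__)

# Hypothetical type: a callable that returns the backends reported by one
# host's envoy admin endpoint, or raises TimeoutError if the host is slow.
FetchBackends = Callable[[str], List[str]]


def collect_backends(
    hosts: List[str], fetch: FetchBackends
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Gather backends per host, tolerating hosts that time out."""
    backends: Dict[str, List[str]] = {}
    timed_out: List[str] = []
    for host in hosts:
        try:
            backends[host] = fetch(host)
        except TimeoutError:
            # Log and move on instead of crashing the whole status call.
            log.warning("envoy admin request to %s timed out; skipping", host)
            timed_out.append(host)
    return backends, timed_out


def fake_fetch(host: str) -> List[str]:
    """Stand-in for a real admin query (assumption: host-b always times out)."""
    if host == "host-b":
        raise TimeoutError(host)
    return [f"{host}:8888"]


backends, timed_out = collect_backends(["host-a", "host-b", "host-c"], fake_fetch)
```

Returning `timed_out` alongside the partial result keeps both options from the discussion open: the caller can ignore it and rely on healthchecks, or print a short note that the backends list may be incomplete.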
