envoy
envoy copied to clipboard
Populate stream reset errors to transport failure reason
Commit Message: Populate stream reset errors to transport failure reason Additional Description: Risk Level: Low Testing: existing tests Docs Changes: n/a Release Notes: n/a Platform Specific Features: n/a
As a reminder, PRs marked as draft will not be automatically assigned reviewers, or be handled by maintainer-oncall triage.
Please mark your PR as ready when you want it to be reviewed!
/retest
/assign @alyssawilk Hey Alyssa, this PR fails the LargeResponseHeadersRejected test because with a stream error set by QUIC, the stream reset will not be considered a codec error.
It feels like this PR actually breaks Envoy's behavior. I'd like to get some inputs from you on how to proceed.
yeah. I think the problem is in UpstreamRequestFilterManagerCallbacks::resetStream where we actually use the details. The easiest option would be to include codec error in details and change the equality check for a contains check. A better fix would be to add codec error to one of the booleans in stream info (probably response flags), and use that as a more consistent and safe check of what went south. If you'd be game for the latter that'd be fantastic but the former would be OK. Either way I think we runtime guard this one as higher risk /wait
@RenjieTang if you revive this when I'm out please just throw it over to Ryan?
CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).
/assign @RyanTheOptimist
yeah. I think the problem is in UpstreamRequestFilterManagerCallbacks::resetStream where we actually use the details. The easiest option would be to include codec error in details and change the equality check for a contains check. A better fix would be to add codec error to one of the booleans in stream info (probably response flags), and use that as a more consistent and safe check of what went south. If you'd be game for the latter that'd be fantastic but the former would be OK. Either way I think we runtime guard this one as higher risk /wait
I took the first route since it's less invasive
/retest