serving icon indicating copy to clipboard operation
serving copied to clipboard

[WIP] fix: http probe downgrades from http2 to http1 when it encounters an error

Open Cali0707 opened this issue 1 year ago • 7 comments

Fixes #15432

Proposed Changes

  • If there is an error when making the OPTIONS request, try to check readiness for HTTP1

Release Note


Cali0707 avatar Aug 01 '24 14:08 Cali0707

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Cali0707 Once this PR has been reviewed and has the lgtm label, please assign dprotaso for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

knative-prow[bot] avatar Aug 01 '24 14:08 knative-prow[bot]

/cc @dprotaso

I made a start here, but I think I am still missing something...

Specifically, if the error on the options request is because the service isn't ready yet we should try the options request again. However, I wasn't sure the best way to do that and/or figure that out. Maybe it makes sense to track how many times the options request fails before we actually set the HTTP version on the probe config? That way, it would retry the options request a few times

WDYT?

Cali0707 avatar Aug 01 '24 14:08 Cali0707

Codecov Report

:x: Patch coverage is 25.00000% with 6 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 84.66%. Comparing base (222065d) to head (cd205ce). :warning: Report is 376 commits behind head on main.

Files with missing lines Patch % Lines
pkg/queue/health/probe.go 25.00% 3 Missing and 3 partials :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15436      +/-   ##
==========================================
+ Coverage   84.54%   84.66%   +0.11%     
==========================================
  Files         219      219              
  Lines       13595    13600       +5     
==========================================
+ Hits        11494    11514      +20     
+ Misses       1734     1710      -24     
- Partials      367      376       +9     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Aug 01 '24 14:08 codecov[bot]

I've tried to update it so that until a probe succeeds for either http1 or http2, we keep retrying the options request. This seems better, but there is still a chance of a race condition where:

  1. the options request fails because the service is not accepting requests yet
  2. the http1 probe we try succeeds because now the service started accepting requests

Cali0707 avatar Aug 01 '24 14:08 Cali0707

This Pull Request is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen with /reopen. Mark as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Oct 31 '24 01:10 github-actions[bot]

This Pull Request is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen with /reopen. Mark as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Jan 31 '25 01:01 github-actions[bot]

This Pull Request is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen with /reopen. Mark as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar May 02 '25 01:05 github-actions[bot]

This Pull Request is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen with /reopen. Mark as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Aug 01 '25 01:08 github-actions[bot]