serving
serving copied to clipboard
[WIP] Stream logs from user pods during upgrades
This PR depends on https://github.com/knative/pkg/pull/2591 to be merged first
Upgrades tests make the user pods (which have queue-proxy container) restart due to updating the queue-proxy image. There can be temporary errors which disappear in the end. Without log streaming, the logs from user pods that were already shutdown are lost.
This PR pulls changes from knative.dev/pkg related to logstream which is now able to stream logs from user namespaces without filtering individual lines but taking into account only user-specified pods. This is because multiple tests are running withing the same namespace in parallel so we need to define which pods belong the the current test.
I'm sending the changes to knative.dev/pkg in parallel and will update the PR when/if they're merged. Currently the changes are hardcoded in the vendor directory.
Note: Streaming logs from the system namespace has been broken because the upgrade tests use multiple testing.T
and streaming is ended after the "ContinualTests/Setup" phase is finished. We might fix it later when https://github.com/knative/pkg/issues/2421 is implemented.
Proposed Changes
- Stream logs from user pods during upgrades
- Print the logs on error in the test log.
Release Note
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: mgencur
Once this PR has been reviewed and has the lgtm label, please assign dsimansk for approval by writing /assign @dsimansk
in a comment. For more information see:The Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment
Codecov Report
Base: 86.51% // Head: 86.48% // Decreases project coverage by -0.03%
:warning:
Coverage data is based on head (
439e2c6
) compared to base (1c6a05d
). Patch has no changes to coverable lines.
Additional details and impacted files
@@ Coverage Diff @@
## main #13293 +/- ##
==========================================
- Coverage 86.51% 86.48% -0.04%
==========================================
Files 196 196
Lines 14526 14544 +18
==========================================
+ Hits 12567 12578 +11
- Misses 1661 1666 +5
- Partials 298 300 +2
Impacted Files | Coverage Δ | |
---|---|---|
pkg/reconciler/configuration/configuration.go | 82.93% <0.00%> (-2.27%) |
:arrow_down: |
pkg/reconciler/route/traffic/traffic.go | 92.55% <0.00%> (+0.08%) |
:arrow_up: |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
/hold It works only for short strings, if the name is too long the names are truncated and Pod name doesn't include the name of the Service (ObjectNameForTest). Need to get concrete pods for the given service.
When the name of the service is too long the names are as follows:
ksvc
serving-upgrades-continual-tests-setup-au-hwqgcxvu
serving-upgrades-continual-tests-setup-au-vycgximb
serverless-service:
serverlessservice/serving-upgrades-continual-tests-setup-au-hwqgcxvu-00001
deployments:
serving-upgrades-con8f7e50d9277d222233ee19efcde85c06-deployment
(remove the -deployment suffix)
pods:
pod/serving-upgrades-con8f7e50d9277d222233ee19efcde85c06-deplo4fss6
We could get the Pod names by getting the serverless-service, getting the Deployment it references, and removing the "-deployment" suffix. Then we'd get a general prefix for Pods for the given ksvc.
Another, maybe much simple way would be to simply define a short name for the ksvc in the test, instead of using test.ObjectNameForTest(t)
, we could have autoscale-sus
and autoscale-sus-tbc
and the Pods would then include the ksvc name by default (nothing would be truncated).
Trying the shorter names with the latest commit.
I think using shorter names is desirable because when the name is based on the whole test name (e.g. TestServingUpgrades/ContinualTests/SetupAutoscaleSustainingTest) all pods in the namespace have the same prefix and it's not clear which pod belongs to which test. This is bad for debugging purposes.
Background: We have AutoscaleSustainingTest and AutoscaleSustainingWithTBCTest which run in parallel. All the resulting ksvc/deployment/pods will start with serving-upgrades-continual-tests-setup-au-
followed by generated random characters. The will be many pods and not clear where they belong. This is because the names are too long and are truncated.
/retest
@mgencur: The following tests failed, say /retest
to rerun all failed tests or /retest-required
to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
build-tests_serving_main | 439e2c614cf0f8a22d91420d8c245aa6c13d4930 | link | true | /test build-tests |
upgrade-tests_serving_main | 439e2c614cf0f8a22d91420d8c245aa6c13d4930 | link | true | /test upgrade-tests |
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
This Pull Request is stale because it has been open for 90 days with
no activity. It will automatically close after 30 more days of
inactivity. Reopen with /reopen
. Mark as fresh by adding the
comment /remove-lifecycle stale
.
I will reopen this when the related knative/pkg PR is merged.