[WIP] finalizer approach to fix race condition due to pruning in results watcher

Open ramessesii2 opened this issue 1 year ago • 10 comments

Changes

/kind bug

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you review them:

  • [ ] Has Docs included if any changes are user facing
  • [ ] Has Tests included if any functionality added or changed
  • [x] Tested your changes locally (if this is a code change)
  • [x] Follows the commit message standard
  • [x] Meets the Tekton contributor standards (including functionality, content, code)
  • [x] Has a kind label. You can add a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • [x] Release notes block below has been updated with any user-facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings)
  • [x] Release notes contain the string "action required" if the change requires additional action from users switching to the new release

Release Notes

NONE

ramessesii2 avatar Feb 05 '24 12:02 ramessesii2

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign sayan-biswas after the PR has been reviewed. You can assign the PR to them by writing /assign @sayan-biswas in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

tekton-robot avatar Feb 05 '24 12:02 tekton-robot

Hi @ramessesii2. Thanks for your PR.

I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tekton-robot avatar Feb 05 '24 12:02 tekton-robot

The following is the coverage report on the affected files. Say /test pull-tekton-results-go-coverage to re-run this coverage report

File                                       Old Coverage  New Coverage  Delta
pkg/watcher/reconciler/dynamic/dynamic.go  69.3%         60.8%         -8.5

tekton-robot avatar Feb 05 '24 13:02 tekton-robot

The following is the coverage report on the affected files. Say /test pull-tekton-results-go-coverage to re-run this coverage report

File                                       Old Coverage  New Coverage  Delta
pkg/watcher/reconciler/dynamic/dynamic.go  69.3%         60.2%         -9.1

tekton-robot avatar Feb 08 '24 14:02 tekton-robot

The following is the coverage report on the affected files. Say /test pull-tekton-results-go-coverage to re-run this coverage report

File                                       Old Coverage  New Coverage  Delta
pkg/watcher/reconciler/dynamic/dynamic.go  69.3%         61.0%         -8.4

tekton-robot avatar Feb 12 '24 11:02 tekton-robot

@ramessesii2: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                              Commit                                    Details  Required  Rerun command
pull-tekton-results-build-tests        52aafe71d0b8659f226c5eb4d4b21e8f88dc1728  link     true      /test pull-tekton-results-build-tests
pull-tekton-results-integration-tests  52aafe71d0b8659f226c5eb4d4b21e8f88dc1728  link     true      /test pull-tekton-results-integration-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

tekton-robot avatar Feb 12 '24 12:02 tekton-robot

Hi @gabemontero @adambkaplan, with finalizers I've been able to fix the race condition, at least locally. There's a small caveat with the finalizer approach: because we add a finalizer so that the PipelineRun does not get pruned until the streaming/sending of logs is done, the finalizer is stored along with the PipelineRun object as well. That might not be a deal breaker (I'm not sure, though), but I find it simpler to utilize this field from LogStatus and simply hold off on pruning until we find isStored: true.
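
To make the caveat concrete, here is a minimal sketch of the finalizer idea, not the code from this PR. The run type, the snapshot helper, and the finalizer name are all hypothetical stand-ins; a real watcher would work with the Kubernetes object metadata instead. It shows that a snapshot taken while log streaming is still in progress carries the finalizer with it.

```go
package main

import "fmt"

// logFinalizer is a hypothetical finalizer name used only for this sketch.
const logFinalizer = "example.dev/log-streaming"

// run is a simplified, hypothetical stand-in for a PipelineRun's metadata.
type run struct {
	Name       string
	Finalizers []string
}

// addFinalizer appends the finalizer if it is not already present.
func addFinalizer(r *run, f string) {
	for _, existing := range r.Finalizers {
		if existing == f {
			return
		}
	}
	r.Finalizers = append(r.Finalizers, f)
}

// removeFinalizer drops every occurrence of the finalizer.
func removeFinalizer(r *run, f string) {
	kept := r.Finalizers[:0]
	for _, existing := range r.Finalizers {
		if existing != f {
			kept = append(kept, existing)
		}
	}
	r.Finalizers = kept
}

// deletable mirrors the Kubernetes rule that an object with pending
// finalizers is not removed from the cluster.
func deletable(r run) bool { return len(r.Finalizers) == 0 }

// snapshot copies the run as it would be sent to storage at this moment.
func snapshot(r run) run {
	r.Finalizers = append([]string(nil), r.Finalizers...)
	return r
}

func main() {
	pr := &run{Name: "build-1"}

	// Block pruning while logs are still being streamed.
	addFinalizer(pr, logFinalizer)
	stored := snapshot(*pr) // the stored copy now includes the finalizer
	fmt.Println("deletable while streaming:", deletable(*pr))

	// Streaming finished: release the object so pruning can proceed.
	removeFinalizer(pr, logFinalizer)
	fmt.Println("deletable after streaming:", deletable(*pr))
	fmt.Println("finalizers in stored copy:", stored.Finalizers)
}
```

The stored copy keeps the finalizer even after the live object has released it, which is the side effect described in the comment above.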

ramessesii2 avatar Feb 14 '24 15:02 ramessesii2

For me at least, I like using IsStored() instead as well, @ramessesii2 @adambkaplan @sayan-biswas, assuming the watcher reconciler that handles pruning can watch for that and requeue if it is not yet stored.

I believe that is the case based on what I recall from the fix for handling cancelled pipeline/task runs

Putting metadata (i.e. the finalizer) in the object we are storing seems more fragile to me in hindsight.

what do you all think?

perhaps as part of breaking out the mem leak fix from this one, either create a separate PR or a separate commit in this PR so we can compare IsStored vs. finalizer

gabemontero avatar Feb 14 '24 15:02 gabemontero

FYI: #713 uses the Logs API to address the race condition.

ramessesii2 avatar Feb 16 '24 01:02 ramessesii2

@ramessesii2: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tekton-robot avatar Feb 27 '24 20:02 tekton-robot