argo-cd icon indicating copy to clipboard operation
argo-cd copied to clipboard

chore(migrate): migrate otelgrpc pkg interceptor to stats handler(#18258)

Open Jack-R-lantern opened this issue 1 year ago • 1 comments

Checklist:

  • [ ] Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • [ ] The title of the PR states what changed and the related issues number (used for the release note).
  • [ ] The title of the PR conforms to the Toolchain Guide
  • [ ] I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • [ ] I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • [ ] Does this PR require documentation updates?
  • [ ] I've updated documentation as required by this PR.
  • [ ] I have signed off all my commits as required by DCO
  • [ ] I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • [ ] My build is green (troubleshooting builds).
  • [ ] My new feature complies with the feature status guidelines.
  • [ ] I have added a brief description of why this PR is necessary and/or what this PR solves.
  • [ ] Optional. My organization is added to USERS.md.
  • [ ] Optional. For bug fixes, I've indicated what older releases this fix should be cherry-picked into (this may or may not happen depending on risk/complexity).

Closes #18258

Jack-R-lantern avatar May 22 '24 17:05 Jack-R-lantern

Codecov Report

Attention: Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.

Project coverage is 55.76%. Comparing base (683e4e0) to head (b6f81d5).

Files with missing lines Patch % Lines
cmpserver/server.go 0.00% 1 Missing :warning:
pkg/apiclient/apiclient.go 0.00% 1 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #18366      +/-   ##
==========================================
- Coverage   55.79%   55.76%   -0.04%     
==========================================
  Files         342      341       -1     
  Lines       57213    57196      -17     
==========================================
- Hits        31920    31893      -27     
- Misses      22655    22656       +1     
- Partials     2638     2647       +9     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar May 22 '24 17:05 codecov[bot]

[previous CI integratiion tests e2e log]

time="2024-05-28T09:10:36Z" level=error msg="../../dist/argocd app diff test-app-with-secrets --local testdata --server-side-generate --plaintext --server 127.0.0.1:8088 --auth-token *** --insecure failed exit status 20: time="2024-05-28T09:10:36Z" level=fatal msg="rpc error: code = Unimplemented desc = grpc_retry: cannot retry on ClientStreams, set grpc_retry.Disable()"" execID=c6fd7 time="2024-05-28T09:10:36Z" level=fatal msg="../../dist/argocd app diff test-app-with-secrets --local testdata --server-side-generate --plaintext --server 127.0.0.1:8088 --auth-token *** --insecure failed exit status 20: time="2024-05-28T09:10:36Z" level=fatal msg="rpc error: code = Unimplemented desc = grpc_retry: cannot retry on ClientStreams, set grpc_retry.Disable()""

Modified the code based on the logs from the previous E2E test. I think this was the problem. It seems that using WithStreamInterceptor without using a chain overwrote an existing Interceptor.

Jack-R-lantern avatar May 28 '24 12:05 Jack-R-lantern

@blakepettersson Can you check on that PR?

Jack-R-lantern avatar May 29 '24 13:05 Jack-R-lantern

For reference: #17790, #17197 and #17103 were previous attempts

blakepettersson avatar May 29 '24 21:05 blakepettersson

@blakepettersson Resolved all conflicts. Please review this PR

Jack-R-lantern avatar Jul 04 '24 14:07 Jack-R-lantern

@blakepettersson The corresponding pr has been approved, but when will it be merged?

Jack-R-lantern avatar Mar 03 '25 12:03 Jack-R-lantern

@Jack-R-lantern looks like there's some duplicate work here: https://github.com/argoproj/argo-cd/pull/22098

Would y'all mind cross-reviewing and making sure everything looks right?

crenshaw-dev avatar Mar 04 '25 21:03 crenshaw-dev

@crenshaw-dev

Of course. In the server/application, the gRPC client is created using client streaming, and the important part is that the grpc_retry.Disable() option has been added. It doesn’t seem to cause any major issues in testing, but could you review this part? Everything else looks fine.

Jack-R-lantern avatar Mar 04 '25 22:03 Jack-R-lantern

@Jack-R-lantern could you rebase?

crenshaw-dev avatar Mar 10 '25 14:03 crenshaw-dev

@crenshaw-dev sure

Jack-R-lantern avatar Mar 10 '25 14:03 Jack-R-lantern

Hm. Those test failures look very unrelated to these changes.

crenshaw-dev avatar Mar 10 '25 17:03 crenshaw-dev

I'll check about that part.

Jack-R-lantern avatar Mar 10 '25 20:03 Jack-R-lantern

What some failed tests have in common is

		And(func() {
			go startCMPServer(t, "./testdata/cmp-gitsshcreds-disable-provide")
			time.Sleep(100 * time.Millisecond)
			t.Setenv("ARGOCD_BINARY_NAME", "argocd")
		}).

I wonder if the cmp server takes more to initialize with the new changes. Can you try bumping the sleep time to 1 second or more and see if the issue still persists, please?

andrii-korotkov-verkada avatar Mar 12 '25 09:03 andrii-korotkov-verkada

I see

 Unimplemented desc = grpc_retry: cannot retry on ClientStreams, set grpc_retry.Disable()

blakepettersson avatar Apr 22 '25 11:04 blakepettersson

I see

 Unimplemented desc = grpc_retry: cannot retry on ClientStreams, set grpc_retry.Disable()

It seems like the grpc_retry.Disable() option is not working as expected for a specific client-streaming gRPC method. What else would you recommend checking to debug this issue further?

Jack-R-lantern avatar Apr 22 '25 11:04 Jack-R-lantern