opentelemetry-php icon indicating copy to clipboard operation
opentelemetry-php copied to clipboard

OTLP/GRPC exporter implementation - Missing Reries on Exporter errors

Open sanketmehta28 opened this issue 2 years ago • 14 comments

Is your feature request related to a problem? Please describe. The OTLP exporter specification says that transient errors MUST be handled with a retry strategy, where the retriable gRPC error codes are defined in the OTLP spec.

Fixes: #328

Describe the solution you'd like I think we should implement generic retry policy which can be utilized by all exporters

Additional context In grpc, retry is supported out of the box using service config but in this PR I have tried to provide the generic solution which can be used by all the exporters irrespective of the transport method.

sanketmehta28 avatar May 19 '22 14:05 sanketmehta28

Thanks for opening your first pull request! If you haven't yet signed our Contributor License Agreement (CLA), then please do so that we can accept your contribution. A link should appear shortly in this PR if you have not already signed one.

welcome[bot] avatar May 19 '22 14:05 welcome[bot]

@sanketmehta28 Hi, thanks for your contribution and sorry for the late response. Can you please fix the CS issues, so we can review the PR? (The easiest way is just to call make style in the repo root dir)

tidal avatar May 20 '22 16:05 tidal

There are some psalm errors appearing in CI

bobstrecansky avatar May 22 '22 22:05 bobstrecansky

Codecov Report

Merging #677 (2a31ed5) into main (e946dc7) will decrease coverage by 0.09%. The diff coverage is 77.45%.

Impacted file tree graph

@@             Coverage Diff              @@
##               main     #677      +/-   ##
============================================
- Coverage     79.52%   79.43%   -0.10%     
- Complexity     1405     1434      +29     
============================================
  Files           160      162       +2     
  Lines          3468     3564      +96     
============================================
+ Hits           2758     2831      +73     
- Misses          710      733      +23     
Flag Coverage Δ
7.4 79.43% <77.45%> (-0.10%) :arrow_down:
8.0 79.49% <77.45%> (-0.10%) :arrow_down:
8.1 79.49% <77.45%> (-0.10%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/SDK/Common/Time/BlockingScheduler.php 0.00% <0.00%> (ø)
.../Common/Retry/ExponentialWithJitterRetryPolicy.php 76.66% <76.66%> (ø)
src/SDK/Trace/Behavior/SpanExporterTrait.php 87.50% <83.33%> (-12.50%) :arrow_down:
src/Contrib/OtlpGrpc/Exporter.php 91.30% <100.00%> (+1.14%) :arrow_up:
src/SDK/Trace/SpanExporter/LoggerExporter.php 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update e946dc7...2a31ed5. Read the comment docs.

codecov[bot] avatar May 30 '22 13:05 codecov[bot]

Hi, Can anyone help me resolve this code coverage error?

sanketmehta28 avatar May 30 '22 14:05 sanketmehta28

Hi, Can anyone help me resolve this code coverage error?

It's complaining about a decrease in code coverage, specifically in SpanExporterTrait - you can make test-coverage locally, you will need to add/change some unit tests to ensure code coverage is not reduced by this PR.

brettmc avatar May 31 '22 03:05 brettmc

Just an explanation for not including fiber as of now: currently we are supporting php 7.4 and 8.0 which does not support fiber out of the box. so to provide retry mechanism in those php versions we have to provide an implementation without fiber and separate implementation for php 8.1 onwards with fiber. so as of now it does not make sense to provide two separate implementation.

Also I have tried the default retry mechanism of GRPC which also blocks the exporter until retry times out or successful. .net otel module is using this same behaviour(PR) so it will not make much difference if this generic retry mechanism blocks the exporter.

sanketmehta28 avatar Jun 22 '22 10:06 sanketmehta28

I think this needs an example of how to use and configure a retry policy, so that users can see it in action and understand how to configure (and disable/turn off)

brettmc avatar Jun 22 '22 13:06 brettmc

I think this needs an example of how to use and configure a retry policy, so that users can see it in action and understand how to configure (and disable/turn off)

Hi @brettmc Sure I will create one and let you know

sanketmehta28 avatar Jun 28 '22 12:06 sanketmehta28

@tidal @Nevay: as you guys can see, most of the builds are getting failed because of doLog() function in LogsMessagetrait.php because doLog is also being defined in LoggerExporter.php but from loggerexporter.php it is trying to refer to doLog() function in LogsMessageTrait.php which is not right.

I believe it is a minor bug and we need to create an issue for the same. Uploading Screenshot 2022-07-18 at 7.12.30 PM.png…

sanketmehta28 avatar Jul 18 '22 13:07 sanketmehta28

failed because of doLog() function in LogsMessagetrait.php

I understand now... By adding LogsMessagesTrait to SpanExporterTrait, our LoggerExporter ends up with two methods named doLog. In that case, it does seem reasonable to rename one of them as part of this PR (I'd nominate the one in LoggerExporter). There have been a couple of unintended code changes in this PR so we were probably over-zealous in checking for them :)

brettmc avatar Jul 19 '22 02:07 brettmc

failed because of doLog() function in LogsMessagetrait.php

I understand now... By adding LogsMessagesTrait to SpanExporterTrait, our LoggerExporter ends up with two methods named doLog. In that case, it does seem reasonable to rename one of them as part of this PR (I'd nominate the one in LoggerExporter). There have been a couple of unintended code changes in this PR so we were probably over-zealous in checking for them :)

I will change in loggerExporter in this PR

sanketmehta28 avatar Jul 19 '22 13:07 sanketmehta28

@Nevay @tidal Please do review it and approve

sanketmehta28 avatar Aug 03 '22 12:08 sanketmehta28

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Oct 01 '22 02:10 stale[bot]

This issue has been automatically closed because it has not had recent activity, but it can be reopened. Thank you for your contributions.

stale[bot] avatar Oct 16 '22 01:10 stale[bot]