opentelemetry-demo icon indicating copy to clipboard operation
opentelemetry-demo copied to clipboard

[checkoutservice] add basic metric support

Open fatsheep9146 opened this issue 1 year ago • 5 comments

Signed-off-by: Ziqi Zhao [email protected]

related #69.

Changes

Try to enhance checkoutservice with metric support.

According to https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/semantic_conventions/rpc.md and as the first try, I first add a metrics 'rpc.server.duration' for unary service.

the metrics with four label

  • rpc.system
  • rpc.service
  • rpc.method
  • rpc.grpc.status_code

I made some experiments with new metric, I stop 'shipservice' manually to mock the fail down of shipservice and restart it after a few minutes, and draw a graph indicates the success ratio of checkout service.

the query I used is sum(rate(rpc_server_duration_count{rpc_grpc_status_code="0"}[1m])) / sum(rate(rpc_server_duration_count[1m]))

and the graph is like following with a drop in the middle

image

  • [ ] Appropriate CHANGELOG.md updated for non-trivial changes

fatsheep9146 avatar Aug 23 '22 14:08 fatsheep9146

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions[bot] avatar Sep 01 '22 03:09 github-actions[bot]

#339 (comment)

https://github.com/open-telemetry/opentelemetry-go-contrib/pull/2700

@reyang @austinlparker

I also started to add metrics support for grpc instrumentation library. Should we wait until this pr merged and then update checkout service's library?

fatsheep9146 avatar Sep 01 '22 04:09 fatsheep9146

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions[bot] avatar Sep 09 '22 03:09 github-actions[bot]

This PR was marked stale due to lack of activity. It will be closed in 7 days.

Related pull requests https://github.com/open-telemetry/opentelemetry-go-contrib/pull/2700 are still being reviewed, not stale.

fatsheep9146 avatar Sep 09 '22 04:09 fatsheep9146

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions[bot] avatar Sep 17 '22 03:09 github-actions[bot]

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions[bot] avatar Sep 26 '22 04:09 github-actions[bot]

hey @fatsheep9146 lets try to push this forward for the hero scenario asap. Our development freeze starts October 10th.

All we need in the checkout service is a gauge on the cache size + instrumentation libraries

cartersocha avatar Sep 29 '22 16:09 cartersocha

@austinlparker, @julianocosta89, @puckpuck fyi lets try to push this forward

cartersocha avatar Sep 29 '22 16:09 cartersocha

I refract this pr to use the new version of golang metric sdk and using runtime instrumentation library. Please help review this =D @cartersocha @austinlparker @puckpuck @julianocosta89

fatsheep9146 avatar Sep 30 '22 09:09 fatsheep9146

I'm investigating this - looks like part of it is that the go version in the container isn't 1.18+

austinlparker avatar Oct 03 '22 15:10 austinlparker

@austinlparker yes, but when I tried updating the go version within the Dockerfile and go.mod, the build failed with the following:

=> ERROR [builder  9/10] RUN protoc -I ./proto/ ./proto/demo.proto --go_out=./ --go-grpc_out=./                                                       0.6s
------
 > [builder  9/10] RUN protoc -I ./proto/ ./proto/demo.proto --go_out=./ --go-grpc_out=./:
#0 0.552 protoc-gen-go: program not found or is not executable
#0 0.552 Please specify a program using absolute path or make sure the program is available in your PATH system variable
#0 0.553 --go_out: protoc-gen-go: Plugin failed with status code 1.
------
failed to solve: executor failed running [/bin/sh -c protoc -I ./proto/ ./proto/demo.proto --go_out=./ --go-grpc_out=./]: exit code: 1

julianocosta89 avatar Oct 03 '22 16:10 julianocosta89

yeah, I think we've encountered a dependency hell loop. we may have to update our go stuff to use google.golang.org/protobuf rather than protoc-gen-go so we can break into 1.18+... lemme see if that's an easy upgrade

austinlparker avatar Oct 03 '22 16:10 austinlparker

@fatsheep9146 it looks like in order to take newer versions of otel-go we will need to upgrade all go services to support 1.18+. I believe we will also need to update to the newer protobuf package. Can you take care of that for this PR as well?

austinlparker avatar Oct 03 '22 16:10 austinlparker

@fatsheep9146 it looks like in order to take newer versions of otel-go we will need to upgrade all go services to support 1.18+. I believe we will also need to update to the newer protobuf package. Can you take care of that for this PR as well?

Yes, I already found this problem and started doing it. I also found that in 1.18, go get can would not install the package, so the dockerfile should also be updated. @austinlparker

fatsheep9146 avatar Oct 04 '22 01:10 fatsheep9146

@fatsheep9146 it looks like in order to take newer versions of otel-go we will need to upgrade all go services to support 1.18+. I believe we will also need to update to the newer protobuf package. Can you take care of that for this PR as well?

Yes, I already found this problem and started doing it. I also found that in 1.18, go get can would not install the package, so the dockerfile should also be updated. @austinlparker

Yeah - I would just upgrade to 1.19 across the board, as well as switch to the newer Google protobuf package. Maybe best to do that in a new PR just for cleanliness sake.

austinlparker avatar Oct 04 '22 01:10 austinlparker

@fatsheep9146 it looks like in order to take newer versions of otel-go we will need to upgrade all go services to support 1.18+. I believe we will also need to update to the newer protobuf package. Can you take care of that for this PR as well?

Yes, I already found this problem and started doing it. I also found that in 1.18, go get can would not install the package, so the dockerfile should also be updated. @austinlparker

Yeah - I would just upgrade to 1.19 across the board, as well as switch to the newer Google protobuf package. Maybe best to do that in a new PR just for cleanliness sake.

Ok, I will first submit a new pull request to update the dockerfile and protobuf version. After that pr is merged I will rebase this pr then.

fatsheep9146 avatar Oct 04 '22 01:10 fatsheep9146

Hello all,

I've just sent a PR with the required update: #406

Once, this PR is merged we can simply rebase and merge this one. Already tested locally and it is working 🥳

julianocosta89 avatar Oct 06 '22 21:10 julianocosta89

@fatsheep9146 the go version is now updated! thanks @julianocosta89

cartersocha avatar Oct 07 '22 00:10 cartersocha

@fatsheep9146 the go version is now updated! thanks @julianocosta89

I also tested it locally, it worked, thanks! @cartersocha @julianocosta89

fatsheep9146 avatar Oct 07 '22 01:10 fatsheep9146

@reyang anything missing in here?

julianocosta89 avatar Oct 07 '22 19:10 julianocosta89

@reyang anything missing in here?

Let me take a quick look.

reyang avatar Oct 07 '22 19:10 reyang

I don't know if this needs a changelog entry or not. Not a blocker though.

reyang avatar Oct 07 '22 19:10 reyang

I'll cover the changelog update in a separate PR to move this forward

cartersocha avatar Oct 07 '22 19:10 cartersocha