ingress-nginx icon indicating copy to clipboard operation
ingress-nginx copied to clipboard

[WIP] feat: OpenTelemetry module integration

Open esigo opened this issue 3 years ago • 3 comments

What this PR does / why we need it:

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] CVE Report (Scanner found CVE and adding report)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation only

Which issue/s this PR fixes

#9016 step 2

How Has This Been Tested?

Checklist:

  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [ ] I've read the CONTRIBUTION guide
  • [ ] I have added unit and/or e2e tests to cover my changes.
  • [ ] All new and existing tests passed.
  • [ ] Added Release Notes.

Does my pull request need a release note?

Any user-visible or operator-visible change qualifies for a release note. This could be a:

  • CLI change
  • API change
  • UI change
  • configuration schema change
  • behavioral change
  • change in non-functional attributes such as efficiency or availability, availability of a new platform
  • a warning about a deprecation
  • fix of a previous Known Issue
  • fix of a vulnerability (CVE)

No release notes are required for changes to the following:

  • Tests
  • Build infrastructure
  • Fixes for unreleased bugs

For more tips on writing good release notes, check out the Release Notes Handbook

PLACE RELEASE NOTES HERE 

esigo avatar Sep 18 '22 11:09 esigo

Hi @esigo. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Sep 18 '22 11:09 k8s-ci-robot

/kind feature

esigo avatar Oct 01 '22 21:10 esigo

/ok-to-test

@esigo , real plesure to see this coming along.

Thanks for the info posted here. But wishful thinking (don't know how much it helps yet) is post the entire sequence of steps, including collector or whatever you are using in your fork. Intention being someone may want to clone your branch and do make dev-env to experience this. At least some feedback may come out of that.

longwuyuan avatar Oct 05 '22 16:10 longwuyuan

/ok-to-test

@esigo , real plesure to see this coming along.

Thanks for the info posted here. But wishful thinking (don't know how much it helps yet) is post the entire sequence of steps, including collector or whatever you are using in your fork. Intention being someone may want to clone your branch and do make dev-env to experience this. At least some feedback may come out of that.

@longwuyuan, I updated the description. Please let me know anything is not clear. We could also arrange a session to walk through the example, if that helps.

esigo avatar Oct 06 '22 20:10 esigo

added documentation here.

esigo avatar Oct 11 '22 19:10 esigo

@esigo this is awesome. Next step is I request you create a new issue and state that the docs in this PR, need to be worked on in the context of going from a what is it currently to a more "normal worldly" set of instructions. Like copy this .... paste tht ... type that .... etc. Sans any make commands. etc etc

@rikatz , this is ready to test so PTAL

longwuyuan avatar Oct 12 '22 01:10 longwuyuan

/triage accepted

strongjz avatar Oct 13 '22 15:10 strongjz

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: esigo Once this PR has been reviewed and has the lgtm label, please ask for approval from rikatz by writing /assign @rikatz in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Nov 07 '22 19:11 k8s-ci-robot

Hi @esigo , this is brilliant work and I can’t wait to have nginx otel on an EKS cluster!

Trying to test this, hit a couple of issues, and maybe this feedback can help/ maybe I’m doing something weird.

First, there is one change in setup: I’m running this locally on a minikube cluster (with minikube tunnel running so that the ingress-nginx-controller can acquire an external IP).

I ran verbatim https://github.com/esigo/ingress-nginx/blob/otel-9016-3/docs/user-guide/third-party-addons/opentelemetry.md using the content of otel-9016-3 branch in esigo/ingress-nginx:

  1. I modified charts/ingress-nginx/values.yaml to define the extraModules name and image, then applied the helm chart helm install -f charts/ingress-nginx/values.yaml ingress-nginx charts/ingress-nginx
  2. When installing the above helm chart, the ingress-nginx-controller was created in the default namespace rather than the kube-system namespace. So, I had to apply this configmap to the default namespace (same command you specify, only with s/kube-system/default
  3. All done precisely as you state here. Note that a couple of lines require waiting for pods to be healthy (cert-manager takes a bit of time to get going) before proceeding to the next line
  4. Here I had to run $(minikube docker-env) first, so that images built locally are available within the minikube cluster. Then make images && make deploy-app worked great. (With kind instead of minikube, not necessary.)
  5. Here, no port forwarding was required (minikube tunnel), and the curl worked precisely as described. Nice demo app!
  6. Skip straight to Jaeger UI: I see beautiful spans from microapp-service and microapp-service-b. But the nginx span is missing
  7. Same in Zipkin: we see spans from microapp-service and microapp-service-b, but nothing from nginx

So, we have a nice minikube cluster with your demo app running and passing telemetry to an otel collector, and we have cool observability backends (verified Jaeger and Zipkin) making the otel viewable.

But I don’t yet have nginx otel. What bit of it isn’t working?

There are no opentelemetry logs in the ingress-nginx-controller pod:

$ kubectl logs ingress-nginx-controller-7fdf6f78df-h4dt6 -c opentelemetry
$

The controller logs don't include any mention of opentelemetry (despite configmap settings):

$ kubectl logs ingress-nginx-controller-7fdf6f78df-h4dt6
Defaulted container "controller" out of: controller, opentelemetry (init)
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.4.0
  Build:         50be2bf95fd1ef480420e2aa1d6c5c7c138c95ea
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.19.10

-------------------------------------------------------------------------------
...

I guess a custom local controller image build (including the changes in this module integration PR) is necessary here?

shaundaley39 avatar Dec 05 '22 13:12 shaundaley39

Small update:

  • where I followed the ingress-nginx dev guide (so, local build of the controller image, all running on a kind cluster), everything worked exactly as documented
  • copying across the local build of the controller image, everything worked nicely locally in minikube too
  • I of course expect this to work flawlessly IRL and I'm proceeding to push out this custom nginx ingress controller to a test EKS cluster

Thank you @esigo ! This is awesome and I can't wait for this integration to be merged!

shaundaley39 avatar Dec 08 '22 14:12 shaundaley39

@shaundaley39 thanks for testing and your kind feedback. I use k3s for testing, you may check my test setup here. There will be some changes due to #9286 and #9308. I will rebase and apply new changes when the image is promoted https://github.com/kubernetes/k8s.io/pull/4526. The new way of enabling opentelemetry will be via values: https://github.com/kubernetes/ingress-nginx/blob/2cb3ce5db68c4d3f1022068527e55c1018dccf9d/charts/ingress-nginx/values.yaml#L594

esigo avatar Dec 09 '22 21:12 esigo

Is there an update on what needs to be done to get this into production? I would love to help out if it would be something I can help with.

thomaschaaf avatar Jan 24 '23 15:01 thomaschaaf

Is there an update on what needs to be done to get this into production? I would love to help out if it would be something I can help with.

Hi @thomaschaaf, Thanks for your interest. Just review from @rikatz is missing ;)

esigo avatar Jan 24 '23 16:01 esigo

@esigo is it possible to request review by some one else? @rikatz seems not to have been active on github in the last 2 weeks. (according to the activity report on his account)

I found https://github.com/kubernetes/ingress-nginx/issues/9496#issuecomment-1418449871 which is indicating he is currently on holiday.

thomaschaaf avatar Feb 07 '23 11:02 thomaschaaf

@thomaschaaf I asked in the slack channel.

esigo avatar Feb 07 '23 22:02 esigo

+1, Waiting for that...

mhkarimi1383 avatar Feb 14 '23 15:02 mhkarimi1383

+1, waiting this :)

csepulveda avatar Feb 15 '23 22:02 csepulveda

Hey @esigo , This is great work. So i was tracing ingress-nginx with opentracing and stumbled upon this guide. Instrument nginx with OpenTelemetry. Following this i could see tracing done at nginx module level but following the steps you have mentioned, i get a single span at nginx(no internal nginx module tracing info).

Am i doing something wrong, or the guide i mentioned and the work you are doing are completely different. Or is their any way of enabling module level tracing as well.

Dipenbhatt03 avatar Feb 20 '23 13:02 Dipenbhatt03

Hi @Dipenbhatt03, We are using the other implementation. You probably need to check the documentation here https://github.com/kubernetes/ingress-nginx/pull/9144. I didn't get the question. You could see the spans for the other only when they are instrumented too.

esigo avatar Feb 22 '23 16:02 esigo

Hello @esigo, do you know when this could be merged and released? Regards!!!

csepulveda avatar Mar 07 '23 18:03 csepulveda

Hello @esigo, do you know when this could be merged and released? Regards!!!

Hi @csepulveda I'll demo this in the next SIG meeting (next Thursday). It should be merged soon after that :)

esigo avatar Mar 07 '23 18:03 esigo

status: I demoed this in the last SIG meeting (16.03.2023). Expected to be released with 1.7.0.

esigo avatar Mar 19 '23 13:03 esigo

/hold

strongjz avatar Mar 21 '23 14:03 strongjz

@strongjz merged documentation from #9144 (after addressing the review comments there).

esigo avatar Mar 21 '23 20:03 esigo

/label tide/merge-method-squash

strongjz avatar Mar 22 '23 18:03 strongjz

/lgtm

strongjz avatar Mar 22 '23 18:03 strongjz

/unhold /lgtm /approve

strongjz avatar Mar 22 '23 18:03 strongjz

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: esigo, strongjz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Mar 22 '23 18:03 k8s-ci-robot