ingress-nginx icon indicating copy to clipboard operation
ingress-nginx copied to clipboard

Support none keyword in log-format escape

Open knbnnate opened this issue 3 years ago • 20 comments

What this PR does / why we need it:

ingress-nginx does not support disabling escaping of special characters in the nginx log. This PR exposes the setting to support that functionality.

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation only

Which issue/s this PR fixes

fixes #8693

How Has This Been Tested?

Followed the getting-started guide. Used ppa:longsleep/golang-backports on WSL Ubuntu to establish a golang-1.18 environment with latest docker and recommended kind. Built the dev-env successfully; had issues with make test, but they are entirely unrelated to anything I touched. Ultimate test was

FOCUS=log-format make kind-e2e-test
...
Ginkgo ran 1 suite in 6m29.7437865s
Test Suite Passed

Checklist:

  • [x] My change requires a change to the documentation.
  • [x] I have updated the documentation accordingly.
  • [x] I've read the CONTRIBUTION guide
  • [x] I have added tests to cover my changes.
  • [x] All new and existing tests passed.

I did not update docs/e2e-tests.md.

knbnnate avatar Jun 14 '22 16:06 knbnnate

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: knbnnate / name: Nate Campbell (37d96fc37b422087b20aaf89db4601ae41de1196, cbc7ee2c89659658bffb9f9f951fff27102ebca2)

@knbnnate: This issue is currently awaiting triage.

If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jun 14 '22 16:06 k8s-ci-robot

Welcome @knbnnate!

It looks like this is your first PR to kubernetes/ingress-nginx 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/ingress-nginx has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar Jun 14 '22 16:06 k8s-ci-robot

Hi @knbnnate. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jun 14 '22 16:06 k8s-ci-robot

@knbnnate Thanks for the contribution.

The description seems too succint and likely requires lot of awareness of nginx logging and such.

To make things better, could you ;

  • Create an issue that describes the current problem with logging, in detail
  • Link the issue here
  • Provide some context of use case of escaping special characters here
  • Provide examples of "none" keyword from a installation of the controller, that uses your fork as the manifests for installation of the controller
  • Show some curl/other http requests and resulting logs, related to this feature
  • Provide as much detail here as possible, so that the feature is well documented here at least

longwuyuan avatar Jun 14 '22 17:06 longwuyuan

Please sign the CLA

longwuyuan avatar Jun 14 '22 17:06 longwuyuan

Please sign the CLA

Will do, submitted to corporate and waiting to hear back.

knbnnate avatar Jun 14 '22 17:06 knbnnate

@knbnnate Thanks for the contribution.

The description seems too succint and likely requires lot of awareness of nginx logging and such.

To make things better, could you ;

  • Create an issue that describes the current problem with logging, in detail
  • Link the issue here
  • Provide some context of use case of escaping special characters here
  • Provide examples of "none" keyword from a installation of the controller, that uses your fork as the manifests for installation of the controller
  • Show some curl/other http requests and resulting logs, related to this feature
  • Provide as much detail here as possible, so that the feature is well documented here at least

Created https://github.com/kubernetes/ingress-nginx/issues/8693 to describe the current problem with logging in detail. The specific use case that I'm interested in right now is logging to Azure Log Analytics and being able to query individual headers in the logs without writing an elaborate extract_json call to turn the embedded json string into json.

In the e2e test I wrote for this implementation, the "none" keyword was used in the controller install and successfully generated logs with the desired format (without escapes on the double-quote characters).

This is a very straightforward change to expose an nginx parameter that is currently not available through the ingress controller configuration. I can provide more detail if needed; I haven't provided the requested logs from curl requests and so forth because the exact desired behavior is specified in the end-to-end tests and is validated by the end-to-end tests, but if it's really necessary I will attach examples of this.

knbnnate avatar Jun 14 '22 18:06 knbnnate

I think the desired-log part showed what the log will look like. I just requested that you paste a real live http request and corresponding logs, that would be visible on a cluster that has the controller installed from your fork, so that becomes explicit info that the changes work. Just to assist the developer who would review this. Copy pasting live state of related objects like the ConfigMap/values file shown in cat on Kubectl describe. Just helps avoid a reviewer try to reproduce etc

longwuyuan avatar Jun 14 '22 18:06 longwuyuan

I think the desired-log part showed what the log will look like. I just requested that you paste a real live http request and corresponding logs, that would be visible on a cluster that has the controller installed from your fork, so that becomes explicit info that the changes work. Just to assist the developer who would review this. Copy pasting live state of related objects like the ConfigMap/values file shown in cat on Kubectl describe. Just helps avoid a reviewer try to reproduce etc

Fair enough; I don't have a lot of experience contributing here so it'll take me a little bit to get that all set up, but my CLA probably won't be signed before that's done anyways. Working on that now.

knbnnate avatar Jun 14 '22 18:06 knbnnate

I think taking your time is not a problem. Thank you very much for helping out with details. There is lack of resources and there is highest-priority-work in progress so this kind of feature request's review will take some time.

Once again thanks for the contribution.

longwuyuan avatar Jun 14 '22 18:06 longwuyuan

This link will help you quick-start on running the controller from your fork or a branch on your fork https://kubernetes.github.io/ingress-nginx/developer-guide/getting-started/#local-build

longwuyuan avatar Jun 14 '22 18:06 longwuyuan

This link will help you quick-start on running the controller from your fork or a branch on your fork https://kubernetes.github.io/ingress-nginx/developer-guide/getting-started/#local-build

Awesome, thanks!

knbnnate avatar Jun 14 '22 18:06 knbnnate

I've generated some real log output with this build. The setup is:

  • Kind running a local 1.24 cluster
  • An ubuntu pod with apache2 installed and started serving up the default page
  • I don't have my kind networking set up correctly so I generated a request with kubectl exec ingress-nginx-controller-6c498587c7-k6bnf -- curl localhost

Configmap:

nate@DESKTOP:/.../ingress-nginx/charts$ kubectl get configmap ingress-nginx-controller -o yaml
apiVersion: v1
data:
  allow-snippet-annotations: "true"
  location-snippet: |
    set $request_headers '';
    set_by_lua_block $request_headers{
      local cjson = require("cjson")
      local headers = ngx.req.get_headers()
      return cjson.encode(headers)
    }
  log-format-escape-none: "true"
  log-format-upstream: '{"host":"$host","request":"$request","request_headers":$request_headers}'
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: ingress-nginx
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2022-06-15T15:47:03Z"
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.2.1
    helm.sh/chart: ingress-nginx-4.1.4
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:allow-snippet-annotations: {}
        f:location-snippet: {}
        f:log-format-escape-none: {}
        f:log-format-upstream: {}
      f:metadata:
        f:annotations:
          .: {}
          f:meta.helm.sh/release-name: {}
          f:meta.helm.sh/release-namespace: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/component: {}
          f:app.kubernetes.io/instance: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:app.kubernetes.io/part-of: {}
          f:app.kubernetes.io/version: {}
          f:helm.sh/chart: {}
    manager: helm
    operation: Update
    time: "2022-06-15T15:47:03Z"
  name: ingress-nginx-controller
  namespace: default
  resourceVersion: "2741"
  uid: bbb98658-e4a2-4aec-818a-bded7f7d1a78

Helm values:

$ cat ~/ingress-nginx-values.yaml
imagePullSecrets:
  - name: imagepullsecret
controller:
  name: controller
  image:
    repository: myregistryurl.io/ingress-nginx/controller
    tag: "v1.2.1"
    digest: sha256:3f0580940210ad960d7c112add4f0636e46431e38e11971feb185a875a1acf8c
  config:
    location-snippet: |
      set $request_headers '';
      set_by_lua_block $request_headers{
        local cjson = require("cjson")
        local headers = ngx.req.get_headers()
        return cjson.encode(headers)
      }
    log-format-escape-none: true
    log-format-upstream: '{"host":"$host","request":"$request","request_headers":$request_headers}'

nginx.conf line generated by those helm values for logging:

$ kubectl exec ingress-nginx-controller-6c498587c7-k6bnf -- grep 'log_format upstreaminfo' nginx.conf
        log_format upstreaminfo escape=none '{"host":"$host","request":"$request","request_headers":$request_headers}';

Ancillary setup stuff to get an ubuntu pod running apache with an ingress:

---
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu
  labels:
    app: ubuntu
spec:
  containers:
    - name: ubuntu
      image: ubuntu:latest
      command: ["/bin/sleep", "3650d"]
      imagePullPolicy: IfNotPresent
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ubuntu
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: ubuntu
            port:
              number: 80
---
apiVersion: v1
kind: Service
metadata:
  name: ubuntu
  labels:
    app: ubuntu
spec:
  type: ClusterIP
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    app: ubuntu

Execed into the ubuntu pod and ran:

$ kubectl exec -it ubuntu -- /bin/bash
root@ubuntu:/# history
    1  apt-get install apache
    2  apt update
    3  apt-get install apache
    4  apt-get install apache-httpd
    5  apt-get install httpd
    6  apt-get install apache2
    7  curl localhost
    8  apt-get install -y curl
    9  curl localhost
   10  vim /etc/apache2/apache2.conf
   11  apt-get install -y vim
   12  vim /etc/apache2/apache2.conf
   13  ls /etc/apache2/*-enabled
   14  vim /etc/apache2/sites-available/000-default.conf
   15  systemctl status -l httpd
   16  service apache2 status
   17  service apache2 start
   18  curl localhost
   19  history

(The important bits being apt update; apt-get install apache2; service apache2 start)

Generated a few requests such as this:

$ kubectl exec ingress-nginx-controller-6c498587c7-k6bnf -- curl -v -s -S localhost -H "foo: bar" -H "json: {\"baz\": 3}" | head
*   Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET / HTTP/1.1
> Host: localhost
> User-Agent: curl/7.79.1
> Accept: */*
> foo: bar
> json: {"baz": 3}
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Wed, 15 Jun 2022 16:53:56 GMT
< Content-Type: text/html
< Content-Length: 10671
< Connection: keep-alive
< Last-Modified: Wed, 15 Jun 2022 16:33:59 GMT
< ETag: "29af-5e17f169a4545"
< Accept-Ranges: bytes
< Vary: Accept-Encoding
<
{ [3841 bytes data]
* Connection #0 to host localhost left intact
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <!--
    Modified from the Debian original for Ubuntu
    Last updated: 2022-03-22
    See: https://launchpad.net/bugs/1966004
  -->
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Apache2 Ubuntu Default Page: It works</title>

Logs:

$ kubectl logs ingress-nginx-controller-6c498587c7-k6bnf
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.2.1
  Build:         git-529443e92
  Repository:    https://github.com/knbnnate/ingress-nginx.git
  nginx version: nginx/1.19.10

-------------------------------------------------------------------------------

W0615 16:08:40.836334      10 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0615 16:08:40.836418      10 main.go:230] "Creating API client" host="https://10.96.0.1:443"
I0615 16:08:40.847025      10 main.go:274] "Running in Kubernetes cluster" major="1" minor="24" git="v1.24.0" state="clean" commit="4ce5a8954017644c5420bae81d72b09b735c21f0" platform="linux/amd64"
I0615 16:08:40.923081      10 main.go:104] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
I0615 16:08:40.931147      10 ssl.go:531] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
I0615 16:08:40.939628      10 nginx.go:256] "Starting NGINX Ingress controller"
I0615 16:08:40.942725      10 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"default", Name:"ingress-nginx-controller", UID:"bbb98658-e4a2-4aec-818a-bded7f7d1a78", APIVersion:"v1", ResourceVersion:"2741", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap default/ingress-nginx-controller
I0615 16:08:42.141803      10 nginx.go:299] "Starting NGINX process"
I0615 16:08:42.141937      10 leaderelection.go:248] attempting to acquire leader lease default/ingress-controller-leader...
I0615 16:08:42.142120      10 nginx.go:319] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
I0615 16:08:42.142231      10 controller.go:167] "Configuration changes detected, backend reload required"
I0615 16:08:42.146564      10 leaderelection.go:258] successfully acquired lease default/ingress-controller-leader
I0615 16:08:42.146588      10 status.go:84] "New leader elected" identity="ingress-nginx-controller-6c498587c7-k6bnf"
I0615 16:08:42.170820      10 controller.go:184] "Backend successfully reloaded"
I0615 16:08:42.170891      10 controller.go:195] "Initial sync, sleeping for 1 second"
I0615 16:08:42.170952      10 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"ingress-nginx-controller-6c498587c7-k6bnf", UID:"b4c34046-32d2-4546-9062-32af01b694a7", APIVersion:"v1", ResourceVersion:"4908", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
I0615 16:41:27.138166      10 admission.go:149] processed ingress via admission controller {testedIngressLength:1 testedIngressTime:0.016s renderingIngressLength:1 renderingIngressTime:0s admissionTime:14.3kBs testedConfigurationSize:0.016}
I0615 16:41:27.138202      10 main.go:100] "successfully validated configuration, accepting" ingress="default/ubuntu"
I0615 16:41:27.141403      10 store.go:429] "Found valid IngressClass" ingress="default/ubuntu" ingressclass="nginx"
I0615 16:41:27.141653      10 controller.go:167] "Configuration changes detected, backend reload required"
I0615 16:41:27.141670      10 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"ubuntu", UID:"a3bf6563-716f-4419-ba32-d72da26cd575", APIVersion:"networking.k8s.io/v1", ResourceVersion:"7701", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
I0615 16:41:27.167515      10 controller.go:184] "Backend successfully reloaded"
I0615 16:41:27.167763      10 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"ingress-nginx-controller-6c498587c7-k6bnf", UID:"b4c34046-32d2-4546-9062-32af01b694a7", APIVersion:"v1", ResourceVersion:"4908", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
{"host":"localhost","request":"GET / HTTP/1.1","request_headers":{"user-agent":"curl\/7.79.1","host":"localhost","accept":"*\/*"}}
{"host":"localhost","request":"GET / HTTP/1.1","request_headers":{"user-agent":"curl\/7.79.1","host":"localhost","accept":"*\/*","json":"{\"baz\": 3}","foo":"bar"}}
{"host":"localhost","request":"GET / HTTP/1.1","request_headers":{"user-agent":"curl\/7.79.1","host":"localhost","accept":"*\/*","foo":"bar","json":"{\"baz\": 3}"}}

The key bit being at the end there:

{"host":"localhost","request":"GET / HTTP/1.1","request_headers":{"user-agent":"curl\/7.79.1","host":"localhost","accept":"*\/*"}}
{"host":"localhost","request":"GET / HTTP/1.1","request_headers":{"user-agent":"curl\/7.79.1","host":"localhost","accept":"*\/*","json":"{\"baz\": 3}","foo":"bar"}}
{"host":"localhost","request":"GET / HTTP/1.1","request_headers":{"user-agent":"curl\/7.79.1","host":"localhost","accept":"*\/*","foo":"bar","json":"{\"baz\": 3}"}}

If we break one of these down with jq as I'll demonstrate below, you can see that "request_headers" is a json object, and if any of the headers are in turn composed of json, those headers are escaped by the cjson encode function in the lua script that creates the request_headers object, which is the desired behavior - we wouldn't want characters injected in a header to break the json formatting that is intended in the lua script.

$ logstring='{"host":"localhost","request":"GET / HTTP/1.1","request_headers":{"user-agent":"curl\/7.79.1","host":"localhost","accept":"*\/*","foo":"bar","json":"{\"baz\": 3}"}}'
$ echo "${logstring}" | jq .
{
  "host": "localhost",
  "request": "GET / HTTP/1.1",
  "request_headers": {
    "user-agent": "curl/7.79.1",
    "host": "localhost",
    "accept": "*/*",
    "foo": "bar",
    "json": "{\"baz\": 3}"
  }
}
$ echo "${logstring}" | jq '.request_headers.json | fromjson'
{
  "baz": 3
}

The behavior that would have been reflected by the old log-format-escape-json setting (with quotes around the $request_header object so the log is still valid json) would look like this:

$ logstring='{"host":"localhost","request":"GET / HTTP/1.1","request_headers":"{\"user-agent\":\"curl\/7.79.1\",\"host\":\"localhost\",\"accept\":\"*\/*\",\"foo\":\"bar\",\"json\":\"{\\\"baz\\\": 3}\"}"}'
$ echo "${logstring}" | jq '.'
{
  "host": "localhost",
  "request": "GET / HTTP/1.1",
  "request_headers": "{\"user-agent\":\"curl/7.79.1\",\"host\":\"localhost\",\"accept\":\"*/*\",\"foo\":\"bar\",\"json\":\"{\\\"baz\\\": 3}\"}"
}
$ echo "${logstring}" | jq '.request_headers | fromjson'
{
  "user-agent": "curl/7.79.1",
  "host": "localhost",
  "accept": "*/*",
  "foo": "bar",
  "json": "{\"baz\": 3}"
}
$ echo "${logstring}" | jq '.request_headers | fromjson | .json | fromjson'
{
  "baz": 3
}

The components are still all there with the log-format-escape-json behavior if you can pass them individually through a json parser enough times, but with log-format-escape-none and handling all of the escaping in the cjson.encode call, multiple passes of json parsing on the components of the log aren't required to break it all down.

knbnnate avatar Jun 15 '22 16:06 knbnnate

@knbnnate I see slashes in the logs in this latest post here so is that a controller installed without your code change ?

In case it is relevant, the idea is to ;

  • go to your fork and the branch
  • ensure your code changes are visible in your branch
  • now run make dev-env while you are using your changes branch. This make dev-env, from your code change branch builds and installs the controller in a kind cluster with your code changes
  • deploy kubectl create deploy test0 --image nginx:alpine --port 80 && kubectl expose deploy test80 --port 80 && kubectl create ingress --class nginx --rule www.dev.knbnnate.com/*=test0:80
  • curl --resolve www.dev.knbnnate.com:80:127.0.0.1 http://www.dev.knbnnate.com -v
  • Check logs for format as per your code change

longwuyuan avatar Jun 15 '22 17:06 longwuyuan

@knbnnate I see slashes in the logs in this latest post here so is that a controller installed without your code change ?

In case it is relevant, the idea is to ;

  • go to your fork and the branch
  • ensure your code changes are visible in your branch
  • now run make dev-env while you are using your changes branch. This make dev-env, from your code change branch builds and installs the controller in a kind cluster with your code changes
  • deploy kubectl create deploy test0 --image nginx:alpine --port 80 && kubectl expose deploy test80 --port 80 && kubectl create ingress --class nginx --rule www.dev.knbnnate.com/*=test0:80
  • curl --resolve www.dev.knbnnate.com:80:127.0.0.1 http://www.dev.knbnnate.com -v
  • Check logs for format as per your code change

I'll edit my comment to clarify - this is reflecting the intended behavior, including if the correct json request_headers object contains a header containing more json inside it.

knbnnate avatar Jun 15 '22 18:06 knbnnate

ok. Partially understood. I thunk we just wait for developers to review this now. What is fromjson ? Am not a developer so never used it.

/ok-to-test

longwuyuan avatar Jun 16 '22 02:06 longwuyuan

ok. Partially understood. I thunk we just wait for developers to review this now. What is fromjson ? Am not a developer so never used it.

/ok-to-test

jq being a command-line utility for parsing and filtering json, fromjson is a filter within jq that basically just reapplies jq. So if you've pulled some escaped json out of the value of a json key using jq, you can then parse it without having to pass it to another invocation of jq.

knbnnate avatar Jun 16 '22 10:06 knbnnate

thanks for elaborating on fromjson

longwuyuan avatar Jun 16 '22 14:06 longwuyuan

/lgtm

strongjz avatar Sep 30 '22 14:09 strongjz

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knbnnate, strongjz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Sep 30 '22 14:09 k8s-ci-robot