
fix options ndots not in the second position

Open xkos opened this issue 3 years ago • 14 comments

What this PR does / why we need it:

When other options are added, ndots may not be in the second position, which can cause performance problems when requesting short-form internal SVC addresses. This PR fixes this problem.

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation only

Which issue/s this PR fixes

How Has This Been Tested?

Checklist:

  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [x] I've read the CONTRIBUTION guide
  • [ ] I have added tests to cover my changes.
  • [ ] All new and existing tests passed.

xkos avatar May 17 '22 05:05 xkos

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: xkos / name: lilin (bc18682ce14b89f148bcdec659e728743fedf607)

@xkos: This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar May 17 '22 05:05 k8s-ci-robot

Welcome @xkos!

It looks like this is your first PR to kubernetes/ingress-nginx 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/ingress-nginx has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar May 17 '22 05:05 k8s-ci-robot

Hi @xkos. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar May 17 '22 05:05 k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull request has been approved by: xkos. To complete the pull request process, please assign rikatz after the PR has been reviewed. You can assign the PR to them by writing /assign @rikatz in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar May 17 '22 05:05 k8s-ci-robot

Hi @xkos ,

Thank you for your contribution. The default ndots configuration is dictated upstream and is not usually changed by the ingress-nginx controller.

It looks like your change could solve a specific use case that you may have experienced, but you yourself only say "may", so there is no clear data here that proves beyond doubt that this change would be an improvement overall.

I myself am not in favor of this change. But if you can post data such as the entire configuration and tests, with commands and their outputs and a numeric comparison of states and tests both with and without this change, it will be worthwhile to study the data.

Let's see what others think, but a BIG NO from me for this change in its current state, as it is related to something as critical as DNS and could negatively impact users.

longwuyuan avatar May 17 '22 05:05 longwuyuan

I'm also not sure what this really fixes. Can you provide more information on the before/after, where you get the data from performance, etc?

Thanks

rikatz avatar May 18 '22 15:05 rikatz

Hi @xkos ,

Thank you for your contribution. The default ndots configuration is dictated upstream and is not usually changed by the ingress-nginx controller.

It looks like your change could solve a specific use case that you may have experienced, but you yourself only say "may", so there is no clear data here that proves beyond doubt that this change would be an improvement overall.

I myself am not in favor of this change. But if you can post data such as the entire configuration and tests, with commands and their outputs and a numeric comparison of states and tests both with and without this change, it will be worthwhile to study the data.

Let's see what others think, but a BIG NO from me for this change in its current state, as it is related to something as critical as DNS and could negatively impact users.

example:

```yaml
      dnsConfig:
        options:
        - name: single-request-reopen
        - name: timeout
          value: "1"
        - name: ndots
          value: "2"
```

If more than one entry is added to a workload's dnsConfig.options without noticing this ordering, ndots parsing fails and the ndots option does not take effect, so DNS resolution behavior is inconsistent with expectations.

xkos avatar May 19 '22 07:05 xkos

I still don't understand. I will wait and see if you can provide a simple, clear end-to-end example WITH data: show a cluster and its config, some workload and its config, some HTTP requests and their details, etc.

I can try to guess what you are saying, but so many users will be impacted by DNS that, unless there is data in this PR that is clear, simple, and easy for everyone to understand, I am not sure what the next steps will be. Let's wait for other comments, I guess.

longwuyuan avatar May 19 '22 07:05 longwuyuan

I still don't understand. I will wait and see if you can provide a simple, clear end-to-end example WITH data: show a cluster and its config, some workload and its config, some HTTP requests and their details, etc.

I can try to guess what you are saying, but so many users will be impacted by DNS that, unless there is data in this PR that is clear, simple, and easy for everyone to understand, I am not sure what the next steps will be. Let's wait for other comments, I guess.

An example:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.2.0
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  ...
      dnsPolicy: ClusterFirst
      dnsConfig:
        options:
        - name: single-request-reopen
        - name: timeout
          value: "1"
        - name: ndots
          value: "2"
      ...
```

This Deployment uses multiple options in dnsConfig, and this is the container's resolv.conf:

```
nameserver xx.xx.xx.xx
search NS.svc.cluster.local svc.cluster.local cluster.local
options single-request-reopen timeout:1 ndots:2
```

With this dnsConfig I want ndots to be 2, but in resolv_conf.lua its value is still 1:

```lua
local nameservers, search, ndots = {}, {}, 1
...
local function set_ndots(parts)
  local option = parts[2] -- the second field is not ndots with this dnsConfig.options
  if not option then
    return
  end

  local option_parts, err = ngx_re_split(option, ":")
  if err then
    ngx_log(ngx_ERR, err)
    return
  end

  if option_parts[1] ~= "ndots" then
    return
  end

  ndots = tonumber(option_parts[2])
end
```
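The fix is to stop assuming ndots is the second field and instead scan every option on the line. A minimal sketch of position-independent parsing (an editor's illustration in Python, not the actual Lua patch in this PR):

```python
def parse_ndots(options_line, default=1):
    """Parse ndots from a resolv.conf 'options' line regardless of position.

    options_line is the full line, e.g.
    'options single-request-reopen timeout:1 ndots:2'
    """
    for field in options_line.split()[1:]:  # skip the 'options' keyword
        name, _, value = field.partition(":")
        if name == "ndots" and value.isdigit():
            return int(value)
    return default  # resolv.conf default when ndots is absent
```

With this approach, `parse_ndots("options single-request-reopen timeout:1 ndots:2")` yields 2 no matter where ndots appears among the options.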

and in dns.lua:

```lua
  -- for non fully qualified domains if number of dots in
  -- the queried host is less than resolv_conf.ndots then we try
  -- with all the entries in resolv_conf.search before trying the original host
  --
  -- if number of dots is not less than resolv_conf.ndots then we start with
  -- the original host and then try entries in resolv_conf.search
  local _, host_ndots = host:gsub("%.", "")
  local search_start, search_end = 0, #resolv_conf.search
  if host_ndots < resolv_conf.ndots then
    search_start = 1
    search_end = #resolv_conf.search + 1
  end
```

When ingress-nginx resolves a Service such as demo-svc.demo, it requests demo-svc.demo first. Since this is not a valid domain, that request is bound to fail with "no such name"; the query is retried recursively all the way up to a.root-servers.net and costs a lot of time (it took 100 ms in my environment).
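The search-order logic quoted above from dns.lua can be sketched as follows (illustrative Python with assumed equivalent semantics: names with fewer dots than ndots go through the search list before the literal name, otherwise the literal name is tried first):

```python
def query_order(host, search, ndots):
    """Return the list of names tried, in order, for a non-FQDN host."""
    candidates = [host + "." + domain for domain in search]
    if host.count(".") < ndots:
        # fewer dots than ndots: expand through the search list first
        return candidates + [host]
    # otherwise try the literal name first, then the search list
    return [host] + candidates
```

This makes the reported cost visible: with ndots stuck at the default of 1, `query_order("demo-svc.demo", [...], 1)` tries the literal (invalid) name first, whereas with ndots correctly parsed as 2 the in-cluster search-list expansion is tried first.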

xkos avatar May 24 '22 08:05 xkos

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.2.0
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  ...
      dnsPolicy: ClusterFirst
      dnsConfig:
        options:
        - name: single-request-reopen
        - name: timeout
          value: "1"
        - name: ndots
          value: "2"
      ...
```

I don't see any connection between this yaml and resolv_conf.lua. Can you explain? So far nothing makes sense in the way it has been described here.

longwuyuan avatar May 24 '22 08:05 longwuyuan

dnsConfig controls /etc/resolv.conf in the container, right?

When dnsConfig options are set, resolv.conf's content is modified to this:

```
nameserver xx.xx.xx.xx
search NS.svc.cluster.local svc.cluster.local cluster.local
options single-request-reopen timeout:1 ndots:2
```

(The options line is the one modified, because of the dnsConfig in the Deployment yaml.)

resolv_conf.lua parses this file; when ndots:2 is not in the second field, resolv_conf.lua sets the wrong ndots value.
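The failure mode described above is easy to reproduce in isolation. A minimal Python sketch mimicking a parser that only inspects the second field of the options line (the behavior of the current set_ndots, not the real Lua code) shows ndots silently falling back to its default of 1:

```python
def second_field_ndots(options_line, default=1):
    """Mimic the buggy logic: only look at the first field after 'options'."""
    fields = options_line.split()
    if len(fields) < 2:
        return default
    name, _, value = fields[1].partition(":")  # only this position is checked
    if name == "ndots" and value.isdigit():
        return int(value)
    return default  # any other option in that slot hides ndots entirely
```

Here `second_field_ndots("options ndots:2")` returns 2, but `second_field_ndots("options single-request-reopen timeout:1 ndots:2")` returns 1, which matches the behavior reported in this PR.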

xkos avatar May 24 '22 12:05 xkos

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Aug 22 '22 13:08 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Sep 21 '22 13:09 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Oct 21 '22 14:10 k8s-triage-robot

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Oct 21 '22 14:10 k8s-ci-robot