dns icon indicating copy to clipboard operation
dns copied to clipboard

bump coredns version to fix 404 health check errors

Open xh4n3 opened this issue 3 years ago • 15 comments

node-cache is using CoreDNS 1.7.0 which produces 404 health check errors, see #381 bumping CoreDNS to 1.8.1 should fix it.

xh4n3 avatar Sep 28 '22 07:09 xh4n3

Welcome @xh4n3!

It looks like this is your first PR to kubernetes/dns 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/dns has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar Sep 28 '22 07:09 k8s-ci-robot

Hi @xh4n3. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Sep 28 '22 07:09 k8s-ci-robot

/assign @dpasiukevich

MrHohn avatar Sep 28 '22 16:09 MrHohn

Thanks for the PR!

Let me reevaluate the issue https://github.com/kubernetes/dns/issues/476 to see if we could bump coredns version without any regression. Also I will check if it's better to update to newer coredns (1.9.4 or 1.10.0)

dpasiukevich avatar Oct 04 '22 12:10 dpasiukevich

/ok-to-test

dpasiukevich avatar Oct 10 '22 09:10 dpasiukevich

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: xh4n3 Once this PR has been reviewed and has the lgtm label, please ask for approval from dpasiukevich by writing /assign @dpasiukevich in a comment. For more information see:The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Oct 10 '22 10:10 k8s-ci-robot

On the first look it appears because of the pinned version for skydns (used by kubedns) we cannot upgrade to CoreDNS to higher version than 1.8.4 Unless skydns repo is forked and the codebase and its dependencies are upgraded. So having coredns 1.9.4 or 1.10.0 is not possible atm.

So let's have coredns 1.8.1 as in this PR. Just let me check the performance tests if it's safe to upgrade and https://github.com/kubernetes/dns/issues/476 is still valid.

dpasiukevich avatar Oct 11 '22 09:10 dpasiukevich

we cannot upgrade to CoreDNS to higher version than 1.8.4

@dpasiukevich is it because the newer CoreDNS is relying on some vendors which conflict with skydns'?

xh4n3 avatar Oct 11 '22 10:10 xh4n3

Yeah, there are some shared transitive dependencies to grpc client. At least I found transitive dependency collision for google.golang.org/grpc/naming. Other details are described here https://github.com/kubernetes/dns/issues/505

  1. Could I ask you to check if that's possible to upgrade to CoreDNS 1.9.4 or 1.10.0? Maybe I overlooked something and it's possible to do it effortlessly. (as I can see right now - skydns repo has to be forked and its codebase+dependecies updated)
  2. Meanwhile I will run scalability tests on this PR to see if there's a cache miss regression https://github.com/kubernetes/dns/issues/476)

dpasiukevich avatar Oct 11 '22 10:10 dpasiukevich

I was trying to pin grpc to v1.29.1 to solve the missing naming package issue. But it turns out that since v1.9.1, CoreDNS requires gRPC-Go v1.32.0 or later, see https://github.com/coredns/coredns/blob/v1.9.1/pb/dns_grpc.pb.go#L18-L19 So the conflict is inevitably happening here.

The solution can be:

  1. Stick with CoreDNS older then v1.9.1
  2. Eliminate the code that using github.com/coreos/etcd/client package in the k8s.io/dns/pkg/dns, as I found the code are just using the etcd Error types from the vendor.

I'd like to go with the second solution, how do you think? @dpasiukevich

xh4n3 avatar Oct 12 '22 11:10 xh4n3

Regarding the

Eliminate the code that using github.com/coreos/etcd/client package in the k8s.io/dns/pkg/dns, as I found the code are just using the etcd Error types from the vendor.

I checked even if we upgrade the k8s.io/dns/pkg/dns to a newer version of etcd client, there's still: cmd/kube-dns/app/server -> github.com/skynetservices/skydns/server -> github.com/coreos/etcd/client

So we'd have to either fork or fix the dependencies in the skydns package.

The main problem is that kubernetes/dns repo hosts 2 binaries (on the high level) - kubedns and nodelocaldns. kubedns is virtually deprecated but its pinned dependencies cause us the trouble to upgrade nodelocaldns dependencies.

I'd suggest in the scope of this PR to just have maximum possible CoreDNS version (would be great if it's possible to build with coredns 1.9.0).

dpasiukevich avatar Oct 12 '22 15:10 dpasiukevich

You could also run make images-clean; make build-amd64; make containers-amd64; make test locally to check the build

dpasiukevich avatar Oct 13 '22 06:10 dpasiukevich

@dpasiukevich Thanks for the tip. There are a lot vendor conflicts, so I did a thorough clean up to the go.mod. Please help me check if any other issues. Maybe we should host the deprecated kube-dns to another repository in the future.

xh4n3 avatar Oct 13 '22 08:10 xh4n3

Cool, thanks! I also found that coredns 1.8.4 is the max version we can have right now without fixing dependency chain.

Right now I only need to run scalability tests to verify if https://github.com/kubernetes/dns/issues/476 is reproducible and has an impact currently. The only thing is that running the scalability tests with the custom built image is a bit painful.

dpasiukevich avatar Oct 13 '22 15:10 dpasiukevich

Hi, I tried to run performance tests and verify there's no scalability regression mentioned in https://github.com/kubernetes/dns/issues/476 but couldn't find time before going on vac for 2 weeks.

I've asked my teammates to take a look if there's capacity or I will check it when I return.

dpasiukevich avatar Oct 14 '22 14:10 dpasiukevich

I imported skydns, upgarded its dependences, which allowed to upgrade to the latest coredns 1.10.0 in https://github.com/kubernetes/dns/pull/551

dpasiukevich avatar Nov 09 '22 07:11 dpasiukevich