
Add NLB IP dualstack mode support (ipv6 support for NLB)

Open logingood opened this issue 3 years ago • 21 comments

Description

The AWS Load Balancer Controller supports Service resources of type LoadBalancer. Currently, the dualstack annotation on a Service is ignored by external-dns, and the corresponding AAAA record is not created. This PR adds IPv6 support for Services, similar to the existing support for the Ingress resource/ALB.
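For example, a LoadBalancer Service using NLB IP mode with the dualstack annotation might look like the following (the hostname, selector, and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    # Ask the AWS Load Balancer Controller for an NLB in IP mode...
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb-ip"
    # ...with both IPv4 and IPv6 addresses.
    service.beta.kubernetes.io/aws-load-balancer-ip-address-type: "dualstack"
    # Hostname for external-dns to manage.
    external-dns.alpha.kubernetes.io/hostname: my-service.example.com
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 443
      targetPort: 8443
```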

Fixes #ISSUE

Checklist

  • [x] Unit tests updated
  • [x] End user documentation updated

logingood avatar Apr 15 '21 04:04 logingood

Welcome @logingood!

It looks like this is your first PR to kubernetes-sigs/external-dns 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/external-dns has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar Apr 15 '21 04:04 k8s-ci-robot

/kind feature

seanmalloy avatar Apr 15 '21 05:04 seanmalloy

Added a few comments. Did you test this @logingood? I don't think this is going to be enough to create AAAA records 🤔

I'll test it today. I was relying on this piece of code, which expects the annotation to be present:
https://github.com/kubernetes-sigs/external-dns/blob/a5baad26d7f4a7e1ff9ff2fefa6d566fe726a1cd/provider/aws/aws.go#L558-L568
And here's the test that validates this condition:
https://github.com/kubernetes-sigs/external-dns/blob/a5baad26d7f4a7e1ff9ff2fefa6d566fe726a1cd/provider/aws/aws_test.go#L932

logingood avatar Apr 15 '21 21:04 logingood


@Raffo I've just tested, it worked fine

(screenshots attached)

logingood avatar Apr 15 '21 22:04 logingood

@Raffo @seanmalloy any feedback on this? We are actively using the forked version, and it creates AAAA records for nlb-ip load balancers as expected. It would be very useful to get this behavior fixed upstream.

logingood avatar May 19 '21 22:05 logingood

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Aug 17 '21 23:08 k8s-triage-robot

Any movement on this?

crawforde avatar Sep 01 '21 23:09 crawforde

Hey, is there any update on this? @Raffo @seanmalloy, are there any alternatives?

logingood avatar Sep 01 '21 23:09 logingood

/remove-lifecycle stale

crawforde avatar Sep 02 '21 00:09 crawforde

/assign @njuettner

crawforde avatar Sep 02 '21 00:09 crawforde

Bump please?

krish7919 avatar Oct 08 '21 13:10 krish7919

How does this one compare to #2309?

forsberg avatar Oct 15 '21 04:10 forsberg

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 13 '22 05:01 k8s-triage-robot

How does this one compare to #2309?

This PR covers IPv6 for NLBs; #2309 might be more generic?

logingood avatar Jan 14 '22 22:01 logingood

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Mar 14 '22 13:03 k8s-triage-robot

/remove-lifecycle rotten

crawforde avatar Mar 14 '22 23:03 crawforde

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jun 12 '22 23:06 k8s-triage-robot

/remove-lifecycle stale

crawforde avatar Jun 22 '22 18:06 crawforde

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: crawforde, logingood, mesge, sylr. To complete the pull request process, please ask for approval from njuettner after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Jul 19 '22 11:07 k8s-ci-robot

For those interested I've built a docker image with this PR:

source: https://github.com/sylr/external-dns/commits/v0.12.0%2Bsylr
image: ghcr.io/sylr/external-dns:v0.12.0-2-gca826a18

In my case, I had to remove the TXT and A records of the services that use dualstack so that external-dns would create both A and AAAA records on the next run.

sylr avatar Jul 19 '22 12:07 sylr

Anyone know the status on this @Raffo @seanmalloy @njuettner?

Thanks @sylr.

cdobbyn avatar Aug 08 '22 22:08 cdobbyn

New custom build with this patch

source: https://github.com/sylr/external-dns/commits/v0.12%2Bsylr
image: ghcr.io/sylr/external-dns:v0.12.2-3-g264cde1a

sylr avatar Oct 06 '22 13:10 sylr

@Raffo @seanmalloy @njuettner I've been running this patch for 6 months now; can we merge it, please?

sylr avatar Jan 31 '23 13:01 sylr

Not sure why, but adding this to our service while using your latest image didn't work, @sylr. The AWS Load Balancer Controller did update it appropriately to be dualstack to the outside world.

service.beta.kubernetes.io/aws-load-balancer-ip-address-type: "dualstack"

ingress-nginx-service.yml.txt

Would love to see this working and then merged into mainline.

cdobbyn avatar Apr 11 '23 23:04 cdobbyn

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: crawforde, logingood, mesge, sylr. Once this PR has been reviewed and has the lgtm label, please ask for approval from njuettner. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Apr 16 '23 23:04 k8s-ci-robot

I've rebased just in case it remains relevant; this code has been running without issues for more than a year. If someone has a better or more generic way to implement this, that would be great!

logingood avatar Apr 16 '23 23:04 logingood

I just independently implemented this after needing the missing functionality and realizing external-dns did not support it. I had searched the issues before implementing it, but only searched the PRs as I went to open my own.

This is seemingly a perfectly reasonable PR, following the same approach as is used to handle the ALB dual stack annotation, and it's been open for over 2 years now. Multiple people have commented that it works and is useful for them. I don't understand the reticence to merge. If there is a problem with this approach, can you please provide guidance on a different approach which would be acceptable?

The issue at hand is caused by a specific quirk of Route53 ALIAS records, where both A and AAAA alias records need to be created for certain load balancers. The Endpoint domain model currently models these as CNAMEs. If you run external-dns with --aws-prefer-cname, things generally work as expected, since A and AAAA queries will resolve through the CNAME. But when the AWS provider converts these entries into ALIAS A records, it doesn't create the required AAAA record.

Some alternative options:

  1. Change the AWS provider to always create AAAA alias records as well. AFAICT, this does not change query response behavior. In particular, an AAAA query of an AAAA ALIAS record pointing to an NLB that does not itself have an AAAA record will return a 0-answer NOERROR response, the same response you get if there is no AAAA ALIAS record at all. I'm not sure if this would have scale or backwards compatibility concerns...

  2. Create a non-provider specific annotation which helps external-dns know to create the AAAA record. It will look pretty similar to the above, but maybe it will meet your requirements. It should only apply to ALIAS-type records, and I don't know if any other DNS providers have such things. Something like placing the following annotation on the relevant Service: external-dns.alpha.kubernetes.io/create-dualstack-alias-records: "true" or external-dns.alpha.kubernetes.io/alias-record-ip-address-type: "ipv6"|"ipv4"|"dualstack" or external-dns.alpha.kubernetes.io/alias-record-types: "a"|"aaaa"|"a, aaaa" or something...
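As an illustration of option 2, such an annotation on a Service might look like the following (the annotation key is one of the proposals from this comment, not an existing external-dns feature):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    # Hypothetical annotation proposed above; not implemented in external-dns.
    external-dns.alpha.kubernetes.io/alias-record-ip-address-type: "dualstack"
spec:
  type: LoadBalancer
```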

I would really love to see some solution here. It's a useful feature, and its absence is surprising when someone runs across it.

reltuk avatar Apr 18 '23 20:04 reltuk

A concern I have with this approach is that it is making assumptions about the behavior of the AWS Load Balancer Controller that aren't always going to be true. It's already possible to configure LBC to create dualstack ALBs by specifying that in the IngressClassParams instead of in an annotation. It's possible LBC could in the future be made configurable to create dual-stack NLBs by default, particularly since I plan to submit such a feature.

Is there a problem with always creating both A and AAAA alias records?

johngmyers avatar May 07 '23 21:05 johngmyers

@johngmyers I wonder why we need this at all, because if ALIAS would return A/AAAA, then CNAME to the ALIAS would just work. Of course it would depend on the client that resolves to ask for AAAA instead of A.

szuecs avatar May 08 '23 11:05 szuecs

@johngmyers I wonder why we need this at all, because if ALIAS would return A/AAAA, then CNAME to the ALIAS would just work. Of course it would depend on the client that resolves to ask for AAAA instead of A.

We need this due to a particular quirk of ALIAS records in Route53. In particular, ALIAS is not a record type; it's a type of record value, which allows a record entry to resolve to another existing record in Route53, whereas a CNAME is a record type with specific resolution semantics that resolvers follow. So the way to set up an AAAA record in Route53 using an ALIAS record value is to create the AAAA record, and if you want an A record, you have to create the A record.
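Concretely, a dual-stack alias in Route53 is two separate record sets of different types pointing at the same target, e.g. as a change batch for the ChangeResourceRecordSets API (all names and zone IDs below are placeholders):

```json
{
  "Comment": "Dual-stack alias for an NLB (names and IDs are placeholders)",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "my-service.example.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z000000000000",
          "DNSName": "my-nlb-0123456789.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "my-service.example.com",
        "Type": "AAAA",
        "AliasTarget": {
          "HostedZoneId": "Z000000000000",
          "DNSName": "my-nlb-0123456789.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    }
  ]
}
```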

As mentioned above, creating both records in all cases, regardless of whether the aliased name actually has an AAAA record, is potentially fine. Questions remain around backward compatibility, any potential scale concerns arising from blindly doubling the number of records, etc.

AFAICT, the concerns around IngressClassParams do not apply to this PR; they apply to existing code in external-dns. It's true that any provider-specific code will always need to be updated to support new behavior and functionality in providers. It's also true that the existing modeling of ALIASes as CNAMEs in source, and the use of an endpoint label to model dual-stack CNAMEs in source, makes it somewhat awkward to keep provider-specific logic out of source. Those don't seem like compelling reasons to leave external-dns broken for this use case. Certainly, further development work could improve the situation if the external-dns developers feel it warrants further investment.

reltuk avatar May 08 '23 18:05 reltuk

My proposal would be to remove endpoint.DualstackLabelKey, changing the AWS provider to act as if it was always present. That would probably keep all of the logic inside the AWS provider.

There might need to be some logic so that the provider would, possibly with the help of core, be able to reconcile discrepancies between the two alias records, such as the AAAA record not existing.

I can take a look at putting together such a PR if @logingood doesn't want dibs.

johngmyers avatar May 08 '23 19:05 johngmyers