Use `port.Name` to populate SRV records
Description
Users understandably name their services with a self-explanatory identifier rather than with the canonical protocol name.
Let's imagine we use the Bitnami mongodb-sharded chart to create a database that we'll expose using SRV records. The chart will create a service along the lines of:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: orders-db
  annotations:
    external-dns.alpha.kubernetes.io/hostname: orders.acme.com
spec:
  type: NodePort
  ports:
    - name: mongodb
      port: 27017
      targetPort: mongodb
    - name: metrics
      port: 9216
      targetPort: metrics
```
Not only will the MongoDB client expect to connect using the SRV name `_mongodb._tcp.orders.acme.com` while the record that actually gets created is `_orders-db._tcp.orders.acme.com`, but that SRV record will also carry two values:
```
_orders-db._tcp.orders.acme.com. 300 IN SRV 0 50 30225 orders.acme.com.
_orders-db._tcp.orders.acme.com. 300 IN SRV 0 50 31971 orders.acme.com.
```
It would be much better to build the FQDN from the port's name, producing the following two records instead:
```
_mongodb._tcp.orders.acme.com. 300 IN SRV 0 50 31971 orders.acme.com.
_metrics._tcp.orders.acme.com. 300 IN SRV 0 50 30225 orders.acme.com.
```
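To illustrate the idea, here is a minimal sketch (not the actual external-dns source; the helper names and record layout are my own): the SRV owner name is derived from `port.Name` and `port.Protocol`, while targets that share a named port are still grouped under the same name.

```go
package main

import (
	"fmt"
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// srvNameForPort builds the SRV owner name from the port name and protocol,
// e.g. "_mongodb._tcp.orders.acme.com".
func srvNameForPort(port corev1.ServicePort, hostname string) string {
	proto := strings.ToLower(string(port.Protocol)) // "tcp" or "udp"
	if proto == "" {
		proto = "tcp"
	}
	return fmt.Sprintf("_%s._%s.%s", port.Name, proto, hostname)
}

// srvValuesForService collects one "priority weight port target" value per
// exposed node port under its port-name-based owner name, so several entries
// for the same named port still end up in one record set.
func srvValuesForService(svc corev1.Service, hostname string) map[string][]string {
	values := map[string][]string{}
	for _, port := range svc.Spec.Ports {
		name := srvNameForPort(port, hostname)
		values[name] = append(values[name],
			fmt.Sprintf("0 50 %d %s", port.NodePort, hostname))
	}
	return values
}

func main() {
	svc := corev1.Service{
		Spec: corev1.ServiceSpec{
			Ports: []corev1.ServicePort{
				{Name: "mongodb", Protocol: corev1.ProtocolTCP, NodePort: 31971},
				{Name: "metrics", Protocol: corev1.ProtocolTCP, NodePort: 30225},
			},
		},
	}
	for name, vals := range srvValuesForService(svc, "orders.acme.com") {
		for _, v := range vals {
			fmt.Printf("%s 300 IN SRV %s\n", name, v)
		}
	}
}
```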
Fixes #4000
NOTE: As this is my first PR here and I don't have much experience with Go, I'll wait for a reply before updating the unit tests and running the whole suite.
Checklist
- [ ] Unit tests updated
- [ ] End user documentation updated
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign szuecs for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
The committers listed above are authorized under a signed CLA.
- :white_check_mark: login: kerbaras / name: Matias Pierobon (95cf92a9cdf3bab5c0d936d546853a6d4cc1d94c)
Welcome @kerbaras!
It looks like this is your first PR to kubernetes-sigs/external-dns 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.
You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.
You can also check if kubernetes-sigs/external-dns has its own contribution guidelines.
You may want to refer to our testing guide if you run into trouble with your tests not passing.
If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!
Thank you, and welcome to Kubernetes. :smiley:
Hi @kerbaras. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Mmmh, it's interesting but I'm not sure it makes sense :thinking:.
Using multiple ports on the same service, on an SRV record, is often used for high availability. external-dns should not break that.
=> My understanding is that your issue should not be solved like this. It should be solved with another service for metrics.
Let me know if I missed something.
> Using multiple ports on the same service, on an SRV record, is often used for high availability. external-dns should not break that.
This is mainly to support different nodes exposing the same port, not to map multiple ports.
If we follow the linked example, we can see the following:
```
; _service._proto.name. TTL   class SRV priority weight port target.
_sip._tcp.example.com.  86400 IN    SRV 10       60     5060 bigbox.example.com.
_sip._tcp.example.com.  86400 IN    SRV 10       20     5060 smallbox1.example.com.
_sip._tcp.example.com.  86400 IN    SRV 10       20     5060 smallbox2.example.com.
_sip._tcp.example.com.  86400 IN    SRV 20       0      5060 backupbox.example.com.
```
where four hosts export the same SIP port. In this case the port is 5060 and the four targets are bigbox.example.com, smallbox1.example.com, smallbox2.example.com and backupbox.example.com, which is a different use case.
Here we have more than one port exported by a Kubernetes service (27017 and 9216), each speaking a different protocol. Following the SRV standard, two SRV records are needed. I found it hard to come up with an example where the same service would export two ports that speak the same protocol.
With the proposed change both cases are supported: multiple targets under the same named port produce a load-balancing SRV record set, and ports for different protocols produce separate SRV records.
To me it seems like an obscure case to have the following:
```
; _service._proto.name. TTL   class SRV priority weight port target.
_sip._tcp.example.com.  86400 IN    SRV 10       60     5060 bigbox.example.com.
_sip._tcp.example.com.  86400 IN    SRV 10       20     5061 bigbox.example.com.
```
since bigbox.example.com resolves to a single IP address, this suggests that the same machine is listening on both port 5060 and 5061 for the same use case, which is neither high availability nor a common scenario.
But maybe I misunderstood you.
OK, I see your point. The record format specifies that the first field is supposed to be the service, and in Kubernetes the port name can be a reference to that service name.
I am like you on this:
> I found it hard to come up with an example where the same service would export two ports that speak the same protocol.
It's a breaking change, but why not?
@Raffo @szuecs @johngmyers WDYT about it?
@kerbaras This PR will need tests and a documentation update to be reviewed. WDYT about adding a CLI arg to enable this behavior? That would keep this feature from being a breaking change.
> WDYT about adding a CLI arg to enable this behavior?
Sounds good. I'll get to it as soon as I get some free time :)
We need a test that proves it works, or we will break it in the future. It is also better to add an option to enable this changed behavior rather than make a breaking change. This project is too big to have a breaking change in "service".
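A rough idea of what the opt-in could look like, as a standalone sketch only: the flag name `--srv-port-name` is hypothetical and not an existing external-dns option, and the current naming stays the default.

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// Hypothetical flag, off by default, so the existing
	// "_<service-name>._<proto>" naming is kept unless opted in.
	usePortName := flag.Bool("srv-port-name", false,
		"use the Kubernetes port name as the SRV service label (hypothetical)")
	flag.Parse()

	serviceName, portName, proto, hostname := "orders-db", "mongodb", "tcp", "orders.acme.com"

	label := serviceName
	if *usePortName {
		label = portName
	}
	fmt.Printf("_%s._%s.%s\n", label, proto, hostname)
}
```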
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed
You can:
- Mark this PR as fresh with `/remove-lifecycle stale`
- Close this PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed
You can:
- Mark this PR as fresh with `/remove-lifecycle rotten`
- Close this PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
EDIT: Ah, sorry, I got confused by my GitHub notifications and thought this was the issue, not a PR. This can probably be closed since it hasn't seen updates in a while.
/close
@mloiseleur: Closed this PR.
In response to this:
> /close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.