netplan Add DuplicateAddressDetection settings for systemd-networkd (LP: #1959190)

Description

Adds a DuplicateAddressDetection parameter so that DAD can be configured; by default it is enabled for IPv6 and IPv4 link-local (ipv4ll) addresses.

This should largely close LP#1959190, since it provides a way to configure DAD, but it does not set any default value, to avoid breaking existing network configurations.

Checklist

[x] Runs make check successfully.
[x] Retains code coverage (make check-coverage).
[x] New/changed keys in YAML format are documented.
[ ] (Optional) Adds example YAML for new feature.
[x] (Optional) Closes an open bug in Launchpad.

Oct 19 '24 21:10 sanecz

We might also consider enabling this feature by default, similar to what Fedora 40 did in NetworkManager: https://fedoraproject.org/wiki/Changes/Enable_IPv4_Address_Conflict_Detection

E.g. defaulting to duplicate-address-detection: [ipv4, ipv6] (instead of [ipv4-ll, ipv6]). This would be a breaking change that we'd need to announce in the release notes and maybe relax for backports of the new version. But I still think it might be a feasible approach, as not having DAD enabled could lead to broken network configurations, which isn't any better.

Oct 24 '24 14:10 slyon

Hi @slyon

Thank you for your feedback, I really appreciate it!

As a non-blocking requirement, we should also consider adding: nm.c (NetworkManager) backend renderer

For the NetworkManager renderer, since I don't know it well, I'll take some time to explore and experiment with configurations for it.

Integration test(s), e.g. in tests/integration/ethernets.py

Sure I'll take a look and implement it.

[...] Also, we should probably choose something more specific than a string (char*) to store the data internally.

Agreed. As you mentioned in a later comment, an enum would do the job perfectly.

As an "ipv6" or "both" setting doesn't make sense for a specific IPv4 address and vice versa.

I entirely agree. What is the scope of netplan when dealing with a configuration that is syntactically correct but does not make sense? Should it be the responsibility of the chosen network backend to check, or should netplan do some coherence checks? [1] Also, 'both' might come in handy for automation if you don't want to check the type of the IP when writing the configuration (which is clearly what I did in the network configurations of my servers without thinking too much ;D)

[...] I don't think the DAD-per-address is the optimal solution here. [...] Besides that, I think the new duplicate-address-detection setting should move to the per-NetDef/interface level. [..] Furthermore, NetworkManager apparently only allows defining this per-connection, not per-address via its ipv4/6.dad-timeout settings.

It seems that systemd-networkd and NetworkManager really differ in how this is configured. networkd mentions DAD configuration only as an option of the address section, with only one address declared per section, and not much more. With NetworkManager, as you mentioned, DAD is enabled per interface by setting dad-timeout to a value > 0 for ipv4 and ipv6. So it's not easy to come up with a one-size-fits-both configuration. [2]

This also means that another (optional) configuration key, available only for nm, should exist to set the timeout instead of a fixed value. The networkd configuration does not seem to have an equivalent. (I think only the number of probes sent can be changed.)

duplicate-address-detection: [ipv4-ll, ipv6] # could be [ipv4], [ipv6], [ipv4, ipv6], or the empty list "[]"

True, this schema seems clearer than setting the networkd keywords directly, especially if nm is supported too!

We might also consider enabling this feature by default, similar to what was done in NetworkManager by Fedora 40. E.g. defaulting to duplicate-address-detection: [ipv4, ipv6] (instead of [ipv4-ll, ipv6]). This would be a breaking change that we'd need to announce in the release notes and maybe relax for backports of the new version. But I still think it might be a feasible approach, as not having DAD enabled could lead to broken network configurations, which isn't any better.

I deliberately didn't set a default value, so that the default behavior of the network backend applies. This avoids issues for some configurations (and questions from people wondering why the network takes a few seconds longer to come up).

--

However, as mentioned in [1], if netplan does coherence checks, we can check the family of the IP and set only the proper configuration keys per address/interface, or throw an error or a warning.

About the setting [2], I'm still not sure what the best idea would be:

if we set the configuration per netdef/interface, we lose some of the flexibility of systemd-networkd (probably only for unusual configurations or some cases I can't really think of?)
if we set the configuration per address, someone using nm could set "duplicate-address-detection: none" on one address and "duplicate-address-detection: ipv4" on another, and I don't really see how we can generate a configuration for this case.
support both models, disable the per-address one for nm

I'm looking forward to hearing your ideas!

Nov 07 '24 00:11 sanecz

Hi @slyon

Thank you for your feedback, I really appreciate it!

As a non-blocking requirement, we should also consider adding: nm.c (NetworkManager) backend renderer

For the NetworkManager renderer, since I don't know it well, I'll take some time to explore and experiment with configurations for it.

As mentioned, I see this as non-blocking. It's fine to focus on systemd-networkd for now. The implementation for NetworkManager could then be a follow-up PR.

Integration test(s), e.g. in tests/integration/ethernets.py

Sure I'll take a look and implement it.

Thanks!

[...] Also, we should probably choose something more specific than a string (char*) to store the data internally.

Agreed. As you mentioned in a later comment, an enum would do the job perfectly.

As an "ipv6" or "both" setting doesn't make sense for a specific IPv4 address and vice versa.

I entirely agree. What is the scope of netplan when dealing with a configuration that is syntactically correct but does not make sense? Should it be the responsibility of the chosen network backend to check, or should netplan do some coherence checks? [1]

IMO coherence checks should happen on the highest layer (e.g. inside Netplan), we don't want our users to dig down into the stack, to understand that they described an invalid configuration. We should make it obvious from the very start. In Netplan we have the src/validation.c stage which can be used for this. When some configuration is invalid, but not harmful, it might just log a warning (probably the case here). OTOH if Netplan can already tell that an invalid configuration will not work, it should error out from the validation stage.
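As a rough illustration of such a coherence check, here is a minimal Python sketch. Netplan's real validation stage is C code in src/validation.c, and this is not that code. The function name `check_dad_setting` is hypothetical, the per-address values (ipv4, ipv6, both, none) are the ones discussed in this thread, and reporting a family mismatch as a warning rather than an error is an assumption based on the "invalid, but not harmful" case described above. Accepting "both" without a warning is also an assumption.

```python
import ipaddress


def check_dad_setting(address, setting):
    """Return a warning string if `setting` does not fit `address`, else None.

    Hypothetical helper: `address` is a CIDR string such as "192.0.2.10/24",
    `setting` is one of "ipv4", "ipv6", "both", "none".
    """
    if setting not in ("ipv4", "ipv6", "both", "none"):
        # An unknown keyword can never work, so treat it as a hard error.
        raise ValueError(f"unknown duplicate-address-detection value: {setting!r}")

    # ip_interface() accepts both IPv4 and IPv6 addresses with a prefix length.
    version = ipaddress.ip_interface(address).version

    if setting in ("both", "none"):
        # Assumption: these are harmless for either address family.
        return None

    wanted = 4 if setting == "ipv4" else 6
    if wanted != version:
        return (f"duplicate-address-detection: {setting} has no effect "
                f"on IPv{version} address {address}")
    return None
```

For example, `check_dad_setting("192.0.2.10/24", "ipv6")` yields a warning message, while the same setting on `"2001:db8::1/64"` yields `None`.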

Also 'both' might come handy for automation if you don't want to check the type of the IP when writing the configuration (clearly what I did too on the network configurations of my servers without thinking too much ;D)

Sure, but I think we could cover this with the schema I suggested above, e.g.: duplicate-address-detection: [ipv4, ipv6]

[...] I don't think the DAD-per-address is the optimal solution here. [...] Besides that, I think the new duplicate-address-detection setting should move to the per-NetDef/interface level. [..] Furthermore, NetworkManager apparently only allows defining this per-connection, not per-address via its ipv4/6.dad-timeout settings.

It seems that systemd-networkd and NetworkManager really differ in how this is configured. networkd mentions DAD configuration only as an option of the address section, with only one address declared per section, and not much more. With NetworkManager, as you mentioned, DAD is enabled per interface by setting dad-timeout to a value > 0 for ipv4 and ipv6. So it's not easy to come up with a one-size-fits-both configuration. [2]

This also means that another (optional) configuration key, available only for nm, should exist to set the timeout instead of a fixed value. The networkd configuration does not seem to have an equivalent. (I think only the number of probes sent can be changed.)

Yes, unfortunately there are many such nuanced differences between networking backends. With Netplan we try to unify them in a best-effort approach. For this DAD case, I could think of setting NM's ipv4/6.dad-timeout to something like 200ms when [ipv4/6] gets enabled through Netplan, and logging a warning when ipv4-ll is selected, which we cannot clearly map to NetworkManager. The users would then still have the ability to override this default, using networkmanager.passthrough.ipv4.dad-timeout settings.

duplicate-address-detection: [ipv4-ll, ipv6] # could be [ipv4], [ipv6], [ipv4, ipv6], or the empty list "[]"

True, this schema seems clearer than setting the networkd keywords directly, especially if nm is supported too!

Thanks! I've also got +1 from Steve, our architect, about this (after some back channel discussions). So let's go with that. We just need to make sure all the options are clearly documented in doc/netplan-yaml.md (e.g. ipv4-ll is also valid by itself, ipv4 is a superset, including "ipv4-ll").
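To make the value semantics concrete (ipv4-ll valid on its own, ipv4 a superset that includes ipv4-ll, the empty list meaning nothing enabled), here is a minimal Python sketch of how the list could be folded into a flag-style enum. It is only an illustration of the schema, not Netplan's C implementation. The names `Dad`, `IPV4_NON_LL` and `parse_dad` are hypothetical.

```python
from enum import Flag


class Dad(Flag):
    """Hypothetical internal representation of duplicate-address-detection."""
    NONE = 0
    IPV4_LL = 1
    IPV4_NON_LL = 2                 # helper bit, not a YAML keyword
    IPV6 = 4
    IPV4 = IPV4_LL | IPV4_NON_LL    # "ipv4" is a superset including "ipv4-ll"


# YAML keywords from the approved schema mapped to flag values.
_KEYWORDS = {
    "ipv4-ll": Dad.IPV4_LL,
    "ipv4": Dad.IPV4,
    "ipv6": Dad.IPV6,
}


def parse_dad(values):
    """Fold a list such as ["ipv4-ll", "ipv6"] into a single Dad value."""
    result = Dad.NONE
    for value in values:
        if value not in _KEYWORDS:
            raise ValueError(f"unknown duplicate-address-detection value: {value!r}")
        result |= _KEYWORDS[value]
    return result
```

With this representation, `parse_dad(["ipv4"]) & Dad.IPV4_LL` is truthy, which mirrors the "ipv4 includes ipv4-ll" rule, and `parse_dad([])` is `Dad.NONE`.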

We might also consider enabling this feature by default, similar to what was done in NetworkManager by Fedora 40. E.g. defaulting to duplicate-address-detection: [ipv4, ipv6] (instead of [ipv4-ll, ipv6]). This would be a breaking change that we'd need to announce in the release notes and maybe relax for backports of the new version. But I still think it might be a feasible approach, as not having DAD enabled could lead to broken network configurations, which isn't any better.

I deliberately didn't set a default value, so that the default behavior of the network backend applies. This avoids issues for some configurations (and questions from people wondering why the network takes a few seconds longer to come up).

I agree. Let's try not to change any default for now. I'd suggest going with [ipv4-ll, ipv6] still, as that reflects the current behaviour. On NM this might potentially translate to ipv4.dad-timeout=-1 (as we cannot reflect "ipv4-ll" there) & ipv6.dad-timeout=200 (or maybe a better default timeout).
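The NetworkManager mapping suggested here could look roughly like the following Python sketch. It is not the nm.c renderer. The function name `nm_dad_timeouts` is hypothetical, the 200 ms value is the tentative default floated in this thread, and writing 0 for a family that is not listed (to switch DAD off explicitly) is an assumption that the thread does not settle.

```python
def nm_dad_timeouts(values, timeout_ms=200):
    """Map a duplicate-address-detection list to NM dad-timeout settings.

    Returns (settings, warnings). Hypothetical helper for illustration only.
    """
    enabled = set(values)
    settings = {}
    warnings = []

    if "ipv4" in enabled:
        settings["ipv4.dad-timeout"] = timeout_ms
    elif "ipv4-ll" in enabled:
        # "ipv4-ll" alone cannot be expressed in NM, so keep NM's default (-1)
        # and tell the user about it.
        settings["ipv4.dad-timeout"] = -1
        warnings.append("ipv4-ll cannot be mapped to NetworkManager; "
                        "leaving ipv4.dad-timeout at its default")
    else:
        # Assumption: 0 disables DAD for a family that is not listed.
        settings["ipv4.dad-timeout"] = 0

    settings["ipv6.dad-timeout"] = timeout_ms if "ipv6" in enabled else 0
    return settings, warnings
```

Called with `["ipv4-ll", "ipv6"]`, the sketch produces ipv4.dad-timeout=-1 and ipv6.dad-timeout=200 plus one warning, matching the translation proposed above.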

--

However, as mentioned in [1], if netplan does coherence checks, we can check the family of the IP and set only the proper configuration keys per address/interface, or throw an error or a warning.

About the setting [2], I'm still not sure what the best idea would be:

if we set the configuration per netdef/interface, we lose some of the flexibility of systemd-networkd (probably only for unusual configurations or some cases I can't really think of?)

if we set the configuration per address, someone using nm could set "duplicate-address-detection: none" on one address and "duplicate-address-detection: ipv4" on another, and I don't really see how we can generate a configuration for this case.

support both models, disable the per-address one for nm

I'm looking forward to hearing your ideas!

Right, there's a balance to strike here. With the proposed duplicate-address-detection: [ipv4-ll, ipv6] # could also be [ipv4], [ipv6], [ipv4, ipv6], or the empty list "[]" schema we're doing pretty good, IMO. It allows for some granularity on IP-address-type/range and we would also be able to extend it with more values in the future, e.g. "ipv6-ll", "ipv4-no-ll", ... (if needed).

Nov 11 '24 12:11 slyon

@sanecz Do the above comments clarify the next steps for you? Should there be any specific issues or blockers, feel free to ask about them here!

Using the recommended schema of duplicate-address-detection: [ipv4-ll, ipv6] # could be [ipv4], [ipv6], [ipv4, ipv6], or the empty list "[]" was approved by the Netplan architect.

Nov 28 '24 15:11 slyon

Hello @slyon, yes it is clear! Thank you :) I'm on it but not much time recently :/

Great news for approval of the schema!

Nov 29 '24 16:11 sanecz