bottlerocket icon indicating copy to clipboard operation
bottlerocket copied to clipboard

update netdog to a more recent version of the toml library

Open webern opened this issue 2 years ago • 6 comments

Netdog cannot be easily upgraded to newer versions of the toml library because the feature perserve_order no longer exists in newer versions of toml. When I tried removing the cargo feature and updating it to 0.7, a few tests failed.

webern avatar Sep 06 '23 20:09 webern

I wasn't immediately sure why order mattered, so documenting what I found in the code:

#[derive(Debug, Deserialize)]
pub(crate) struct NetConfigV1 {
    // Use an IndexMap to preserve the order of the devices defined in the net.toml.  The TOML
    // library supports this through a feature making use of IndexMap.  Order is important because
    // we use the first device in the list as the primary device if the `primary` key isn't set for
    // any of the devices.

markusboehme avatar Jan 08 '24 11:01 markusboehme

On a brief look I couldn't find the preserve_order feature of toml being removed. I updated the lib and produced a metal-dev build that passes the unit tests. I can send a PR for that. There's a bunch of other dogs that use the old version of toml. @webern Do you know why those might have been kept back?

markusboehme avatar Jan 08 '24 13:01 markusboehme

I think that is fine. I think if unit tests are passing for all variants the we should be good. Net.toml is well tested and also the code has been worked on a fair amount since I opened this issue. If the tests are passing now then it would seem the behavior is fine.

Do you know why those might have been kept back?

I think it is because we have a weird lint that makes us feel like using two versions of a library is a bad thing. I'm not a fan of the lint (in cargo-deny).

webern avatar Jan 08 '24 18:01 webern

Do you know why those might have been kept back?

I think it is because we have a weird lint that makes us feel like using two versions of a library is a bad thing. I'm not a fan of the lint (in cargo-deny).

I may be misunderstanding you, but I'm not sure this would be the reason--there's already two different versions of the crate in use.

markusboehme avatar Jan 08 '24 18:01 markusboehme

Currently we do conditional compilation for netdog and I'm not confident we run all unit tests right now automatically that might catch this. One can run them manually to ensure this is working as expected by passing different backend options with something similar to VARIANT=metal-dev SYSTEMD_NETWORKD=1 cargo test vs VARIANT=metal-dev SYSTEMD_NETWORKD=0 cargo test from the sources/api/netdog directory to ensure all the tests are run when testing this update. Order is really important here for the primary interface and we will break users if this changes behavior so its worth being a little extra cautious while we bump this.

yeazelm avatar Jan 08 '24 19:01 yeazelm

The build bot should run all unit tests on PR submission: It runs across all variants, and the build does include a cargo make unit-tests. The variants shipping k8s 1.28 set the systemd-networkd feature flag.

Personally, I think the "first interface is primary unless explicitly specified" is a bit of a misfeature, but one we're stuck with. At least the requirement for ordering is documented in the code (see first comment).

markusboehme avatar Jan 09 '24 10:01 markusboehme

Closed by #3830

webern avatar Mar 22 '24 16:03 webern