chef-datadog icon indicating copy to clipboard operation
chef-datadog copied to clipboard

Update to current Chef community conventions

Open jeffbyrnes opened this issue 3 years ago • 39 comments

~Lots of cleanup here, primarily driving towards Chef 17 compatibility.~

Provides:

  • Updated support/config files, consistent w/ current Chef community conventions
  • Unit tests are performed against latest Chef
  • Integration tests are performed against multiple versions of Chef
  • Linting is done via cookstyle

[Update 2022-08-25]

Buried in a comment way down is this, from me:

I looked at the unit test output for https://github.com/DataDog/chef-datadog/pull/852, and it may be that this particular unit test didn’t get run? Quite a few resources were untouched by that ChefSpec run.

I ran the unit tests locally against https://github.com/DataDog/chef-datadog/commit/e77a6c3e8cf203ffdabafb8167932d948ceafe74 & a lot of the tests failed so… I think maybe the nature of how the tests are run on main is obfuscating things.

This PR, in part, fixes that & clears up the running of tests, and also ensures the integration tests run against multiple distros, distro versions, Chef versions, and Datadog Agent versions, which the current main branch does not do (despite appearances).

Something to note that this PR brings to bear. I’ve not attempted to fix these test failures, b/c I think that’s out-of-scope for these changes, but we should be mindful all the same.

[Update 2022-09-13]

Figured out a bit more about the unit tests while doing a larger cleanup of them (a separate PR will appear for that once this one actually merges).

Check the individual commits to see what sorted that out in the end.

[Update 2022-10-12]

  • Rebased against main branch again
  • Added cookstyle --autofix lint fixes to make builds totally green

jeffbyrnes avatar Sep 14 '21 19:09 jeffbyrnes

Hi @jeffbyrnes! In my tests, the cookbook seems to work fine on Chef 17 without any changes. Did you find it was not the case for you? If there's any change that is needed, we should find a way to incorporate it while still keeping compatibility with older versions of Chef, though.

albertvaka avatar Sep 21 '21 16:09 albertvaka

@albertvaka Chef 17 wants resources to set unified_mode true for compatibility, and that also means using yum >= 7.0.0 (b/c that’s the first version that’s Chef 17-compatible).

As for older versions of Chef, only Chef 16 & Chef 17 are supported right now, Chef <= 15 are end-of-life. Also, testing across a matrix of that many Chef version is challenging, since CircleCI has limits on matrix size, and GitHub Actions charges based on action usage.

Relatedly, the test config wasn’t testing this against current OS versions (e.g., CentOS 8, Debian 10, Ubuntu 20.04), so that seems like a good thing to add, too.

Still, it’s y’all’s cookbook! I was following the example of the community (mainly as exemplified by @sous-chefs) in terms of cookbooks.

I’ll spin up a new branch & add the bare minimum of changes to try things out on Chef 17, and we can see what that looks like.

jeffbyrnes avatar Sep 21 '21 16:09 jeffbyrnes

So just to see what’s different between our testing, I took a look through the log from the CircleCI Kitchen test from your commit adding Chef 17 testing, in 7b39ef2 and, uh, it seems as though it’s passing b/c it’s not actually testing against Chef 17. Or 16. Or any version except Chef 14:

🕙 16:35:21 jeffbyrnes in datadog on  bare-minimum-chef-17 [?] via  v3.0.0 
♠ grep 'packages.chef.io' ~/Downloads/datadog-log.txt 
       url	https://packages.chef.io/files/stable/chef/14.12.9/ubuntu/14.04/chef_14.12.9-1_amd64.deb
       downloading https://packages.chef.io/files/stable/chef/14.12.9/ubuntu/14.04/chef_14.12.9-1_amd64.deb
       url	https://packages.chef.io/files/stable/chef/14.12.9/el/6/chef-14.12.9-1.el6.x86_64.rpm
       downloading https://packages.chef.io/files/stable/chef/14.12.9/el/6/chef-14.12.9-1.el6.x86_64.rpm
       url	https://packages.chef.io/files/stable/chef/14.12.9/el/7/chef-14.12.9-1.el7.x86_64.rpm
       downloading https://packages.chef.io/files/stable/chef/14.12.9/el/7/chef-14.12.9-1.el7.x86_64.rpm
       url	https://packages.chef.io/files/stable/chef/14.12.9/debian/8/chef_14.12.9-1_amd64.deb
       downloading https://packages.chef.io/files/stable/chef/14.12.9/debian/8/chef_14.12.9-1_amd64.deb
       url	https://packages.chef.io/files/stable/chef/14.12.9/el/6/chef-14.12.9-1.el6.x86_64.rpm
       downloading https://packages.chef.io/files/stable/chef/14.12.9/el/6/chef-14.12.9-1.el6.x86_64.rpm
       url	https://packages.chef.io/files/stable/chef/14.12.9/el/7/chef-14.12.9-1.el7.x86_64.rpm
       downloading https://packages.chef.io/files/stable/chef/14.12.9/el/7/chef-14.12.9-1.el7.x86_64.rpm
       url	https://packages.chef.io/files/stable/chef/14.12.9/el/6/chef-14.12.9-1.el6.x86_64.rpm
       downloading https://packages.chef.io/files/stable/chef/14.12.9/el/6/chef-14.12.9-1.el6.x86_64.rpm
       url	https://packages.chef.io/files/stable/chef/14.12.9/el/7/chef-14.12.9-1.el7.x86_64.rpm
       downloading https://packages.chef.io/files/stable/chef/14.12.9/el/7/chef-14.12.9-1.el7.x86_64.rpm

So… this PR may be revealing that this cookbook isn’t being tested the way we think it’s being tested, and showing us that things are in fact broken in ways that were previously hidden.

I’ve rebased my branch so it can merge again, and dropped the GitHub Actions bits b/c y’all like to use CircleCI. That said, it’s easier to see what’s breaking via GH Actions, b/c they allow for a larger matrix of tests. You can check that out in my fork. In many cases, it appears to be b/c a particular package is missing for a specific distro version.

I’ll see if I can figure out why the unit tests aren’t working on CircleCI, but for me locally:

Finished in 38.24 seconds (files took 2.53 seconds to load)
815 examples, 0 failures

Randomized with seed 34741

[Coveralls] Outside the CI environment, not sending data.

jeffbyrnes avatar Sep 21 '21 20:09 jeffbyrnes

Figured out the unit tests: they needed to have CHEF_LICENSE: accept-no-persist set in their environment.

jeffbyrnes avatar Sep 21 '21 20:09 jeffbyrnes

Hmm, looks like adjusting things to work on Chef 17 means dropping Chef 14 support:

[2021-09-21T20:54:07+00:00] FATAL: Chef::Exceptions::CookbookChefVersionMismatch: Cookbook 'yum' version '7.1.0' depends on chef version [">= 15.3"], but the running chef version is 14.15.6

jeffbyrnes avatar Sep 21 '21 20:09 jeffbyrnes

I'm using kitchen to test, where we have different OSes and Chef versions. Since the unified_mode warnings are non-critical, I would rather keep compatibility with Chef 14 given the amount of people that still uses that version (even though it's EOLd).

albertvaka avatar Sep 22 '21 08:09 albertvaka

@albertvaka well, based on what CircleCI is doing with Kitchen, are you sure that it’s actually running against the newer versions of Chef? CircleCI is supposedly doing the same thing you are locally, but it’s doing all of it using Chef 14, which makes me wonder.

jeffbyrnes avatar Sep 22 '21 19:09 jeffbyrnes

Circle doesn't run all the available tests.

albertvaka avatar Sep 23 '21 09:09 albertvaka

@albertvaka ah! With the changes I made in 3f84d20fa4b6aa93f6022a2f592ec97e4144e62e & f769ecf9fa59171f396c6816f2d8a03129abacab, now it does! However, to fit them into the limits that CircleCI places, the jobs break down into integration-${agent_major_version}-${chef_version}-${platform}-${platform_version}

This means that each job runs every suite, meaning if any one suite fails, the entire job fails.

The actual list of failing suites:

  • datadog-integrations, amazonlinux-2
  • datadog-integrations, centos-7
  • datadog-integrations, centos-8
  • datadog-integrations, debian-10
  • datadog-integrations, debian-9
  • datadog-integrations, ubuntu-1804
  • datadog-integrations, ubuntu-2004
  • dd-agent-iot, amazonlinux-2
  • dd-agent-iot, centos-7
  • dd-agent-iot, centos-8
  • dd-agent-iot, debian-10
  • dd-agent-iot, debian-9
  • dd-agent-iot, ubuntu-1804
  • dd-agent-iot, ubuntu-2004
  • dd-integration-resource, amazonlinux-2
  • dd-integration-resource, centos-7
  • dd-integration-resource, centos-8
  • dd-integration-resource, debian-10
  • dd-integration-resource, debian-9
  • dd-integration-resource, ubuntu-1804
  • dd-integration-resource, ubuntu-2004
  • system-probe, debian-10
  • system-probe, ubuntu-1804
  • system-probe, ubuntu-2004

You can see that datadog-integrations, dd-integration-resource, & dd-agent-iot break on every run, the error there is that the package is missing. The system-probe is more limited in terms of which platform + version combos it fails on.

jeffbyrnes avatar Sep 23 '21 13:09 jeffbyrnes

@albertvaka I’d forgotten that the main reason I went ahead & refactored the Kitchen config was that I was unable to run the tests as they are configured on the main branch, instead I get this error

♠ chef exec kitchen list
$$$$$$ Deprecated configuration detected:
require_chef_omnibus
Run 'kitchen doctor' for details.

/opt/chef-workstation/embedded/lib/ruby/gems/3.0.0/gems/chef-config-17.4.38/lib/chef-config/workstation_config_loader.rb:122:in `pwd': Too many open files - getcwd (Errno::EMFILE)
	from /opt/chef-workstation/embedded/lib/ruby/gems/3.0.0/gems/chef-config-17.4.38/lib/chef-config/workstation_config_loader.rb:122:in `locate_local_config'
	from /opt/chef-workstation/embedded/lib/ruby/gems/3.0.0/gems/chef-config-17.4.38/lib/chef-config/workstation_config_loader.rb:54:in `config_location'
	from /opt/chef-workstation/embedded/lib/ruby/gems/3.0.0/gems/chef-config-17.4.38/lib/chef-config/workstation_config_loader.rb:76:in `load'
	from /opt/chef-workstation/embedded/lib/ruby/gems/3.0.0/gems/test-kitchen-3.0.0/lib/kitchen/provisioner/chef_base.rb:239:in `initialize'
[snip]

If I reconfigure it so that setting the Chef version is injected via Env Var, like I did on this branch, then I’m able to run them. I’ll post the results once it finishes them all.

jeffbyrnes avatar Sep 23 '21 14:09 jeffbyrnes

@albertvaka finally came back to this, rebuilt the branch to drop any modifications towards Chef >= 17.

Instead, this now aims to clean up the unit tests, run the linter using cookstyle, and run a combo of test suites that vary based on:

  • Chef versions 12–17
  • Agent versions 5–7
  • Last 2 major OS versions (aka currently supported)
    • While it’d be great to cover further back, this is a concession to CircleCI only allowing so many elements in a test matrix.

This also means that each job is running every suite, to fit into CircleCI’s matrix limits, while still testing the other variations desired (Chef versions, agent versions). As I mentioned prior, that means the job names correspond to:

integration-${agent_major_version}-${chef_version}-${platform}-${platform_version}

jeffbyrnes avatar Oct 09 '21 16:10 jeffbyrnes

So, across all of the tests, the same two suites fail:

  • dd-agent-iot
  • datadog-integrations

On a few distros, the system-probe suite also fails.

Looking at the CircleCI tests that have been passing on the main branch lately, it turns out that these are the only test suites being run:

  • dd-agent-handler5-ubuntu-1404
  • dd-agent-handler5-centos-66
  • dd-agent-handler5-centos-77
  • dd-agent-handler5-debian-811
  • dd-agent-handler6-centos-66
  • dd-agent-handler6-centos-77
  • dd-agent-handler7-centos-66
  • dd-agent-handler7-centos-77

So it seems I’ve changed things to be a far more comprehensive set of test suites; perhaps that’s not useful?

jeffbyrnes avatar Oct 10 '21 16:10 jeffbyrnes

Looking at the most recent, as of this writing, CircleCI Kitchen tests, we can see that integration tests are done on a very limited set of combos of Chef versions, Datadog Agent versions, and OS distros/versions (extra line breaks added for readability):

♠ grep '\(Creating <\|Installing Chef\)' ~/Downloads/datadog-circleci.txt
-----> Creating <dd-agent-handler5-ubuntu-1404>...
-----> Installing Chef 14.12 package

-----> Creating <dd-agent-handler5-centos-66>...
-----> Installing Chef 14.12 package

-----> Creating <dd-agent-handler5-centos-77>...
-----> Installing Chef 14.12 package

-----> Creating <dd-agent-handler5-debian-811>...
-----> Installing Chef 14.12 package

-----> Creating <dd-agent-handler5-rocky-8>...
-----> Installing Chef 16.17.4 package

-----> Creating <dd-agent-handler6-centos-66>...
-----> Installing Chef 14.12 package

-----> Creating <dd-agent-handler6-centos-77>...
-----> Installing Chef 14.12 package

-----> Creating <dd-agent-handler6-rocky-8>...
-----> Installing Chef 16.17.4 package

-----> Creating <dd-agent-handler7-centos-66>...
-----> Installing Chef 14.12 package

-----> Creating <dd-agent-handler7-centos-77>...
-----> Installing Chef 14.12 package

-----> Creating <dd-agent-handler7-rocky-8>...
-----> Installing Chef 16.17.4 package

I’ll narrow this down to only test those combinations, to create parity with what’s tested currently, rather than attempt to expand testing to additional combinations, as this PR currently does.

jeffbyrnes avatar Mar 06 '22 20:03 jeffbyrnes

And finally, all passing, with the exact same set of Kitchen suites that’s used on main right now.

jeffbyrnes avatar Mar 07 '22 14:03 jeffbyrnes

@albertvaka @jtappa refactored this significantly, updated the description to be relevant again.

This also provides fixes/changes that #838 #724 provide (albeit slightly differently).

jeffbyrnes avatar Mar 07 '22 14:03 jeffbyrnes

Hi @jeffbyrnes, thanks for the PR! The huge amount of changes (132 files!) makes it a bit hard to review, though, specially since most of them are only style/linting changes and don't change the behavior, but some other changes do. Do you think it would be possible to split this in smaller PRs? Eg: one for the CI changes, another one for tests, for the docs... and once all the behavior changes are merged we can do a final one just with the style changes which should be super fast to review on its own.

albertvaka avatar Mar 28 '22 14:03 albertvaka

@albertvaka absolutely! And yes, the style changes should be separate, to ensure any potential changes or regressions are clearly seen as caused by those vs. other adjustments.

I’ll try & spend some more time this week.

jeffbyrnes avatar Mar 28 '22 14:03 jeffbyrnes

@albertvaka ok, trimmed down this initial PR to a much smaller set of changes, focused on some reorg & cleanup, esp. of the unit tests.

jeffbyrnes avatar Mar 28 '22 20:03 jeffbyrnes

@albertvaka might be better to get #841 sorted first, since all the tests here aren’t going to work anymore due to the removal of things like Gemfile.

jeffbyrnes avatar Mar 28 '22 22:03 jeffbyrnes

Why is the Gemfile file being removed? I think we should keep it.

albertvaka avatar Mar 29 '22 16:03 albertvaka

@albertvaka using a Gemfile hasn’t been considered a good practice for Chef cookbooks for some time. ChefCo pivoted to Chef Workstation (née ChefDK) in 2014 as the supported & recommended way to develop cookbooks & library code.

I mentioned getting #841 approved & merged first b/c it takes care of updating the actual test elements to work without a Gemfile. It was a little tricky to break things up as much as I did, and it may make sense to collapse #841 back into this PR if you’d like to see green builds.

jeffbyrnes avatar Mar 29 '22 18:03 jeffbyrnes

Expanding on “why is a Gemfile not a thing Chef recommends anymore” it’s mostly to do w/ ensuring the version of Ruby, and the gems installed, will play nicely with Chef in a firmly dependable way. They vendor a copy of Ruby & the various gems that Chef needs as part of Chef Workstation, and this is as similar as possible to a production Chef Client install on a Chef Node.

Doing it with a Gemfile leaves one open to a lot more variation in your development environment vs. the production one (in terms of Ruby version, gems installed & gem versions installed, etc).

jeffbyrnes avatar Mar 29 '22 18:03 jeffbyrnes

So yes, testing on older versions of Chef is absolutely still possible. Like I’ve mentioned, we may want to #841 merged in prior to this one, since it'll become more clear how this can continue to be tested against various versions of Chef.

jeffbyrnes avatar Mar 31 '22 14:03 jeffbyrnes

Decided to fold #841 into this as b243d8b, to try & get a better test outcome.

jeffbyrnes avatar Mar 31 '22 16:03 jeffbyrnes

👍 Makes sense. I'm all for modernizing this setup (so thank you for that!) but I'm worried about losing coverage for older chefs that many people use...

albertvaka avatar Mar 31 '22 16:03 albertvaka

@albertvaka oh for sure! There’s no loss in coverage though, this is testing all the same older versions of Chef. It also allows for the set of tests run to be expanded further in a way that’s more consistent & simpler than the previous implementation.

jeffbyrnes avatar Mar 31 '22 18:03 jeffbyrnes

FYI the unit tests are failing b/c they're outdated:

Fauxhai::Exception::InvalidPlatform:
  Could not find platform 'centos/6.9' on the local disk and an HTTP error was encountered when fetching from Github. A list of available platforms is available at https://github.com/chefspec/fauxhai/blob/master/PLATFORMS.md

I’ll tweak that so they work properly again.

jeffbyrnes avatar Mar 31 '22 18:03 jeffbyrnes

@albertvaka the linter won’t pass until we do the other style changes that you asked be taken out of this PR, so if you’re all good here, it’d be great to merge this in & move on to those changes, and perhaps also expand what the integration tests do to cover more combinations of Chef version + Linux distro.

jeffbyrnes avatar Apr 01 '22 16:04 jeffbyrnes

@axl89 rebased this, resolved my conflicts, and went ahead & merged in your PR and set the merge commit to close #854. Let’s see if it passes tests!

jeffbyrnes avatar Jul 14 '22 19:07 jeffbyrnes

As a reminder, the linting is not expected to pass, as this PR purposefully leaves all of the ChefSpec corrections out to avoid conflating multiple kinds of far-reaching changes.

jeffbyrnes avatar Jul 14 '22 19:07 jeffbyrnes