cloud-init icon indicating copy to clipboard operation
cloud-init copied to clipboard

Feature: Kernel Modules

Open chifac08 opened this issue 2 years ago • 24 comments

Proposed Commit Message

cc_kernel_modules: Module to manage kernel modules.

This module is capable of loading a kernel module `/etc/modules-load.d` at boot and enhance 
a kernel module with modprobe parameters (https://manpages.ubuntu.com/manpages/trusty/man5/modprobe.d.5.html). 
Beside applying settings during runtime it will also persist all settings in `/etc/modules-load.d` 
as well as `/etc/modprobe.d`.

Additional Context

For examples please have look at MetaSchema.

Test Steps

Checklist:

  • [x] My code follows the process laid out in the documentation
  • [x] I have updated or added any unit tests accordingly
  • [x] I have updated or added any documentation accordingly

chifac08 avatar Sep 27 '22 20:09 chifac08

Hi cloud-init community and maintainers,

What do you think about such a feature? Any input is much appreciated!

thx

chifac08 avatar Sep 27 '22 20:09 chifac08

@chifac08 I'm assuming we'd add/remove any of the modules from /etc/modules-load.d depending on the load key, and that anything under the modprobe key would go into /etc/modprobe.d?

The idea sounds good to me!

TheRealFalcon avatar Sep 29 '22 20:09 TheRealFalcon

Hi @chifac08,

Thanks for the proposal.

A couple thoughts:

I expect the common cases for early boot to be handled during early boot by initrd/initramfs, with udev handling other common drivers as well.

I could see this being helpful on a scientific computing cloud perhaps, where you have some special hardware (gpu/accelerator/rdma nic/etc).

In the case where your hardware driver isn't upstream or in your image already, manually loading modules downloaded from a repo is arguably less complicated than writing udev rules[1] and more elegant than runcmd.

No objections from me for the idea, +1

holmanb avatar Sep 30 '22 00:09 holmanb

@chifac08 I'm assuming we'd add/remove any of the modules from /etc/modules-load.d depending on the load key, and that anything under the modprobe key would go into /etc/modprobe.d?

The idea sounds good to me!

@TheRealFalcon @holmanb Thanks for your input. Yes, your assumption is correct. Maybe i should explain my idea a bit more detailed. Let's assume we have a config like this:

kernel_modules: 
  - name: v4l2loopback
    load: true
    modprobe:
      options: "devices=1 video_nr=20 card_label=fakecam exclusive_caps=1"

The module will create following files...

  • /etc/modules-load.d/cloud-init.conf: v4l2loopback
  • /etc/modprobe.d/cloud-init.conf: options v4l2loopback devices=1 video_nr=20 card_label=fakecam exclusive_caps=1

and reload kernel modules by restarting systemd-modules-load.service and update initramfs by issuing update-initramfs -u -k all. Module will run PER INSTANCE and always overwrites config files with given elements (if there are no elements, both config files will be deleted) to ensure idempotence.

chifac08 avatar Sep 30 '22 18:09 chifac08

If we can come up with a better name for the modprobe key, we can make it platform independent from the get-go.

How about persist ?

igalic avatar Sep 30 '22 21:09 igalic

@chifac08 , I noticed you added a lot of functionality here. Is this still intended to be in draft state? Are you looking for more review before proceeding?

TheRealFalcon avatar Oct 14 '22 15:10 TheRealFalcon

@TheRealFalcon I was very busy the last weeks and wanted to make sure to have a stable version of my PR including unit and integration tests. Finally, I found time and here it is ready to review :)

Be aware, I am still working on documentation.

thx

chifac08 avatar Oct 16 '22 19:10 chifac08

Thanks, @chifac08, for improving cloud-init.

The approach looks good overall. I left some inline comments, and I have a couple of generic ones:

  1. At the end of this module, after the desired config is set we restart systemd-modules-load to let it modify the kernel modules. We verify that the exit code of systemd-modules-load is 0 and continue, so if some error happens during systemd-modules-load execution, cloud-init would not see it and report a correct execution. I think cloud-init should report back to the user if the system is misconfigured, but at the same time I do not see a way to do so without blocking cc_kernel_modules until systemd-modules-load finishes. Thoughts?
  2. This module can load/unload/tweak kernel modules that are already present on the system. I think a common question from consumers' perspective is going to be: how do I work with a module that is not present on the image? Maybe we could underline this in the documentation and hint users that they could inject new kernel modules by using bootcmd or other means.

Thank you for reviewing. I implemented all suggestions you addressed. Concerning your points:

  1. I deeply looked into documentation for systemd-modules-load.service again and made some tests. Unfortunately, /lib/systemd/systemd-modules-load executable does not return an error code unequal to zero when a module can't be loaded due to an error (only exits unequal zero if a path does not exist - Link). Furthermore, calling the unit file will always reload all config files which should not be the way to go as we only wan't to ensure that /etc/modules-load.d/cloud-init.conf gets loaded. Finally, I decided to use /lib/systemd/systemd-modules-load executable directly to explicitly load kernel modules from /etc/modules-load.d/cloud-init.conf and parse output to handle possible errors (e.g. when a module does not exist). Let me know what you think
  2. I am still writing on documentation but you are totally right. I intentionally placed the module in config_final_modules section, directly after package-update-upgrade-install, to give the user the possibility to install an kernel module an load it afterwards. For example:
#cloud-config
packages:
  - wireguard
kernel_modules:
  - name: wireguard
    load: true

Furthermore, I want to discuss the usage of TypedDict in cloud-init. TypedDict simplifies reading DEFAULT_CONFIG by defining a custom definition of an dictionary and stops mypy from complaining about collections. Unfortunately, TypedDict was moved from typing-extensions to typing lib in Python 3.8. This means, we have to check what python version is installed and import it from different libs:

if sys.version_info >= (3, 8):
    from typing import TypedDict
else:
    from typing_extensions import TypedDict

typing-extensions lib is not defined as an requirement for cloud-init, so this module will currently fail when using a Python version lower 3.8.

My questions..

  • Is the usage of TypedDict prohibited? Does someone see a better approach?
  • If this approach is ok, should i add typing-extensions to requirements.txt?

Last but not least, i want to talk about integration tests. I do not see a way to implement meaningful integration tests for LXD containers as loading kernel modules won't work in a container. Thoughts?

Thanks

chifac08 avatar Oct 30 '22 16:10 chifac08

Thanks, @chifac08, for the work here.

Having a look into the failing integration tests, I have noticed some schema validation errors.

tldr:

diff --git a/cloudinit/config/schemas/schema-cloud-config-v1.json b/cloudinit/config/schemas/schema-cloud-config-v1.json
index 378f0f941..42e3482fe 100644
--- a/cloudinit/config/schemas/schema-cloud-config-v1.json
+++ b/cloudinit/config/schemas/schema-cloud-config-v1.json
@@ -27,6 +27,7 @@
         "grub_dpkg",
         "install-hotplug",
         "install_hotplug",
+        "kernel_modules",
         "keyboard",
         "keys-to-console",
         "keys_to_console",

In #1770, we included validation of the base config modules lists that can be overridden with user data. After that change, we have to declare the new module under all_modules.

This should be documented in module_creation, but it is not. (We will fix that).

@aciba90 Implemented changes you requested. Furthermore, I added an short module description. Let me know what you think. Thanks

chifac08 avatar Nov 17 '22 21:11 chifac08

As this PR includes the documentation of a new module, I added @s-makin as a reviewer.

aciba90 avatar Nov 18 '22 11:11 aciba90

It looks great, @chifac08. A couple of minor things:

  1. To render the new module doc, we need to include it in doc/rtd/topics/modules.rst. This is not documented in https://cloudinit.readthedocs.io/en/latest/topics/module_creation.html , we will fix it.
  2. About the integration tests, the two classes only differ on the required LXD setup for containers. I would suggest to abstract that difference using a base class.

The following patch addresses both things and enables the tests to be executed on clouds. Feel free to modify it.

diff --git a/doc/rtd/topics/modules.rst b/doc/rtd/topics/modules.rst
index b0ad83e49..58bed3f02 100644
--- a/doc/rtd/topics/modules.rst
+++ b/doc/rtd/topics/modules.rst
@@ -19,6 +19,7 @@ Module Reference
 .. automodule:: cloudinit.config.cc_growpart
 .. automodule:: cloudinit.config.cc_grub_dpkg
 .. automodule:: cloudinit.config.cc_install_hotplug
+.. automodule:: cloudinit.config.cc_kernel_modules
 .. automodule:: cloudinit.config.cc_keyboard
 .. automodule:: cloudinit.config.cc_keys_to_console
 .. automodule:: cloudinit.config.cc_landscape
diff --git a/tests/integration_tests/modules/test_kernel_modules.py b/tests/integration_tests/modules/test_kernel_modules.py
index efeae8c0c..fe02bdbb7 100644
--- a/tests/integration_tests/modules/test_kernel_modules.py
+++ b/tests/integration_tests/modules/test_kernel_modules.py
@@ -37,42 +37,49 @@ def load_kernel_modules_lxd(instance: LXDInstance):
     )


[email protected]_data(USER_DATA)
 @pytest.mark.ci
[email protected]_vm
[email protected]_data(USER_DATA)
 @pytest.mark.ubuntu
-class TestKernelModules:
+class BaseTest:
     @pytest.mark.parametrize(
         "cmd,expected_out",
         (
-            # check permissions
-            (
+            pytest.param(
                 "stat -c '%U %a' /etc/modules-load.d/50-cloud-init.conf",
                 r"root 600",
+                id="check-permissions",
             ),
-            (
+            pytest.param(
                 "stat -c '%U %a' /etc/modprobe.d/50-cloud-init.conf",
                 r"root 600",
+                id="check-permissions-2",
+            ),
+            pytest.param(
+                "file /etc/modules-load.d/50-cloud-init.conf",
+                ASCII_TEXT,
+                id="ASCII-check",
             ),
-            # ASCII check
-            ("file /etc/modules-load.d/50-cloud-init.conf", ASCII_TEXT),
-            ("file /etc/modprobe.d/50-cloud-init.conf", ASCII_TEXT),
-            # check loaded modules
-            (
+            pytest.param(
+                "file /etc/modprobe.d/50-cloud-init.conf",
+                ASCII_TEXT,
+                id="ASCII-check-2",
+            ),
+            pytest.param(
                 "lsmod | grep -e '^lockd\\|^ip_tables\\|^wireguard' | wc -l",
                 "3",
+                id="check-loaded-modules",
             ),
-            # sha256sum check modul
-            (
+            pytest.param(
                 "sha256sum </etc/modules-load.d/50-cloud-init.conf",
                 "9d14d5d585dd3e5e9a3c414b5b7af7ed"
                 "9d44e7ee3584652fbf388cad455b5053",
+                id="sha256sum-check-modules",
             ),
-            # sha256sum check modprobe
-            (
+            pytest.param(
                 "sha256sum   </etc/modprobe.d/50-cloud-init.conf",
                 "229ccc941ec34fc8c49bf14285ffeb65"
                 "ea2796c4840f9377d6df76bda42c878e",
+                id="sha256sum-check-modprobe",
             ),
         ),
     )
@@ -88,53 +95,12 @@ class TestKernelModules:
         verify_clean_log(log)


[email protected]
[email protected]_container
+class TestKernelModules(BaseTest):
+    pass
+
+
 @pytest.mark.lxd_container
[email protected]_data(USER_DATA)
 @pytest.mark.lxd_setup.with_args(load_kernel_modules_lxd)
[email protected]
-class TestKernelModulesWithoutKmod:
-    @pytest.mark.parametrize(
-        "cmd,expected_out",
-        (
-            # check permissions
-            (
-                "stat -c '%U %a' /etc/modules-load.d/50-cloud-init.conf",
-                r"root 600",
-            ),
-            (
-                "stat -c '%U %a' /etc/modprobe.d/50-cloud-init.conf",
-                r"root 600",
-            ),
-            # ASCII check
-            ("file /etc/modules-load.d/50-cloud-init.conf", ASCII_TEXT),
-            ("file /etc/modprobe.d/50-cloud-init.conf", ASCII_TEXT),
-            # check loaded modules
-            (
-                "lsmod | grep -e '^lockd\\|^ip_tables\\|^wireguard' | wc -l",
-                "3",
-            ),
-            # sha256sum check modul
-            (
-                "sha256sum </etc/modules-load.d/50-cloud-init.conf",
-                "9d14d5d585dd3e5e9a3c414b5b7af7ed"
-                "9d44e7ee3584652fbf388cad455b5053",
-            ),
-            # sha256sum check modprobe
-            (
-                "sha256sum   </etc/modprobe.d/50-cloud-init.conf",
-                "229ccc941ec34fc8c49bf14285ffeb65"
-                "ea2796c4840f9377d6df76bda42c878e",
-            ),
-        ),
-    )
-    def test_kernel_modules(
-        self, cmd, expected_out, class_client: IntegrationInstance
-    ):
-        result = class_client.execute(cmd)
-        assert result.ok
-        assert expected_out in result.stdout
-
-    def test_clean_log(self, class_client: IntegrationInstance):
-        log = class_client.read_from_file("/var/log/cloud-init.log")
-        verify_clean_log(log)
+class TestKernelModulesWithoutKmod(BaseTest):
+    pass

@aciba90 @s-makin Thank you for reviewing. Implemented the changes you requested, please review again.

chifac08 avatar Nov 19 '22 13:11 chifac08

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

github-actions[bot] avatar Dec 06 '22 00:12 github-actions[bot]

@TheRealFalcon I would be delighted if you could review my PR. Thank you!

chifac08 avatar Dec 06 '22 08:12 chifac08

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

github-actions[bot] avatar Dec 28 '22 00:12 github-actions[bot]

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

github-actions[bot] avatar Jan 22 '23 00:01 github-actions[bot]

@blackboxsw and @chifac08 , I have a feeling you're both waiting for the other. @chifac08 , since your last push, is this ready for re-review?

TheRealFalcon avatar Jan 23 '23 21:01 TheRealFalcon

@TheRealFalcon @blackboxsw Current implementation is ready to re-review as i only implemented addressed changes. As already mentioned above, i would need some help if we want to make this module usable for other distros as i have no idea how to implement it.

thank you for your help

chifac08 avatar Jan 25 '23 20:01 chifac08

BSDs don't have a concept of an initramfs, except as a special kernel that runs before the real system to do tasks like growfs on OpenBSD and NetBSD. The only thing that comes close is regenerating the hints files on FreeBSD using kldxref(8) (or running the service) after installing new kernels (not really a thing, unless you're using PkgBase, or just rebuilt the system, or are sharing a build between machine), or new kernel modules (mostly useful for desktop systems)

OpenBSD got rid of loadable kernel modules about ten years ago

igalic avatar Jan 28 '23 14:01 igalic

@blackboxsw thank you for your help and detailed explanation. I got an good overview about Distro classes and how to implement such a feature next time :). Would you mind applying your diff? Otherwise I could do it. Pls let me know.

chifac08 avatar Feb 05 '23 20:02 chifac08

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

github-actions[bot] avatar Feb 26 '23 00:02 github-actions[bot]

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

github-actions[bot] avatar Mar 22 '23 00:03 github-actions[bot]

Hi maintainers,

Would you mind re-opening this PR? I really want to finish my work here and make sure it make it's way in the upstream.

thank your for time!

chifac08 avatar Jan 29 '24 19:01 chifac08

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

github-actions[bot] avatar Feb 26 '24 00:02 github-actions[bot]

Sorry for delay here, while to updates look reasonable. I don't want to land this in tip of main until after our 24.1 release is cut to avoid risk on this upcoming release. Expectation is we should be cutting 24.1 this week. We can finish this review and merge after that occurs.

blackboxsw avatar Feb 28 '24 04:02 blackboxsw

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

github-actions[bot] avatar Mar 14 '24 00:03 github-actions[bot]

@blackboxsw or @aciba90 , are we waiting on anything else here?

TheRealFalcon avatar Mar 14 '24 12:03 TheRealFalcon

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

github-actions[bot] avatar Mar 29 '24 00:03 github-actions[bot]