kubevirt icon indicating copy to clipboard operation
kubevirt copied to clipboard

add device plugin interface

Open aerosouund opened this issue 2 years ago • 95 comments

What this PR does

Before this PR:

All methods on plugins are implemented without a common interface to unify testing

After this PR: Define the DevicePlugin interface to group methods

Fixes: https://github.com/kubevirt/kubevirt/issues/11293#event-11910506102

Why we need it and why it was done in this way

The following tradeoffs were made:

Other common shared methods are found in device plugins such as cleanup, register.. etc. However I propose the interface to be as minimal as can be as to not make it difficult to implement on adding a new plugin. extra guidance on if any of these methods are needed in the interface may be required from the project maintainers .

Also there exists an interface called Device that implements the same thing in kubevirt/pkg/virt-handler/device-manager/generic_device.go. However its unclear whether its required to extend this interface as a part of this PR or create a new one. as this interface is already used in the code

Release note

Refactor device plugins to use a base plugin and define a common interface

aerosouund avatar Feb 24 '24 13:02 aerosouund

Hi @aerosouund. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubevirt-bot avatar Feb 24 '24 13:02 kubevirt-bot

Hi @aerosouund, thanks for start working on this. I take this is still a WIP, I suggest to move the PR to draft to avoid triggering the CI.

The CI 'dco' failure is due to lack of Signed-off-by, you can have that by git commit with -s option.

However its unclear whether its required to extend this interface as a part of this PR or create a new one. as this interface is already used in the code

The request in #11293 should be clear, that is:

The goal of this refactor is finding the shared functionality, proposing and creating a standard framework and interfaces for KubeVirt device plugins are the objectives of the project. This would simplify and eliminate redundant code, enhance testing, and make it easier to add new device plugins in the future.

So, if you look at a few of the device-plugins, you'll find some duplicated code being implemented in each device plugin. Reduce the duplicate code the best way you see fit.

victortoso avatar Feb 24 '24 20:02 victortoso

@victortoso I suggest we create a base struct and embed it in the specific device plugin structs. and implement the generic methods on this inner struct. If a specific device plugin needs to use its own version of this method it can just implement it itself. I have created an example if this on the MediatedDevicePlugin, if it looks good enough i can implement this for the other plugins as well. also it's unclear which methods should be on all of them. according to what i found i see this to be the methods in this interface

type DevicePlugin interface {
	Start(chan<- struct{}) error
	ListAndWatch(*pluginapi.Empty, pluginapi.DevicePlugin_ListAndWatchServer) error
	PreStartContainer(context.Context, *pluginapi.PreStartContainerRequest) (*pluginapi.PreStartContainerResponse, error)
	Allocate(context.Context, *pluginapi.AllocateRequest) (*pluginapi.AllocateResponse, error)
	HealthCheck() error
	StopDevicePlugin() error
	GetDeviceName() string
	GetInitialized() bool
	SetInitialized() bool
}

Please take a look if any ones extra should be added

aerosouund avatar Feb 25 '24 18:02 aerosouund

also it's unclear which methods should be on all of them

I haven't looked at all device-plugins but some parts should be very clear, like creating the unix socket and doing the registration of device-plugin with kubelet's device-manager.

If you are unsure, perhaps you can start small, only with the methods that are clear to have a lot of duplicated code. If you start with all possible methods, some device-plugins might not match with the proposed interface/struct and you would have to rethink it.

Please take a look if any ones extra should be added

I did a quick look and it looks promising :) I'll take a better look at it Tomorrow.

victortoso avatar Feb 25 '24 19:02 victortoso

hello @victortoso

I have reached on what i believe is a good enough version of this refactor on all device plugins. First of all i have made the interface to include only the exported methods by all plugins and made the base plugin follow the same convention on which methods to export. to avoid having the base plugin calling its own methods instead of the ones by the plugins themselves if they are using fields specific to that plugin.

Moreover i extended the refactor to the other plugins, please review and let me know if we need further changes on anything

aerosouund avatar Feb 27 '24 09:02 aerosouund

I think you need not to start by creating a new interface/structure. If we do a step back and look how the device-plugins are implemented, they are all part of the same device_manager package. Taking the USB device-plugin as example, who initializes it (e.g: by calling NewUSBDevicePlugin()) is the device-controller. There, the device plugin is managed as a Device interface.

I think the refactor could be made my moving code and common responsibilities out of each device-plugin implementation and into this Device interface. What do you think?

victortoso avatar Feb 27 '24 14:02 victortoso

Hey victor, thanks alot for showing me how this is tested. my only guiding metric was the compiler not complaining about missing method implementations on interfaces or invalid fields. but it appears further testing is required. i'm currently trying to set up the testing as you have specified

aerosouund avatar Mar 01 '24 18:03 aerosouund

@aerosouund thanks for working on this :). I'll try to review your work later this week. However, a couple of suggestions about the commits. Try to have clean commit history, for example, the commit with the revert isn't necessary. You can rebase and add the change to the right commit. Additionally, try to shortly describe what the commit does and adds. In order to check if your changes broke something try to run the unit tests. Here, my suggestion is to only run a subset of the test and you can achieve it by using:

$ WHAT="//pkg/virt-handler/device-manager//..." make test

Another useful trick, if you want to avoid additional noise is to add F (for focus) on a Context or It (Please, don't forget to remove it before committing and pushing). I hope this can help you

alicefr avatar Mar 05 '24 08:03 alicefr

@alicefr @victortoso

this has been fixed now, the newly defined file just wasn't included in bazel, check the changes on BUILD.bazel to see what i did. what else is needed to get this PR in an accepted state ? I see since you are already depending on the Device interface its not needed to define another DevicePlugin interface. we should just include the methods that are contained in the second to the first

aerosouund avatar Mar 07 '24 18:03 aerosouund

This by definition requires extension of the FakePlugin used in testing, let me know what you guys think. i edited the code to use the interface you already have

aerosouund avatar Mar 07 '24 18:03 aerosouund

what else is needed to get this PR in an accepted state ?

As Alice asked, a clean commit history.

This by definition requires extension of the FakePlugin used in testing, let me know what you guys think. i edited the code to use the interface you already have

Should be fine. We need unit testing. I think the nomenclature used is Mock Plugin instead.

victortoso avatar Mar 07 '24 19:03 victortoso

@aerosouund please, squash some commits together like explained in this thread: https://stackoverflow.com/questions/5189560/how-do-i-squash-my-last-n-commits-together

alicefr avatar Mar 08 '24 07:03 alicefr

Hello @alicefr Rewrote the history as you see, please let me know if this fits a linear narrative or you have any further suggestions

aerosouund avatar Mar 08 '24 08:03 aerosouund

@aerosouund please fix the commit body message in 4eb92005c6aefc7c412409da0a71d99b84eaa5cc and please squash 2a4dad1ec0e428b0fd49071d4ce556142b406ff7 into the first commit

alicefr avatar Mar 08 '24 12:03 alicefr

/test

alicefr avatar Mar 08 '24 12:03 alicefr

@alicefr: The /test command needs one or more targets. The following commands are available to trigger required jobs:

  • /test pull-kubevirt-apidocs
  • /test pull-kubevirt-build
  • /test pull-kubevirt-build-arm64
  • /test pull-kubevirt-check-unassigned-tests
  • /test pull-kubevirt-client-python
  • /test pull-kubevirt-code-lint
  • /test pull-kubevirt-e2e-k8s-1.27-sig-compute
  • /test pull-kubevirt-e2e-k8s-1.27-sig-network
  • /test pull-kubevirt-e2e-k8s-1.27-sig-operator
  • /test pull-kubevirt-e2e-k8s-1.27-sig-performance
  • /test pull-kubevirt-e2e-k8s-1.27-sig-storage
  • /test pull-kubevirt-e2e-k8s-1.28-sig-compute
  • /test pull-kubevirt-e2e-k8s-1.28-sig-network
  • /test pull-kubevirt-e2e-k8s-1.28-sig-operator
  • /test pull-kubevirt-e2e-k8s-1.28-sig-storage
  • /test pull-kubevirt-e2e-k8s-1.29-ipv6-sig-network
  • /test pull-kubevirt-e2e-k8s-1.29-sig-compute
  • /test pull-kubevirt-e2e-k8s-1.29-sig-compute-migrations
  • /test pull-kubevirt-e2e-k8s-1.29-sig-monitoring
  • /test pull-kubevirt-e2e-k8s-1.29-sig-network
  • /test pull-kubevirt-e2e-k8s-1.29-sig-operator
  • /test pull-kubevirt-e2e-k8s-1.29-sig-storage
  • /test pull-kubevirt-e2e-kind-1.27-vgpu
  • /test pull-kubevirt-e2e-kind-sriov
  • /test pull-kubevirt-e2e-windows2016
  • /test pull-kubevirt-fossa
  • /test pull-kubevirt-generate
  • /test pull-kubevirt-manifests
  • /test pull-kubevirt-prom-rules-verify
  • /test pull-kubevirt-unit-test
  • /test pull-kubevirt-verify-go-mod

The following commands are available to trigger optional jobs:

  • /test build-kubevirt-builder
  • /test pull-kubevirt-build-s390x
  • /test pull-kubevirt-check-tests-for-flakes
  • /test pull-kubevirt-conformance-arm64
  • /test pull-kubevirt-e2e-arm64
  • /test pull-kubevirt-e2e-k8s-1.29-sig-compute-realtime
  • /test pull-kubevirt-e2e-k8s-1.29-sig-compute-root
  • /test pull-kubevirt-e2e-k8s-1.29-sig-network-multus-v4
  • /test pull-kubevirt-e2e-k8s-1.29-sig-storage-root
  • /test pull-kubevirt-e2e-k8s-1.29-single-node
  • /test pull-kubevirt-e2e-k8s-1.29-swap-enabled
  • /test pull-kubevirt-e2e-k8s-1.30-sig-compute
  • /test pull-kubevirt-e2e-k8s-1.30-sig-network
  • /test pull-kubevirt-e2e-k8s-1.30-sig-operator
  • /test pull-kubevirt-e2e-k8s-1.30-sig-storage
  • /test pull-kubevirt-gosec
  • /test pull-kubevirt-goveralls
  • /test pull-kubevirt-metrics-lint
  • /test pull-kubevirt-unit-test-arm64
  • /test pull-kubevirt-verify-rpms

Use /test all to run the following jobs that were automatically triggered:

  • pull-kubevirt-apidocs
  • pull-kubevirt-build
  • pull-kubevirt-build-arm64
  • pull-kubevirt-check-tests-for-flakes
  • pull-kubevirt-check-unassigned-tests
  • pull-kubevirt-client-python
  • pull-kubevirt-code-lint
  • pull-kubevirt-conformance-arm64
  • pull-kubevirt-e2e-arm64
  • pull-kubevirt-e2e-k8s-1.27-sig-performance
  • pull-kubevirt-e2e-k8s-1.29-sig-compute
  • pull-kubevirt-e2e-k8s-1.29-sig-compute-migrations
  • pull-kubevirt-e2e-k8s-1.29-sig-network
  • pull-kubevirt-e2e-k8s-1.29-sig-operator
  • pull-kubevirt-e2e-k8s-1.29-sig-storage
  • pull-kubevirt-fossa
  • pull-kubevirt-generate
  • pull-kubevirt-goveralls
  • pull-kubevirt-manifests
  • pull-kubevirt-prom-rules-verify
  • pull-kubevirt-unit-test
  • pull-kubevirt-unit-test-arm64
  • pull-kubevirt-verify-go-mod

In response to this:

/test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubevirt-bot avatar Mar 08 '24 12:03 kubevirt-bot

/test all

alicefr avatar Mar 08 '24 12:03 alicefr

@aerosouund is this change needed 6485743476a81a3fae26d4d5b40993bebd9b2443 ? I don't see the new functions called in the commit.

alicefr avatar Mar 08 '24 12:03 alicefr

@aerosouund the git tree isn't complete clean, probably you forgot to commit all the file, that's why the CI is failing (see the pull-kubevirt-build output in Details). Please, fix also this otherwise we cannot verify if your changes are sane

alicefr avatar Mar 08 '24 12:03 alicefr

@aerosouund the code at the first glance seems fine to me, but I'd like to check if the tests are passing. The part that you could fix is the commits. This could be rearranged and clean a little bit as I already mentioned to you in the previous comments

alicefr avatar Mar 08 '24 13:03 alicefr

/retest

aerosouund avatar Mar 08 '24 20:03 aerosouund

@aerosouund: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubevirt-bot avatar Mar 08 '24 20:03 kubevirt-bot

@aerosouund is this change needed 6485743 ? I don't see the new functions called in the commit.

they are not, but this is because the FakePlugin must implement the Device interface and the goal of this refactor is to enhance this interface. Whats your take on this ? @alicefr

aerosouund avatar Mar 08 '24 22:03 aerosouund

/ok-to-test

alicefr avatar Mar 12 '24 14:03 alicefr

/test all

alicefr avatar Mar 12 '24 14:03 alicefr

@aerosouund please add the release note in the PR description

alicefr avatar Mar 12 '24 14:03 alicefr

@victortoso when you have a bit of time, can you also have a look. The code looks already pretty good

alicefr avatar Mar 12 '24 14:03 alicefr

/test pull-kubevirt-build

alicefr avatar Mar 12 '24 15:03 alicefr

The problem should be make generate

./hack/verify-generate.sh
ERROR: git tree state is not clean!
You probably need to run 'make generate' or 'make' and commit the changes 

indeed, running it leaves:

diff --git a/pkg/virt-handler/device-manager/BUILD.bazel b/pkg/virt-handler/device-manager/BUILD.bazel
index 0086930cf..322ce4ece 100644
--- a/pkg/virt-handler/device-manager/BUILD.bazel
+++ b/pkg/virt-handler/device-manager/BUILD.bazel
@@ -7,10 +7,10 @@ go_library(
     srcs = [
         "common.go",
         "device_controller.go",
+        "device_plugin_base.go",
         "generated_mock_common.go",
         "generic_device.go",
         "mediated_device.go",
-        "device_plugin_base.go",
         "mediated_devices_types.go",
         "pci_device.go",
         "socket_device.go",
diff --git a/pkg/virt-handler/device-manager/device_plugin_base.go b/pkg/virt-handler/device-manager/device_plugin_base.go
index b5f0d6c17..adfa76837 100644
--- a/pkg/virt-handler/device-manager/device_plugin_base.go
+++ b/pkg/virt-handler/device-manager/device_plugin_base.go
@@ -14,6 +14,7 @@ import (
        "github.com/fsnotify/fsnotify"
        "google.golang.org/grpc"
        "kubevirt.io/client-go/log"
+
        pluginapi "kubevirt.io/kubevirt/pkg/virt-handler/device-manager/deviceplugin/v1beta1"
 )

victortoso avatar Mar 12 '24 18:03 victortoso

 pkg/virt-handler/device-manager/mediated_device.go | 198 +++++++++++++-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 pkg/virt-handler/device-manager/pci_device.go      | 197 +++++++++++++----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 pkg/virt-handler/device-manager/socket_device.go   |  92 +++++++-------------------------------------------------------------------------------------
 pkg/virt-handler/device-manager/usb_device.go      |  96 ++++++++++++------------------------------------------------------------------------------------

ah.... so good. With CI green, which also validates other device plugins, we can take this as is.

Another possible improvement could be with DevicePluginBase initialization. There are several fields there and not all are used by all device plugins so we could use Functional Options pattern for its initialization.

I'll give you quick suggestions for the commit messages too:

  • prefix the shortlog (first line) with device-manager:
  • add some information/context to the commit log, this helps someone that ends up in your commit to know what it is about without reading the code or finding this issue.

In the last commit, add an explanation why it is necessary: would the test fail/break?

victortoso avatar Mar 12 '24 18:03 victortoso