intel-device-plugins-for-kubernetes icon indicating copy to clipboard operation
intel-device-plugins-for-kubernetes copied to clipboard

FPGA: support CDI

Open bart0sh opened this issue 1 year ago • 10 comments

This PR uses CDI support for Device Plugins to implement FPGA programming hooks.

CRI-O hooks are no longer needed and removed.

As CDI is currently supported by CRI-O and Containerd, this PR enables FPGA orchestration programmed operation mode for both most used CRI runtimes: CRI-O and Containerd. Previously it was only supported by CRI-O.

Ref: https://github.com/intel/intel-device-plugins-for-kubernetes/issues/1457

bart0sh avatar May 22 '24 09:05 bart0sh

@bart0sh thanks! quick question: I believe the word prestart is deprecated in the OCI runtime spec. would it make sense to get the new functionality to follow the official name createRuntime?

mythi avatar May 22 '24 14:05 mythi

@bart0sh thanks! quick question: I believe the word prestart is deprecated in the OCI runtime spec. would it make sense to get the new functionality to follow the official name createRuntime?

Thanks for pointing out! I didn't know that. Will try to replace.

bart0sh avatar May 22 '24 15:05 bart0sh

@mythi renamed in the code. Will update docs if/when all tests pass.

bart0sh avatar May 22 '24 15:05 bart0sh

@mythi done: https://github.com/intel/intel-device-plugins-for-kubernetes/pull/1745/commits/e58369ed13389b4971d3ae4e22914c3a98b8ad51

bart0sh avatar May 22 '24 16:05 bart0sh

@mythi done: e58369e

looks good!

mythi avatar May 23 '24 05:05 mythi

@tkatila Can you review this PR please?

bart0sh avatar May 23 '24 09:05 bart0sh

@tkatila

As this change removes the old prestart hook approach, should the README note that if one still wants to use that way, he or she must use <=0.30.0 version of the FPGA components?

New approach designed to be a replacement for the old one. It shouldn't require any configuration or other changes comparing to the previous one. I'd consider it as an internal change and wouldn't add anything to the README as it might confuse users. However, if you think it's needed, I'd be happy to do that. Just let me know.

bart0sh avatar May 24 '24 08:05 bart0sh

New approach designed to be a replacement for the old one. It shouldn't require any configuration or other changes comparing to the previous one. I'd consider it as an internal change and wouldn't add anything to the README as it might confuse users. However, if you think it's needed, I'd be happy to do that. Just let me know.

In that case, I don't think it's required.

tkatila avatar May 24 '24 08:05 tkatila

@uniemimu and/or @hj-johannes-lee can you review this? Seems to require one more approval.

tkatila avatar May 24 '24 08:05 tkatila

It shouldn't require any configuration or other changes comparing to the previous one.

@bart0sh has it also been unconditionally enabled in kubelet since it was added?

mythi avatar May 24 '24 10:05 mythi

@mythi yes, it's graduated to Beta in Kubernetes in 1.29 and since then it's enabled by default.

bart0sh avatar May 27 '24 15:05 bart0sh

@mythi yes, it's graduated to Beta in Kubernetes in 1.29 and since then it's enabled by default.

In section "Configuring CRI runtimes" we add some comments how the feature is available. We don't say much about kubelet itself. In theory there's a gap but I don't think it's important at all since by the time we release this, the oldest k8s we "support" is 1.29 which has it enabled.

mythi avatar May 28 '24 06:05 mythi

@mythi If we'll release this earlier, I'll add a notice about Kubelet version to the documentation as a separate PR.

@mythi @tkatila Is anything else still needed to merge this PR? I'm asking that this is a show-stopper for graduating the feature to GA in Kubernetes.

bart0sh avatar May 28 '24 06:05 bart0sh

Nothing from my side. Good to go.

tkatila avatar May 28 '24 06:05 tkatila

@kad

in documentation part, write down explicitly requirements for CDI mode: k8s,containerd/cri-o.

CRI-O and Containerd are mentioned here. Would it make sense to add version info for CRI-O, Containerd and Kubernetes there? They only make sense for the hook, so it looks like a good place for me. WDYT?

bart0sh avatar May 28 '24 10:05 bart0sh

@kad

in documentation part, write down explicitly requirements for CDI mode: k8s,containerd/cri-o.

CRI-O and Containerd are mentioned here. Would it make sense to add version info for CRI-O, Containerd and Kubernetes there? They only make sense for the hook, so it looks like a good place for me. WDYT?

there or in overall FPGA plugin README, stating that programming mode is available for systems with CDI enabled, which means k8s+containerd/cri-o of not less than....

kad avatar May 28 '24 11:05 kad