bump github.com/NVIDIA/k8s-device-plugin to v0.15.0

Open morlay opened this issue 1 year ago • 13 comments

What type of PR is this?

What this PR does / why we need it:

Updates the dependency to support WSL. See https://github.com/NVIDIA/k8s-device-plugin/issues/646

Which issue(s) this PR fixes: No

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

morlay avatar May 06 '24 06:05 morlay

Hi @morlay, Thanks for your pull request! If the PR is ready, use the /auto-cc command to assign Reviewer to Review. We will review it shortly.

Instructions for interacting with me using comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the gh-ci-bot repository.

github-actions[bot] avatar May 06 '24 06:05 github-actions[bot]

/auto-cc

morlay avatar May 06 '24 06:05 morlay

Thanks for the contribution. Can you upload the test results you ran after building the new version?

At present, HAMi does not have a complete e2e test system, but one is planned.

wawa0210 avatar May 07 '24 02:05 wawa0210

> Thanks for the contribution. Can you upload the test results you ran after building the new version?

It works well with my custom image build (a multi-arch image).

image

WSL is not tested.

I will test arm64 when the device is ready.

By the way, the v2.3.10 monitor container panics with the following error (so I patched from v2.3.9):

I0507 02:42:26.842928 3374747 metrics.go:324] Initializing metrics for vGPUmonitor
I0507 02:42:26.843779 3374747 pathmonitor.go:159] server listening at [::]:9395
I0507 02:42:31.914854 3374747 pathmonitor.go:126] Adding ctr dirname /usr/local/vgpu/containers/2c9e5e40-317d-4daf-8b24-14bff598fb3b_svc in monitorpath
I0507 02:42:31.914885 3374747 pathmonitor.go:56] Checking path /usr/local/vgpu/containers/2c9e5e40-317d-4daf-8b24-14bff598fb3b_svc
unexpected fault address 0x7f0cb036073c
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7f0cb036073c pc=0x16219b1]

goroutine 75 [running]:
runtime.throw({0x1a7037b?, 0x7f0cb023c000?})
	/usr/local/go/src/runtime/panic.go:1077 +0x5c fp=0xc00046db48 sp=0xc00046db18 pc=0x448cfc
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:858 +0x116 fp=0xc00046dba8 sp=0xc00046db48 pc=0x45ee36
main.mmapcachefile({0xc0000261c0, 0x6e}, 0xc00046dd68)
	/k8s-vgpu/cmd/vGPUmonitor/cudevshr.go:146 +0x151 fp=0xc00046dc90 sp=0xc00046dba8 pc=0x16219b1
main.getvGPUMemoryInfo(0xc00046dd68)
	/k8s-vgpu/cmd/vGPUmonitor/cudevshr.go:154 +0x31 fp=0xc00046dcc0 sp=0xc00046dc90 pc=0x1621b51
main.checkfiles({0xc0008895e0, 0x43})
	/k8s-vgpu/cmd/vGPUmonitor/pathmonitor.go:79 +0x23b fp=0xc00046dd98 sp=0xc00046dcc0 pc=0x16255bb
main.monitorpath(0x12a05f200?)
	/k8s-vgpu/cmd/vGPUmonitor/pathmonitor.go:127 +0x3ba fp=0xc00046df60 sp=0xc00046dd98 pc=0x1625cba
main.watchAndFeedback()
	/k8s-vgpu/cmd/vGPUmonitor/feedback.go:277 +0x91 fp=0xc00046dfe0 sp=0xc00046df60 pc=0x16229f1
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00046dfe8 sp=0xc00046dfe0 pc=0x47afa1
created by main.main in goroutine 1
	/k8s-vgpu/cmd/vGPUmonitor/main.go:36 +0x137

morlay avatar May 07 '24 02:05 morlay

/lgtm

/cc @archlitchi

wawa0210 avatar May 10 '24 03:05 wawa0210

/approve

wawa0210 avatar May 17 '24 02:05 wawa0210

/approve cancel

wawa0210 avatar May 17 '24 02:05 wawa0210

/approve

wawa0210 avatar May 17 '24 02:05 wawa0210

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: morlay, wawa0210

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

hami-robott[bot] avatar May 17 '24 02:05 hami-robott[bot]

/kind cleanup

calvin0327 avatar May 17 '24 06:05 calvin0327

Please resolve the conflicting files, and then we are ready to merge.

archlitchi avatar Jul 01 '24 08:07 archlitchi

@archlitchi done.

morlay avatar Jul 03 '24 05:07 morlay

/lgtm

archlitchi avatar Jul 25 '24 06:07 archlitchi