Support for NPU devices

Open adrianboguszewski opened this issue 1 year ago • 6 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Is your feature request related to a problem? Please describe.

No response

Describe the solution you'd like

Many modern CPUs, e.g. the Intel Core Ultra 7 155H, include integrated NPUs (Neural Processing Units). It would be nice to display the utilization of these devices in Resources, as Windows Task Manager does.

Describe alternatives you've considered

No response

Additional context

image

adrianboguszewski avatar Jul 22 '24 13:07 adrianboguszewski

Hi, thanks for the issue. I believe you forgot to change the name when copy-pasting; this repo isn't quite Mission Center. ^^ If Intel exposes those statistics, it should (hopefully) be fairly straightforward to implement. I'll see what I can do and what the kernel offers. :)

nokyan avatar Jul 22 '24 15:07 nokyan

@nokyan Oh, you're right, changed :)

@jwludzik do we expose that kind of information (e.g. utilization) in the driver?

adrianboguszewski avatar Jul 23 '24 11:07 adrianboguszewski

Linux NPU driver repo: https://github.com/intel/linux-npu-driver

adrianboguszewski avatar Jul 23 '24 18:07 adrianboguszewski

Hello @adrianboguszewski @nokyan,

I am going to prepare an example of how to measure NPU utilization by next week.

m-falkowski avatar Jul 26 '24 14:07 m-falkowski

Hi @adrianboguszewski @nokyan,

NPU utilization can be calculated from the device's sysfs file npu_busy_time_us, which contains the time the device has spent executing jobs. The NPU is considered 'busy' from the moment the first job is submitted to the firmware until there are no more jobs pending or executing.

To measure utilization, either take the difference between two npu_busy_time_us readings to see how long the NPU was active during a workload, or read npu_busy_time_us periodically to monitor the utilization percentage.

You may see the commit introducing this feature: accel/ivpu: Share NPU busy time in sysfs

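Here is a sample Bash script that demonstrates how to compute utilization:

#!/usr/bin/env bash
# The PCI address in this path may differ on your system.
NPU_BUSY_TIME_PATH="/sys/devices/pci0000:00/0000:00:0b.0/npu_busy_time_us"
# Sampling period in whole seconds; defaults to 1 if not set in the environment.
SAMPLING_PERIOD="${SAMPLING_PERIOD:-1}"
TIME_1=$(cat "$NPU_BUSY_TIME_PATH")
while true; do
 sleep "$SAMPLING_PERIOD"
 TIME_2=$(cat "$NPU_BUSY_TIME_PATH")
 clear
 DELTA=$((TIME_2 - TIME_1))
 echo "NPU busy time: $TIME_2 us"
 echo "NPU busy time delta: $DELTA us"
 # Busy microseconds divided by elapsed microseconds, as an integer percentage.
 echo "NPU Utilization: $(( 100 * DELTA / (SAMPLING_PERIOD * 1000000) ))%"
 TIME_1=$TIME_2
done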
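
For readers who prefer Rust, the same calculation can be sketched roughly as follows. This is only an illustration of the sampling idea described above, not code from the driver or from Resources; the function names `read_busy_time_us` and `utilization_percent` are hypothetical, and the sysfs path has to be supplied by the caller.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::Duration;

/// Reads a cumulative busy-time counter (in microseconds) from a sysfs file.
/// Hypothetical helper: the caller passes the path to npu_busy_time_us.
pub fn read_busy_time_us(path: &Path) -> io::Result<u64> {
    let text = fs::read_to_string(path)?;
    text.trim()
        .parse::<u64>()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

/// Converts two counter samples and the wall-clock time between them
/// into a utilization percentage, clamped to the range 0..=100.
pub fn utilization_percent(prev_busy_us: u64, curr_busy_us: u64, elapsed: Duration) -> f64 {
    let elapsed_us = elapsed.as_micros() as f64;
    if elapsed_us == 0.0 {
        // No time has passed, so no meaningful ratio can be formed.
        return 0.0;
    }
    // saturating_sub guards against a counter that appears to go backwards.
    let delta_us = curr_busy_us.saturating_sub(prev_busy_us) as f64;
    (100.0 * delta_us / elapsed_us).clamp(0.0, 100.0)
}
```

A caller would read the counter once, wait for the sampling period, read it again, and pass both values plus the measured elapsed time to `utilization_percent`. Measuring the elapsed time instead of assuming the sleep duration keeps the result stable when the loop is delayed.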

m-falkowski avatar Aug 12 '24 15:08 m-falkowski

@m-falkowski Thanks a lot! I'm afraid NPU support won't make it in Resources 1.6 though which will be released in about two weeks. I'll start implementing it after the release. :)

nokyan avatar Aug 12 '24 16:08 nokyan

Hi there, I've finally gotten around to implementing NPU support. Do you mind checking out the npu branch and seeing if it works?

nokyan avatar Oct 10 '24 18:10 nokyan

Hi @nokyan ,

Thank you for this change! I tested it and it works reliably, but it calculates the percentage utilization incorrectly. The utilization was not scaled, so it was shown in thousands of percent.

https://github.com/nokyan/resources/blob/npu/src/utils/npu/intel.rs#L67-L82

    fn usage(&self) -> Result<f64> {
        let last_timestamp = self.last_busy_time_timestamp.get();
        let last_busy_time = self.last_busy_time_us.get();

        let new_timestamp = unix_as_millis();
        let new_busy_time = self
            .read_device_int("npu_busy_time_us")
            .map(|int| int as usize)?;

        self.last_busy_time_timestamp.set(new_timestamp);
        self.last_busy_time_us.set(new_busy_time);

        let delta_timestamp = new_timestamp.saturating_sub(last_timestamp) as f64;
        let delta_busy_time = new_busy_time.saturating_sub(last_busy_time) as f64;

        Ok(delta_busy_time / delta_timestamp)
    }

Changing the last division to Ok((delta_busy_time / delta_timestamp) / 1000.0) resolves the issue, and the utilization is then displayed correctly, as shown below:

utilization
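
The factor of 1000 comes from a unit mismatch: the busy time is counted in microseconds while the timestamps are in milliseconds, so the raw quotient is 1000 times the true busy fraction. A minimal standalone sketch of the scaled calculation, assuming those units (the function name `busy_fraction` is hypothetical and this is not the Resources code itself):

```rust
/// Busy fraction (0.0..=1.0) computed from a busy-time delta in microseconds
/// and an elapsed-time delta in milliseconds.
pub fn busy_fraction(delta_busy_us: f64, delta_elapsed_ms: f64) -> f64 {
    if delta_elapsed_ms <= 0.0 {
        // Avoid dividing by zero when two samples share the same timestamp.
        return 0.0;
    }
    // us / ms is 1000 times the true ratio, so scale it back down.
    ((delta_busy_us / delta_elapsed_ms) / 1000.0).clamp(0.0, 1.0)
}
```

For example, 250,000 us of busy time over 1,000 ms gives a raw quotient of 250, which the scaling turns into a fraction of 0.25.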

m-falkowski avatar Oct 18 '24 16:10 m-falkowski

Looks good! I'll fix that soon. Unfortunately the driver doesn't seem to expose anything but the compute utilization, no memory usage or frequencies yet. :/

nokyan avatar Oct 18 '24 16:10 nokyan

@nokyan I'm happy to see that :) When can we expect the next release with this feature on board?

adrianboguszewski avatar Oct 23 '24 12:10 adrianboguszewski

I plan to release 1.7 with NPU support probably on 29 November :)

nokyan avatar Oct 23 '24 12:10 nokyan

How can I get the NPU listed in the performance layout? My Ubuntu 24.04 does not even identify the device.

Martin-HZK avatar Oct 25 '24 12:10 Martin-HZK

I can't help you without any information. What NPU do you have? Are you trying out the latest commit on GitHub or the current release on Flathub? The current release on Flathub doesn't have this feature yet, it will be included with the next release. What kernel do you currently use? Please send me the output of uname -a in your terminal.

nokyan avatar Oct 25 '24 13:10 nokyan

Thank you for your timely reply! The detailed system information is as follows:

I am currently using an Intel Corporation Meteor Lake NPU, and the driver I installed is Linux NPU Driver v1.8.0 from the GitHub releases. The OS is Ubuntu 24.04 LTS.

$ uname -a
Linux hzk-Martin 6.8.1+ #2 SMP PREEMPT_DYNAMIC Thu Oct 24 18:31:43 CST 2024 x86_64 x86_64 x86_64 GNU/Linux

Martin-HZK avatar Oct 25 '24 13:10 Martin-HZK

Are you using the current production release of Resources on Flathub or are you building and using the latest commit from GitHub?

nokyan avatar Oct 25 '24 13:10 nokyan

I strictly followed the installation guidelines on GitHub, using the corresponding release version.

Martin-HZK avatar Oct 25 '24 13:10 Martin-HZK

Could you run Resources from your terminal with the environment variable RUST_LOG=resources=debug set and send me the output?

nokyan avatar Oct 25 '24 13:10 nokyan

Sorry, my mistake. The 'Resources' application I am currently running was installed via Flatpak.

Martin-HZK avatar Oct 25 '24 13:10 Martin-HZK

Thanks, no worries. The current release of Resources on Flathub does not have the NPU feature yet.

nokyan avatar Oct 25 '24 13:10 nokyan

So do any of the releases on GitHub support the NPU? I didn't see any release that mentions NPU. Otherwise, which commit hash would you recommend for NPU profiling?

Martin-HZK avatar Oct 25 '24 14:10 Martin-HZK

The NPU support is already in the main branch, just not in a production release yet. The branch for NPU support has been merged in commit 5583d7d64d6b0f8d4ee0a5b639505a887341b462. In the README you can find instructions on how to build the latest commit of Resources yourself. It boils down to cloning the repo and running this in your terminal while being in the repo's root:

flatpak install org.gnome.Sdk//47 org.freedesktop.Sdk.Extension.rust-stable//24.08 org.gnome.Platform//47 org.freedesktop.Sdk.Extension.llvm18//24.08
flatpak-builder --user flatpak_app build-aux/net.nokyan.Resources.Devel.json
flatpak-builder --run flatpak_app build-aux/net.nokyan.Resources.Devel.json resources

Release 1.7, which will ship NPU support, will most likely be released to Flathub on 29 November and will of course also get a GitHub tag and release.

nokyan avatar Oct 25 '24 15:10 nokyan

The newest commit returns a result like this:

image

Why does this happen? It seems that the NPU info is not successfully collected with commit 215dd36.

Martin-HZK avatar Oct 25 '24 16:10 Martin-HZK

You are running Linux 6.8, which I believe does not contain the sysfs interface that allows Resources to track usage for Intel NPUs. You could try updating your kernel. It could also be that Ubuntu 24.04 just does not offer a kernel that's new enough for this feature. In that case, you could either try upgrading to Ubuntu 24.10 or manually install a newer kernel, though the latter can be risky if you're not experienced with that.

nokyan avatar Oct 25 '24 16:10 nokyan

Does that mean I cannot access the NPU statistics even by rebuilding the current Linux kernel?

Martin-HZK avatar Oct 25 '24 16:10 Martin-HZK

Yes. Ubuntu releases are tied to specific kernel versions, see "Kernel release schedule" at https://ubuntu.com/kernel/lifecycle. The NPU utilization solution used in nokyan/resources (many thanks to @nokyan) relies on the patch 0adff3b0ef12483a79dc8415b94547853d26d1f3, which was merged into Linux kernel v6.11. I am sorry for the inconvenience. The main recommendation is to use Ubuntu 24.10, as @nokyan mentioned previously.
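
As a rough illustration of the version requirement above, a kernel release string such as the one printed by uname -r could be compared against 6.11 along these lines. The helper names `parse_major_minor` and `kernel_at_least` are hypothetical, and a version check is only a heuristic: distribution kernels may backport patches, so checking whether the npu_busy_time_us file actually exists is the more direct test.

```rust
/// Parses the leading "major.minor" of a kernel release string such as
/// "6.8.1+" or "6.11.0-8-generic". Returns None if it cannot be parsed.
pub fn parse_major_minor(release: &str) -> Option<(u32, u32)> {
    let mut parts = release.split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    // The minor component may carry a suffix (e.g. "11-rc1"), so keep
    // only its leading digits.
    let minor_digits: String = parts
        .next()?
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    let minor: u32 = minor_digits.parse().ok()?;
    Some((major, minor))
}

/// Returns true if the release string is at least the given major.minor.
/// Unparseable strings are treated as "too old".
pub fn kernel_at_least(release: &str, major: u32, minor: u32) -> bool {
    parse_major_minor(release).map_or(false, |version| version >= (major, minor))
}
```

With the release string from the uname output earlier in this thread, "6.8.1+", `kernel_at_least(release, 6, 11)` evaluates to false, which matches the missing sysfs file reported above.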

jwludzik avatar Oct 25 '24 17:10 jwludzik

Thank you for your advice! The problem was solved by upgrading the kernel to version 6.11.

Martin-HZK avatar Oct 26 '24 03:10 Martin-HZK

Wow! It looks super cool!

Screenshot From 2024-10-30 14-43-51

Thanks, @nokyan, for implementing that!

adrianboguszewski avatar Oct 30 '24 13:10 adrianboguszewski

I'm glad you like it!

nokyan avatar Oct 30 '24 15:10 nokyan