
CUDA_DISABLE_PERF_BOOST=1 is not actually equivalent to CUDA - Force P2 State = off

Open HanChinese-LIURUI opened this issue 4 weeks ago • 7 comments

On Linux, I set the environment variable CUDA_DISABLE_PERF_BOOST to 1. However, when running CUDA programs, performance is still capped at the P2 level.
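For reference, a minimal way to check this on Linux is to launch the workload with the variable set and watch the reported performance state in parallel. This is only a sketch: `./my_cuda_program` is a placeholder for whatever CUDA workload is being tested, and it requires an NVIDIA GPU with the proprietary driver installed.

```shell
# Launch the CUDA workload with the variable set.
# (./my_cuda_program is a placeholder for your actual workload.)
CUDA_DISABLE_PERF_BOOST=1 ./my_cuda_program &

# Meanwhile, sample the GPU performance state once per second;
# P0 is the full-performance state, P2 the reduced CUDA state.
nvidia-smi --query-gpu=pstate --format=csv,noheader -l 1
```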

HanChinese-LIURUI avatar Dec 02 '25 08:12 HanChinese-LIURUI

Could you please provide concrete evidence for this claim? Right now it's not reproducible.

I need the following:

  1. Exact GPU, driver version, CUDA version, and kernel version.
  2. nvidia-smi output showing the power state staying at P2 during your test.
  3. The exact command you ran and the environment variables you set.
  4. A minimal reproducible example (a small CUDA program or workload) that triggers the behaviour.
  5. Any benchmark numbers that demonstrate the alleged performance cap.

Without logs or a reproducible case, there's nothing actionable here. @HanChinese-LIURUI
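For item 2, something like the following sketch could collect the P-states observed while the workload runs in another terminal. This is a hypothetical helper script, not part of this driver; it assumes `nvidia-smi` is on the PATH.

```python
import subprocess
import time

def parse_pstates(csv_output: str) -> list[str]:
    """Turn nvidia-smi CSV output (one P-state per GPU per line) into a list like ['P2']."""
    return [line.strip() for line in csv_output.splitlines() if line.strip()]

def sample_pstates(seconds: float = 30.0, interval: float = 1.0) -> set[str]:
    """Poll `nvidia-smi --query-gpu=pstate` and return every P-state observed.

    Run this while the CUDA workload executes in another terminal; if the
    result is {'P2'} with CUDA_DISABLE_PERF_BOOST=1 set, the cap is reproduced.
    """
    seen: set[str] = set()
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=pstate", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
        seen.update(parse_pstates(out))
        time.sleep(interval)
    return seen

# Example (requires an NVIDIA GPU and driver):
#   print(sorted(sample_pstates(seconds=10)))
```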

ManuLinares avatar Dec 06 '25 07:12 ManuLinares

https://forums.developer.nvidia.com/t/cuda-p2-forced-state-from-drivers/353276 This is my comment on the NVIDIA developer forums. In my testing, setting CUDA_DISABLE_PERF_BOOST only prevents the GPU from being forced into the P2 state during audio and video processing. When running deep learning inference tasks or pure CUDA programs, the GPU is still forced into the P2 state.


HanChinese-LIURUI avatar Dec 06 '25 11:12 HanChinese-LIURUI

While I can believe that it still imposes the P2 maximum-level behaviour that some people find problematic, that is not an issue for this VA-API driver, and this README isn't the primary way someone running machine-learning workloads is going to discover the feature.

philipl avatar Dec 06 '25 15:12 philipl

nothing actionable here, spam

ManuLinares avatar Dec 06 '25 16:12 ManuLinares

Yes, on Windows you can use the NVIDIA Profile Inspector tool to set CUDA - Force P2 State = off, allowing the GPU to run in the P0 state during deep learning inference, which stabilizes inference latency. Some posts claim that inference in the P0 state may cause errors or crashes, but the scenario I'm working in doesn't require high precision, and I haven't encountered any crashes on Windows so far. So I can't understand why NVIDIA defaults CUDA to the P2 state; it leads to unstable latency in my inference program.


HanChinese-LIURUI avatar Dec 07 '25 00:12 HanChinese-LIURUI

How about this method? https://github.com/elFarto/nvidia-vaapi-driver/issues/74#issuecomment-3165300189

detiam avatar Dec 07 '25 07:12 detiam

How about this method? #74 (comment)

This is the configuration I'm currently trying.

HanChinese-LIURUI avatar Dec 07 '25 09:12 HanChinese-LIURUI