CUDA_DISABLE_PERF_BOOST=1 is not actually equivalent to CUDA-Force P2 State = off
On Linux, I set the environment variable CUDA_DISABLE_PERF_BOOST to 1. However, when running CUDA programs, the GPU is still limited to the P2 performance state.
Could you please provide concrete evidence for this claim? Right now it's not reproducible.
I need the following:
- Exact GPU, driver version, CUDA version, and kernel version.
- nvidia-smi output showing the power state staying at P2 during your test.
- The exact command you ran and the environment variables you set.
- A minimal reproducible example (a small CUDA program or workload) where this happens would be helpful; a sketch of what that could look like follows below.
- Any benchmark numbers that demonstrate the alleged performance cap.
Without logs or a reproducible case, there's nothing actionable here, @HanChinese-LIURUI.
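For illustration, a minimal repro could be as small as the sketch below. It is an assumed stand-alone CUDA program, not the reporter's actual workload: it just keeps the GPU busy long enough to watch the reported power state, e.g. with nvidia-smi --query-gpu=pstate --format=csv -l 1 in a second terminal, once with CUDA_DISABLE_PERF_BOOST=1 set and once without.

```cuda
// busy.cu - a minimal sketch (assumed, not the reporter's actual workload).
// Build: nvcc -O2 busy.cu -o busy
// Run:   CUDA_DISABLE_PERF_BOOST=1 ./busy
// Watch: nvidia-smi --query-gpu=pstate --format=csv -l 1   (second terminal)
#include <cstdio>
#include <cuda_runtime.h>

// Kernel that does enough floating-point work to keep the GPU loaded.
__global__ void busywork(float *out, int iters)
{
    float v = 1.0f + threadIdx.x * 1e-3f;
    for (int i = 0; i < iters; ++i)
        v = v * 1.0000001f + 1e-7f;
    out[blockIdx.x * blockDim.x + threadIdx.x] = v;
}

int main()
{
    const int blocks = 1024, threads = 256;
    float *out = nullptr;
    cudaMalloc(&out, blocks * threads * sizeof(float));

    // Keep the GPU busy for a while so the power state can be observed;
    // adjust the loop counts to taste.
    for (int i = 0; i < 600; ++i) {
        busywork<<<blocks, threads>>>(out, 1 << 18);
        cudaDeviceSynchronize();
    }

    cudaFree(out);
    std::printf("done\n");
    return 0;
}
```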
https://forums.developer.nvidia.com/t/cuda-p2-forced-state-from-drivers/353276 This is my post on the official NVIDIA forum. In my testing, setting CUDA_DISABLE_PERF_BOOST only prevents the GPU from being forced into the P2 state during audio and video processing. When running deep learning inference tasks or pure CUDA programs, the GPU is still forced into the P2 state.
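One way to turn this claim into logs would be to poll the performance state from NVML while the CUDA workload runs in another process. The sketch below is an assumed illustration (device index 0, one-second polling), not part of any official tool; it builds with nvcc and links against -lnvidia-ml.

```cuda
// pstate_log.cu - hedged sketch: poll the P-state via NVML while a CUDA
// workload runs in another process.
// Build: nvcc pstate_log.cu -lnvidia-ml -o pstate_log
#include <cstdio>
#include <unistd.h>
#include <nvml.h>

int main()
{
    nvmlDevice_t dev;
    if (nvmlInit_v2() != NVML_SUCCESS ||
        nvmlDeviceGetHandleByIndex_v2(0, &dev) != NVML_SUCCESS) {
        std::fprintf(stderr, "NVML init failed\n");
        return 1;
    }

    // Print the current performance state once per second (0 = P0, 2 = P2, ...).
    for (int t = 0; t < 120; ++t) {
        nvmlPstates_t ps;
        if (nvmlDeviceGetPerformanceState(dev, &ps) == NVML_SUCCESS)
            std::printf("t=%3ds  pstate=P%d\n", t, (int)ps);
        sleep(1);
    }

    nvmlShutdown();
    return 0;
}
```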
While I can believe that it still imposes the P2 maximum level behaviour that some people find problematic, that is not an issue for this vaapi driver - and this readme isn't the primary way someone running machine learning workloads is going to discover the feature.
nothing actionable here, spam
Yes, on Windows you can use the NVIDIA Profile Inspector tool to set CUDA - Force P2 State = off, which lets the GPU run in the P0 state during deep learning inference and stabilizes the inference latency. Some posts claim that inference in the P0 state may cause errors or crashes, but my scenario doesn't require high precision, and I haven't encountered any crashes on Windows so far. I therefore can't understand why NVIDIA defaults CUDA workloads to the P2 state; that state leads to unstable latency in my inference program.
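The latency instability could also be quantified rather than described. The sketch below is a hypothetical stand-in for the actual inference program: it times a fixed kernel with CUDA events over many runs, so the min/max/mean spread under the P2 default can be compared against P0 (e.g. with CUDA - Force P2 State = off on Windows).

```cuda
// jitter.cu - hedged sketch: measure per-iteration kernel latency so the
// spread under P2 vs. P0 can be compared.
// Build: nvcc -O2 jitter.cu -o jitter
#include <cstdio>
#include <cuda_runtime.h>

__global__ void work(float *out, int iters)
{
    float v = 1.0f + threadIdx.x * 1e-3f;
    for (int i = 0; i < iters; ++i)
        v = v * 1.0000001f + 1e-7f;
    out[blockIdx.x * blockDim.x + threadIdx.x] = v;
}

int main()
{
    float *out = nullptr;
    cudaMalloc(&out, 1024 * 256 * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    float min_ms = 1e9f, max_ms = 0.0f, sum_ms = 0.0f;
    const int runs = 200;

    // Time the same kernel repeatedly and track the latency spread.
    for (int i = 0; i < runs; ++i) {
        cudaEventRecord(start);
        work<<<1024, 256>>>(out, 1 << 18);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        if (ms < min_ms) min_ms = ms;
        if (ms > max_ms) max_ms = ms;
        sum_ms += ms;
    }

    std::printf("min %.3f ms  max %.3f ms  mean %.3f ms\n",
                min_ms, max_ms, sum_ms / runs);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(out);
    return 0;
}
```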
How about this method? https://github.com/elFarto/nvidia-vaapi-driver/issues/74#issuecomment-3165300189