[Feature Request] GPU waittime between frames
Hi,
It would be great if we could have a column that gives the time between the gpu finishing a frame and starting work on the next frame.
(cpustarttime+gpulatency)-(PREVcpustarttime+PREVdisplaylatency)
or to avoid problems when there is composition going on
(cpustarttime+gpulatency)-(PREVcpustarttime+PREVgpulatency+PREVgpubusy+PREVgpuwait)
This would be useful to determine whether one is cpu or gpu bound. A value of 0 means gpu bound, a value greater than 0 means cpu bound. I think this beats other methods of determining gpu vs cpu limited scenarios, like comparing frametime to gpubusy, which is not very precise.
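For reference, the second formula can be computed offline from logged data. The following is a minimal sketch, assuming each frame is a dict of millisecond values keyed by the names used in this post (`cpustarttime`, `gpulatency`, `gpubusy`, `gpuwait`), which may not match the actual CSV headers. The function name and the clamping of negative gaps to zero are illustrative assumptions, not PresentMon behavior.

```python
def gpu_idle_between_frames(frames):
    """Return the GPU idle gap (ms) before each frame after the first.

    gap = (cpustarttime + gpulatency)
          - (PREV cpustarttime + PREV gpulatency + PREV gpubusy + PREV gpuwait)
    """
    gaps = []
    for prev, cur in zip(frames, frames[1:]):
        # Time at which the GPU starts working on the current frame.
        gpu_start = cur["cpustarttime"] + cur["gpulatency"]
        # Time at which the GPU finished the previous frame.
        prev_gpu_end = (prev["cpustarttime"] + prev["gpulatency"]
                        + prev["gpubusy"] + prev["gpuwait"])
        # Assumption: clamp at zero so overlapping GPU work does not
        # produce a negative gap.
        gaps.append(max(0.0, gpu_start - prev_gpu_end))
    return gaps


# Hypothetical sample data, not taken from a real capture.
frames = [
    {"cpustarttime": 0.0, "gpulatency": 2.0, "gpubusy": 8.0, "gpuwait": 0.0},
    {"cpustarttime": 10.0, "gpulatency": 1.0, "gpubusy": 8.0, "gpuwait": 0.0},
    {"cpustarttime": 18.0, "gpulatency": 1.0, "gpubusy": 8.0, "gpuwait": 0.0},
]
print(gpu_idle_between_frames(frames))  # [1.0, 0.0]
```

In this made-up data the GPU sits idle for 1ms before the second frame (cpu bound by the reasoning above) and starts the third frame with no gap (gpu bound).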
Writing the CSV is a significant portion of PresentMon's power/performance overhead so I try to be careful there: while not a strict rule, I prefer not to add columns that are easily computed from existing data, unless they are essentially always needed.
This proposed metric will only help in cases where the GPU is idle between the last and first command packet, which is not always/generally the case, and it's not clear GPU waits there are any more important than waits elsewhere. Do you have some examples that clarify how the above metric is helpful beyond the existing GPUWait?
The metric would show whether you are in a cpu limited or gpu limited scenario. If the gpu waits 1ms between finishing a frame and starting work on the next, it means the gpu is waiting (for 1ms) on the cpu, so the scenario is definitely cpu limited. If on the other hand the time between the gpu finishing one frame and starting the next is 0, that means the gpu is constantly fed and your framerate is limited by the gpu.
There are other ways to determine whether you're cpu or gpu bound. In the intel presentmon beta they visualize it with a graph where frametime and gpu busy are overlaid. If gpu busy is greater than frametime, it's an indicator that you are gpu limited. This method is imprecise because the gpu doesn't always start its work at the same time (for example, sometimes the gpu starts work before present, sometimes after). Even if gpu busy is slightly below frametime, you can still be gpu limited. I confirmed this by comparing with frameview, which shows "renderqueuedepth", another way to see if you're gpu or cpu capped (renderqueuedepth above 1.0 means the gpu is the limiting factor).
The proposed metric would be a very accurate way of determining if you are cpu/gpu limited, and if added to presentmon it could be used in overlays to see it live.
But what if the GPU waits 1ms in the middle of the frame? That should lead to the same conclusion, but your new metric will ignore it. GPUWait will report 1ms in both cases.
Idle GPU in between frames can also be due to display stalls, as opposed to CPU dependencies, so prioritizing waits in that particular place does not seem like a better approach.
> But what if the GPU waits 1ms in the middle of the frame? That should lead to the same conclusion, but your new metric will ignore it. GPUWait will report 1ms in both cases.
Oh, I don't mean to replace GPUWait. The new additional metric would only show 1ms if there is 1ms between gpu render finish and gpu render start. From what I saw, GPUWait only shows waits during a frame, not the idle time between 2 frames (even in a cpu bound scenario where the gpu is only active 50% of the time, you still see a lot of frames where GPUWait is 0).
> Idle GPU in between frames can also be due to display stalls, as opposed to CPU dependencies, so prioritizing waits in that particular place does not seem like a better approach.
TBH I have not tested this with v-sync or other sync scenarios, but I would say this is the same as when other metrics imply you are cpu limited: it could be a framelimiter, it could be memory, etc. In the end the metric is still correct. It simply is "GPUidletimebetweenframes" (or whatever name) and it shows just that. It complements other stats like GPUWait (now you see idle times both between and during gpu renders), and as a bonus people can use it at their own discretion to determine whether the gpu or the cpu is the bottleneck.
OK, I see what you mean now thanks. I'm still not sure we need a new metric per se -- how about we just add the GPU time between frames to GPUWait?
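One possible reading of this suggestion, as a rough sketch only: compute the idle gap before each frame and add it to that frame's GPUWait. The dict keys follow the names used earlier in the thread, and the function name, the choice to attribute the gap to the frame that follows it, and the clamping at zero are all assumptions rather than anything PresentMon is confirmed to do.

```python
def gpuwait_including_gap(frames):
    """Return per-frame GPUWait (ms) with the preceding idle gap added.

    The first frame has no predecessor, so its GPUWait is unchanged.
    """
    if not frames:
        return []
    combined = [frames[0]["gpuwait"]]
    for prev, cur in zip(frames, frames[1:]):
        # GPU end of the previous frame, using the unmodified GPUWait.
        prev_gpu_end = (prev["cpustarttime"] + prev["gpulatency"]
                        + prev["gpubusy"] + prev["gpuwait"])
        gpu_start = cur["cpustarttime"] + cur["gpulatency"]
        # Assumption: negative gaps (overlapping work) count as zero.
        gap = max(0.0, gpu_start - prev_gpu_end)
        combined.append(cur["gpuwait"] + gap)
    return combined


# Hypothetical sample data, not taken from a real capture.
sample = [
    {"cpustarttime": 0.0, "gpulatency": 2.0, "gpubusy": 8.0, "gpuwait": 0.0},
    {"cpustarttime": 10.0, "gpulatency": 1.0, "gpubusy": 7.0, "gpuwait": 0.5},
]
print(gpuwait_including_gap(sample))  # [0.0, 1.5]
```

Note that the sketch derives the previous frame's GPU end from the original, unmodified GPUWait; once the gap is folded into the reported value, that end point can no longer be recovered from the column alone.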
That would work great for me, but I think other people already using GPUWait for things won't be happy. How would you determine the gpu render end point if you change GPUWait?