Cinder-VR

High CPU usage due to time spent in OpenGL driver

Open fschliem opened this issue 8 years ago • 4 comments

This may not be specific to Cinder-VR, but it is very obvious when looking at frame timings in the SteamVR developer window.

First - my system: i7 6700, ATI R390 4 GB, 64 GB RAM, Win10.

I tried this on one of the samples, controllerIntermediate, which doesn't do much in terms of rendering (nothing fancy, not too many items). It should need almost zero CPU beyond the initial setup, and minimal GPU.

The average frame breakdown for controllerIntermediate is:

CPU:
  Idle                     ~4ms
  Compositor             ~0.2ms
  Application (other)      ~6ms   <<< This should be ZERO!
  Application (scene)      ~1ms
  Late Start:               0ms

GPU:
  Idle                     ~7ms
  Other                  ~0.2ms
  Compositor               ~1ms
  Application (other)      ~3ms
  Application (scene)      ~0ms

By comparison, Google Earth does:

CPU:
  Idle                   ~8.5ms
  Compositor             ~0.2ms
  Application (other)      ~0ms
  Application (scene)    ~2.5ms
  Late Start:               0ms

GPU:
  Idle                     ~8ms
  Other                  ~0.1ms
  Compositor               ~1ms
  Application (other)      ~3ms
  Application (scene)      ~0ms

And The Lab which uses DirectX:

CPU:
  Idle                     ~9ms
  Compositor             ~0.2ms
  Application (other)      ~0ms
  Application (scene)    ~1.8ms
  Late Start:               0ms

GPU:
  Idle                     ~3ms
  Other                  ~0.1ms
  Compositor               ~1ms
  Application (other)      ~7ms
  Application (scene)      ~0ms

CodeXL (http://gpuopen.com/compute-product/codexl/) profiles show that controllerIntermediate spends most of its CPU time in my driver (DrvPresentBuffers), while Earth spends most of its CPU time in the app itself.

Even though both The Lab and Earth use way more of the GPU, their CPU usage is minimal.

The same goes for the OpenVR hello_vr sample, which uses plain OpenGL and SDL:

CPU:
  Idle                    ~10ms
  Compositor             ~0.2ms
  Application (other)      ~0ms
  Application (scene)    ~0.8ms
  Late Start:               0ms

GPU:
  Idle                     ~7ms
  Other                  ~0.1ms
  Compositor               ~1ms
  Application (other)      ~3ms
  Application (scene)      ~0ms

For the complexity of the samples, I'd expect them to be on par with hello_vr, perhaps with a 10% perf penalty for using a convenient library like Cinder. Right now the penalty is about 700%.
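To make the size of that penalty concrete, here is a minimal sketch of the arithmetic using the approximate CPU figures quoted above. The struct and function names are made up for illustration, and "application time" is assumed to mean the sum of the two Application rows.

```cpp
#include <cassert>

// Approximate per-frame CPU application time in milliseconds, as read
// off the SteamVR frame timing display (values are rough estimates).
struct CpuFrameTime {
	double otherMs; // "Application (other)"
	double sceneMs; // "Application (scene)"
};

// Total CPU time attributed to the application for one frame.
double applicationMs( const CpuFrameTime &t )
{
	return t.otherMs + t.sceneMs;
}

// Extra CPU cost of `sample` relative to `baseline`, as a percentage.
double penaltyPercent( const CpuFrameTime &sample, const CpuFrameTime &baseline )
{
	assert( applicationMs( baseline ) > 0.0 ); // avoid dividing by zero
	return ( applicationMs( sample ) / applicationMs( baseline ) - 1.0 ) * 100.0;
}
```

With controllerIntermediate at `{ 6.0, 1.0 }` and hello_vr at `{ 0.0, 0.8 }`, `penaltyPercent` comes out at roughly 775%, which is in the same ballpark as the "about 700%" estimate.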

Given that all this time is going to the driver, my guess is that some poor configuration setting is forcing the driver down a slow path for no good reason, or something along those lines. I'm opening this issue hoping that someone more familiar with the interaction between OpenGL and Cinder has a better answer.

Thanks.

fschliem avatar Mar 01 '17 23:03 fschliem

What does the hello_vr sample draw?

chaoticbob avatar Mar 02 '17 02:03 chaoticbob

I thought you modeled Cinder-VR off that sample ...

It produces a 20x20x20 matrix of cubes that have a large 4k by 4k texture, while you're in the center of the volume.

The main loop calls a Render method for each eye render target FBO, then blits them to a resolved target for each eye and sends them to the compositor. The Render method for each eye is a glDrawArrays call for most of the scene with a simple textured shader, followed by some GL_LINES for the controller axes, followed by the same shaders you have in Cinder-VR for the controllers and base station rendering.

I see the same timing profile for all the samples. I went into the draw method of teleportationBasic and reduced it to just:

void TeleportBasicApp::draw()
{
	if( mHmd ) {
		mHmd->bind();
		for( auto eye : mHmd->getEyes() ) {
			mHmd->enableEye( eye );	
			//drawScene();
			//mHmd->drawControllers( eye );
		}
		mHmd->unbind();
		mHmd->submitFrame();
	}
}

... so as to remove the rendering-specific cost and just look at the CPU usage for the bootstrap code, and I get the same timing profile. This is a no-op: the headset mirror just draws the chaperone bounds on top of black, yet I see the same high CPU usage.
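One way to narrow this down further would be to time each call in the stripped draw() separately. The following is only a sketch: `StageTimer` is a hypothetical helper, not part of Cinder or Cinder-VR, and it measures wall-clock time, so a call that blocks waiting on the compositor or driver will show up as expensive even if it burns no CPU.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: accumulates wall-clock time per named stage.
class StageTimer {
private:
	struct Entry {
		double      totalMs = 0.0;
		std::size_t samples = 0;
	};

public:
	using Clock = std::chrono::steady_clock;

	// RAII guard: measures from its construction to its destruction.
	class Scope {
	public:
		Scope( StageTimer &owner, std::string label )
			: mOwner( owner ), mLabel( std::move( label ) ), mStart( Clock::now() ) {}
		~Scope()
		{
			const std::chrono::duration<double, std::milli> elapsed = Clock::now() - mStart;
			Entry &e = mOwner.mEntries[mLabel];
			e.totalMs += elapsed.count();
			e.samples += 1;
		}
		Scope( const Scope & ) = delete;
		Scope &operator=( const Scope & ) = delete;

	private:
		StageTimer       &mOwner;
		std::string       mLabel;
		Clock::time_point mStart;
	};

	// Number of measurements recorded for `label` (0 if never measured).
	std::size_t count( const std::string &label ) const
	{
		auto it = mEntries.find( label );
		return it == mEntries.end() ? 0 : it->second.samples;
	}

	// Mean duration in milliseconds for `label` (0 if never measured).
	double averageMs( const std::string &label ) const
	{
		auto it = mEntries.find( label );
		if( it == mEntries.end() || it->second.samples == 0 )
			return 0.0;
		return it->second.totalMs / static_cast<double>( it->second.samples );
	}

private:
	std::map<std::string, Entry> mEntries;
};

// Possible use inside the stripped draw() (gTimer is a hypothetical global):
//   { StageTimer::Scope s( gTimer, "bind" );        mHmd->bind(); }
//   { StageTimer::Scope s( gTimer, "unbind" );      mHmd->unbind(); }
//   { StageTimer::Scope s( gTimer, "submitFrame" ); mHmd->submitFrame(); }
```

Comparing `averageMs( "submitFrame" )` against the other stages after a few hundred frames would show whether the ~6ms sits in the submit path or elsewhere.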

fschliem avatar Mar 02 '17 06:03 fschliem

Sorry to nitpick: when you say "the same", are you referring to your previous estimates or the hello_vr times?

The drawing code inside the Cinder samples is not efficient at all. When I wrote them, I didn't have time to cache the geometry to be GL friendly. The cost of constructing and uploading buffers is incurred for each shape at every frame.
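The difference between rebuilding geometry every frame and caching it can be illustrated with a small stand-alone sketch. `FakeGpu` below is a made-up stand-in that only counts operations; it is not Cinder or OpenGL code. In a real Cinder app the cached form would typically be something like a `gl::Batch` or `gl::VboMesh` created once in setup().

```cpp
#include <cstddef>
#include <vector>

// Made-up stand-in for a GPU: it only counts uploads and draws.
struct FakeGpu {
	std::size_t uploads = 0;
	std::size_t draws = 0;

	// Pretends to create a vertex buffer and returns a handle to it.
	int upload( const std::vector<float> & /*vertices*/ )
	{
		++uploads;
		return static_cast<int>( uploads );
	}

	void draw( int /*handle*/ ) { ++draws; }
};

// 36 vertices (12 triangles) with 3 floats each, zero-filled placeholder data.
std::vector<float> makeCubeVertices()
{
	return std::vector<float>( 36 * 3, 0.0f );
}

// Uncached: geometry is constructed and uploaded on every call.
void drawCubeUncached( FakeGpu &gpu )
{
	gpu.draw( gpu.upload( makeCubeVertices() ) );
}

// Cached: geometry is constructed and uploaded once, then reused.
class CachedCube {
public:
	explicit CachedCube( FakeGpu &gpu )
		: mGpu( gpu ), mHandle( gpu.upload( makeCubeVertices() ) ) {}

	void draw() { mGpu.draw( mHandle ); }

private:
	FakeGpu &mGpu;
	int      mHandle;
};
```

Over 90 frames, `drawCubeUncached` performs 90 uploads, while a `CachedCube` performs one upload and 90 draws.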

chaoticbob avatar Mar 02 '17 17:03 chaoticbob

  1. All Cinder-VR samples have high CPU usage that matches this frame timing profile on my machine:
CPU:
  Idle                     ~4ms
  Compositor             ~0.2ms
  Application (other)      ~6ms   <<< This should be ZERO!
  Application (scene)      ~1ms
  Late Start:               0ms

GPU:
  Idle                     ~7ms
  Other                  ~0.2ms
  Compositor               ~1ms
  Application (other)      ~3ms
  Application (scene)      ~0ms

As noted above, the 6ms (>50% of the allotted CPU time) should be closer to zero. This is the issue.

  2. When I strip the TeleportBasicApp sample to no-op the draw calls (so that all we have is the boilerplate mechanism to submit frames to the compositor), I get the exact same high CPU usage timing profile as all the other Cinder-VR samples. This shows that the problem lies in the boilerplate infrastructure and not in any of the sample-specific rendering.

  3. As a point of comparison, I profiled some other apps that use OpenGL and OpenVR: Google Earth and the OpenVR hellovr_opengl sample. The CPU usage in those profiles (provided above) is much smaller than that of the stripped-down boilerplate code above.
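For reference, the ">50% of the allotted CPU time" figure in point 1 can be checked with a short calculation. This sketch assumes a 90 Hz headset, which is not stated in the thread but is consistent with the CPU rows summing to roughly 11ms; the function names are illustrative.

```cpp
// Frame budget in milliseconds for a display refreshing at `hz`.
constexpr double frameBudgetMs( double hz )
{
	return 1000.0 / hz;
}

// Share of one frame's budget consumed by `ms`, as a percentage.
constexpr double budgetSharePercent( double ms, double hz )
{
	return ms / frameBudgetMs( hz ) * 100.0;
}
```

At 90 Hz the budget is about 11.1ms per frame, so the ~6ms in "Application (other)" is about 54% of it.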

fschliem avatar Mar 02 '17 18:03 fschliem