Added power saving mode (docking branch)

Open corentin-plouet opened this issue 3 years ago • 12 comments

This is the docking/viewport branch version of the power saving mode implemented in #2749. That PR has a lot of background regarding how that feature came to be.

The next steps of the plan are:

  • ~~Make sure I did not screw up the cherry-pick~~
  • Merge in the enhancements made by @bvgastel
  • See how we can make this work properly with multiple viewports (@ocornut's comment provides good guidance; i.e. we should try to make cursor blinking and animation per viewport, but otherwise make any user input refresh all of them).
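
A minimal sketch of the multi-viewport idea in the last bullet, assuming hypothetical bookkeeping that does not exist in Dear ImGui (`ViewportIdleState`, `OnUserInput`, `ScheduleAnimation` and `ComputeWaitTimeout` are illustrative names only): animations wake up only their own viewport, while any user input marks all viewports dirty.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical per-viewport state; not part of the Dear ImGui API.
struct ViewportIdleState
{
    // Absolute time (seconds) of the next animation tick, e.g. a cursor blink.
    double nextAnimationTime = std::numeric_limits<double>::infinity();
    bool   needsRedraw = false;
};

// Any user input refreshes every viewport.
inline void OnUserInput(std::vector<ViewportIdleState>& viewports)
{
    for (ViewportIdleState& vp : viewports)
        vp.needsRedraw = true;
}

// An animation only schedules a wake-up for the viewport that owns it.
inline void ScheduleAnimation(ViewportIdleState& vp, double when)
{
    vp.nextAnimationTime = std::min(vp.nextAnimationTime, when);
}

// How long the event loop may block: 0 if any viewport is dirty,
// otherwise the time until the earliest animation deadline
// (infinity when nothing is scheduled).
inline double ComputeWaitTimeout(const std::vector<ViewportIdleState>& viewports, double now)
{
    assert(now >= 0.0);
    double timeout = std::numeric_limits<double>::infinity();
    for (const ViewportIdleState& vp : viewports)
    {
        if (vp.needsRedraw)
            return 0.0;
        timeout = std::min(timeout, std::max(0.0, vp.nextAnimationTime - now));
    }
    return timeout;
}
```

The returned value is meant to be handed to whatever blocking wait the platform backend offers; which viewports are then actually redrawn is left to the caller.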

corentin-plouet avatar Apr 26 '21 12:04 corentin-plouet

@ocornut / @rokups could we please have the CI enabled for this PR if possible? It's touching examples across multiple platforms, so would be very helpful. Thanks!

corentin-plouet avatar Apr 26 '21 12:04 corentin-plouet

Done. (I didn’t realize workflows now needed approval!)

ocornut avatar Apr 26 '21 13:04 ocornut

I didn’t realize workflows now needed approval!

You can thank crypto for that!

https://github.blog/2021-04-22-github-actions-update-helping-maintainers-combat-bad-actors/

https://layerci.com/blog/crypto-miners-are-killing-free-ci/

PathogenDavid avatar Apr 26 '21 19:04 PathogenDavid

Any improvement regarding the CPU usage? Really, this has been working for ages and should be merged... Without a CPU usage reduction, what is the point of having a blazing-fast GUI when even a calculator app consumes 30% CPU? How can you imagine having 30+ ImGui-based apps running, each consuming 30% CPU?

jlmxyz avatar Aug 27 '21 07:08 jlmxyz

A typical Dear ImGui application consumes next to no CPU. If your application consumes 30% of the CPU, then I suggest looking for possible inefficiencies in your code. Here is what the CPU consumption of a sample application looks like for me: [image]

rokups avatar Aug 30 '21 07:08 rokups

That is not in line with my own observations. For the demo window, I typically get on Linux without this MR:

  • 8-9% CPU usage glfw + opengl3 example;
  • ~6% CPU usage sdl + opengl3 example.

Drops to 0% with my energy branch.

@rokups Some examples/backends/platforms check whether a window is visible and skip drawing when it is not. Is your window visible when you look at the CPU usage?

bvgastel avatar Aug 31 '21 07:08 bvgastel

It is visible. That reading depends on some settings. When the value is "scaled to 100%" it shows 0.0%; when scaling is disabled it shows 2%. It probably also depends on the CPU.

rokups avatar Aug 31 '21 09:08 rokups

For a simple UI, that fixed cost often comes more from the swap, for which mileage may vary depending on drivers/GPU/OS.

Dear ImGui was initially designed as a game overlay so this hasn’t been a priority for a while but I agree this is a feature we should be looking at merging, we will eventually, just juggling with tasks.

ocornut avatar Aug 31 '21 09:08 ocornut

Hi, on my laptop, for the demo window, on the master branch from ocornut, synced today (321b84f01fbf3f64db6dbb53deb707d150ce30dc), I get about 10-12% CPU usage with the glfw + opengl3 example. The issue is that this CPU usage is ALWAYS there, whatever the state of the window:

  • the window can be obscured
  • the window can be reduced to tray
  • the window can be on another workspace

The window does not receive a single event and still consumes CPU. Open 10 demo windows and you have killed the CPU... So imagine a whole desktop with 10 text editors: that kills an i7 CPU. Not sure that is something everyone wants...
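
A minimal sketch of the "don't draw when the window is not visible" check discussed above. `WindowState` and `ShouldRenderFrame` are hypothetical names; with GLFW, the first two fields could be filled from `glfwGetWindowAttrib(window, GLFW_ICONIFIED)` and `glfwGetWindowAttrib(window, GLFW_VISIBLE)`, and the size from `glfwGetFramebufferSize`. Occlusion and workspace changes may not be reported on every platform, so this only covers some of the cases listed above.

```cpp
#include <cassert>

// Hypothetical snapshot of the window state, refreshed by the application
// once per loop iteration from its platform backend.
struct WindowState
{
    bool iconified;         // minimized / reduced to tray
    bool visible;           // window is shown at all
    int  framebufferWidth;  // in pixels
    int  framebufferHeight; // in pixels
};

// Returns false when rendering a frame would produce nothing the user can see.
inline bool ShouldRenderFrame(const WindowState& s)
{
    assert(s.framebufferWidth >= 0 && s.framebufferHeight >= 0);
    if (s.iconified || !s.visible)
        return false;
    return s.framebufferWidth > 0 && s.framebufferHeight > 0;
}
```

When this returns false, the loop would block in an event wait instead of calling NewFrame/Render.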

jlmxyz avatar Sep 08 '21 21:09 jlmxyz

I tried to cherry-pick this PR again onto the latest docking branch. It was successful, but I had to manually re-apply the changes in imgui_impl_glfw (since this backend had been reviewed extensively).

You can see the results in the commits here (this is a branch that cherry picks and adapts your changes)

Summary:

  • f558d3707b631f9d57fc77a78d8ac41e7e15500b is a cherry pick of your changes (but without the glfw changes)
  • ea416c85c54f076604726cc048ba4af9264f08ac reapplies the changes for the glfw backend
  • Also, I had to add SDL_Window *, GLFWwindow *, HWND params to the various ImGui_ImplXXXX_WaitForEvent functions (see 338fdf36234bf456935db5b451d6754825cbbab5 and ce41ede7232ad2de740dcb3214d73796a9c46ce5)

Additional notes:

  • I also added a modification in order to handle buttons with the repeat flag (buttons that trigger continuously as long as the mouse is down). See commit 986982ac89f11e077b00f81e10421266a78e6f88, which is probably still a draft.

Benchmarks:

On my Mac (MacBook Pro Intel 2019), I get the following results with the SDL+OpenGL3 or GLFW+OpenGL3 backends:

| Mode                    | CPU Usage | GPU Usage |
|-------------------------|-----------|-----------|
| powersave / idle        | 0%        | 0%        |
| powersave / interacting | 7.8%      | 8%        |
| standard / idle         | 5.1%      | 24% (!)   |
| standard / interacting  | 9.5%      | 19%       |

As you can see, without the powersave changes the GPU jumps to 24% (and this causes my Mac to become very hot, very quickly).

pthom avatar Mar 25 '22 22:03 pthom

Let's say a should_wait variable, but a better way would be to still render a few frames every second. Let's say 2.
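
As a rough sketch of that suggestion, the idle wait can be bounded so that at least a couple of frames are still rendered every second. `IdleWaitTimeout` and its parameters are hypothetical names; the result is meant to be passed to a blocking wait that accepts a timeout in seconds, such as GLFW's `glfwWaitEventsTimeout`.

```cpp
#include <cassert>

// Hypothetical helper: how long (in seconds) the event loop may block.
// shouldWait - true when nothing requires an immediate redraw
// minIdleFps - minimum number of frames to render per second while idle
inline double IdleWaitTimeout(bool shouldWait, double minIdleFps)
{
    assert(minIdleFps > 0.0);
    if (!shouldWait)
        return 0.0;          // something is pending: do not block at all
    return 1.0 / minIdleFps; // e.g. 2 fps while idle means waking up every 0.5 s
}
```

With `minIdleFps = 2`, an idle application wakes up every half second even without input, which keeps slowly changing content reasonably fresh at a small power cost.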

For an alternative approach, you could also look at https://github.com/ocornut/imgui/wiki/Implementing-Power-Save,-aka-Idling-outside-of-ImGui

pthom avatar Oct 19 '23 22:10 pthom

I've implemented something to reduce GPU usage for my desktop application, which may have GL views embedded via a render list callback (AddCallback) and some other text controls which may want to be updated at high frequency. My test was run on a 7840HS laptop with a 165Hz, 2560x1600 screen, in three stages of adaptation. I've chosen to use power usage measured with turbostat (pkgWatt) instead of GPU usage directly, since GPU usage via radeontop was more complex to analyse.

Before - Free-running at screen refresh rate of 165Hz:

  • Idling: 12W - GPU usage: 70% clip, 50% fragment
  • Fast Update Mode: 12W - GPU usage: 70% clip, 50% fragment

Before - Free-running at screen refresh rate of 60Hz:

  • Idling: 6.5W - GPU usage: 30% clip, 25% fragment
  • Fast Update Mode: 6.5W - GPU usage: 30% clip, 25% fragment

Fast Update Mode is a game-like mode where at least some parts of my app need 60Hz updates.

For both of my adaptations, I chose to lock the maximum update rate to 60Hz in the app as well as with the screen.
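
Capping the update rate inside the app comes down to sleeping for whatever is left of the frame budget. A minimal sketch, assuming a hypothetical helper `RemainingFrameTime` (not taken from the attached code):

```cpp
#include <cassert>
#include <chrono>

using Microseconds = std::chrono::microseconds;

// Returns how long to sleep so that one loop iteration lasts at least
// 1/maxFps seconds. 'elapsed' is the time already spent in this iteration.
inline Microseconds RemainingFrameTime(Microseconds elapsed, int maxFps)
{
    assert(maxFps > 0);
    const Microseconds target(1000000 / maxFps); // 16666 us at 60 Hz
    return elapsed < target ? target - elapsed : Microseconds(0);
}
```

The result can be passed directly to `std::this_thread::sleep_for`. If buffer swapping with vsync already blocks for part of the interval, the remaining time simply comes out shorter or zero.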

With simple event updates and additional full-screen rendering in Fast Update Mode:

  • Idling: 4.8W - GPU Usage 0%
  • Light Usage: ~7.0W - GPU Usage: 40% clip, 30% fragment
  • Fast Update Mode: 6.8W - GPU Usage: 27% clip, 22% fragment

Note that here, in fast update mode, since it's only rendering at a higher update rate, some UI labels don't update until you trigger an input event.

With simple event updates and partial rendering and on-demand UI rendering:

  • Idling: 4.8W - GPU Usage 0%
  • Light Usage: ~7.0W - GPU Usage: 40% clip, 30% fragment
  • Fast Update Mode:
      • 5.3W (Rendering just label, 0.1% of the screen) - GPU Usage: 12% clip, 3-4% fragment
      • 5.8W (Rendering label + GL views, 38% of the screen) - GPU Usage: 15% clip, 8% fragment

The last adaptation uses a modified rendering backend implementation that allows me to draw only parts of the screen. Here I use callbacks to render what I need with UI still properly integrated - e.g. my GL views, or even a UI label that I render on-demand in the callback.
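
One plausible way to restrict a backend to part of the screen (not necessarily what the attached patch does) is to intersect each draw command's clip rectangle with the on-demand region and skip the command when the intersection is empty. `Rect` and `IntersectRect` below are hypothetical helpers; in Dear ImGui the per-command rectangle is stored in `ImDrawCmd::ClipRect`.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical axis-aligned rectangle in framebuffer coordinates.
struct Rect
{
    float minX, minY, maxX, maxY;
};

// Writes the overlap of a and b into 'out'.
// Returns false when the rectangles do not overlap, in which case the
// draw command can be skipped entirely for this partial render.
inline bool IntersectRect(const Rect& a, const Rect& b, Rect& out)
{
    assert(a.minX <= a.maxX && a.minY <= a.maxY);
    assert(b.minX <= b.maxX && b.minY <= b.maxY);
    out.minX = std::max(a.minX, b.minX);
    out.minY = std::max(a.minY, b.minY);
    out.maxX = std::min(a.maxX, b.maxX);
    out.maxY = std::min(a.maxY, b.maxY);
    return out.minX < out.maxX && out.minY < out.maxY;
}
```

The resulting rectangle would then be used for the scissor test instead of the command's original clip rectangle, so pixels outside the on-demand region are never touched.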

Just for fun, the same last adaptation but at 165Hz screen, with fast update mode rendering at ~120Hz:

  • Idling: 4.8W - GPU Usage 0%
  • Light Usage: ~12W - GPU Usage: 70% clip, 50% fragment
  • Fast Update Mode @ 120Hz:
      • 7.5W (Rendering just label, 0.1% of the screen) - GPU Usage: 30% clip, 8% fragment
      • 8.0W (Rendering label + GL views, 38% of the screen) - GPU Usage: 35% clip, 17% fragment

Note that the GL views in these tests were trivially simple, so only the fillrate should have affected the results. Reading the GPU usage values was often quite difficult, so assume an error of ±10 (in absolute % usage) for each reading.

In conclusion:

  1. Event-based rendering does a good job of keeping idle power draw at a minimum. But when interacting with the system, it still always renders the whole screen, so even light mouse movement makes it spike to the maximum power draw (which depends on the screen refresh rate).
  2. This also does not address the use case where the application needs fast visual updates. There, you can avoid an ImGui NewFrame, but that by itself doesn't do much.
  3. Instead, after a quick modification of the rendering backend, you can update only the part of the screen containing e.g. your GL view (and still render the UI on top properly), which yields a good amount of power saving.
  4. If the parts of the screen that need updates include text and labels, which would typically require an ImGui update, my custom on-demand text rendering can help too, and it is very fast at updating just that.

I have not yet tested whether doing an ImGui NewFrame + partial render of the areas that changed is a good compromise, but it may be, since NewFrame (i.e., purely CPU work) did not affect the power draw much for me; the GPU dominated the power draw. I also did not optimise the uploading of vertices to the GPU, as that would be a bit more involved. I assume the Fast Update Mode values would look a bit better with that optimisation, since I am calling the rendering function twice (once for all 3 GL views, once for the label), and each call uploads all the data, even if it is clipped on the CPU (unless the AMD driver skips the upload when it's unused).

While the power draw savings seen here don't seem like much, they are the difference between the laptop staying quiet and the fans being on full blast for me.

Here's my final render loop:

ui.requireUpdates = 3;
while (!glfwWindowShouldClose(ui.window))
{
	auto now = sclock::now();
	ui.deltaTime = dt(ui.renderTime, now)/1000.0f;
	ui.renderTime = now;
	if (ui.requireUpdates)
	{
		ui.requireUpdates--;
		ui.UpdateUI();
		ui.RenderUI(true);
	}
	else if (ui.requireRender)
	{ // Render parts of the screen with OnDemand rendered items
		ui.RenderUI(false);
	}
	ui.requireRender = false;

	long targetIntervalUS = 1000000/60; // RenderUI (glfwSwapBuffers) might force the frequency down to the display refresh rate
	long curIntervalUS = dtUS(ui.renderTime, sclock::now());
	if (curIntervalUS < targetIntervalUS)
	{
		std::this_thread::sleep_for(std::chrono::microseconds(targetIntervalUS-curIntervalUS));
	}

	glfwPollEvents();
	while (!ui.requireRender && ui.requireUpdates == 0
		&& context->InputEventsQueue.empty())
	{
		glfwWaitEventsTimeout(0.5f/1000.0f);
	}
	if (!context->InputEventsQueue.empty())
		ui.requireUpdates = std::max(ui.requireUpdates, 3);
}

with UpdateUI just being ImGui's stack from NewFrame to Render, and RenderUI:

void RenderUI(bool fullUpdate)
{
	if (fullUpdate)
	{
		// Render 2D UI with callbacks at appropriate places for 3D GL
		ImGui_ImplOpenGL3_RenderDrawData(ImGui::GetDrawData());
	}
	else
	{
		// Render areas of the screen with on-demand items - rest will be discarded
		for (auto &onDemandState : onDemandStack)
			ImGui_ImplOpenGL3_RenderDrawData(ImGui::GetDrawData(), onDemandState.clipMin, onDemandState.clipMax);
	}

	glfwMakeContextCurrent(window);
	glfwSwapBuffers(window);
}
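
The decision logic of the render loop above can be isolated into a small state machine, which makes it easier to reason about. This is a sketch that mirrors the `requireUpdates` / `requireRender` handling; `RedrawState`, `FrameAction`, `NotifyInput` and `NextAction` are hypothetical names, and the default of 3 follow-up frames is simply the value used in the loop.

```cpp
#include <algorithm>
#include <cassert>

enum class FrameAction { Sleep, PartialRender, FullUpdate };

struct RedrawState
{
    int  requireUpdates = 3;     // full UI frames still owed (as at startup)
    bool requireRender  = false; // an on-demand region asked to be redrawn
};

// Called when input events are queued: schedule a few full UI frames so
// that the UI has time to settle after the input.
inline void NotifyInput(RedrawState& s, int framesAfterInput = 3)
{
    assert(framesAfterInput > 0);
    s.requireUpdates = std::max(s.requireUpdates, framesAfterInput);
}

// Decides what this loop iteration should do and consumes the request.
inline FrameAction NextAction(RedrawState& s)
{
    FrameAction action = FrameAction::Sleep;
    if (s.requireUpdates > 0)
    {
        --s.requireUpdates;
        action = FrameAction::FullUpdate;    // NewFrame..Render + full draw
    }
    else if (s.requireRender)
    {
        action = FrameAction::PartialRender; // redraw on-demand regions only
    }
    s.requireRender = false;
    return action;
}
```

A full update takes priority over a partial render, and a pending partial request is dropped once a full frame has been drawn, since the full frame already covers it.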

The label I mentioned is drawn on-demand like this (which internally uses AddCallback):

AddOnDemandText("Frame 00000", [](const ImDrawList* dl, const ImDrawCmd* dc)
{
	RenderOnDemandText(*(OnDemandState*)dc->UserCallbackData, "Frame %d", GetApp().frameNum);
});
// Instead of:
//ImGui::Text("Frame %d", GetApp().frameNum);

And the GL views are setup something like this:

ImDrawList* draw_list = ImGui::GetWindowDrawList();
draw_list->AddCallback([](const ImDrawList* dl, const ImDrawCmd* dc)
{
	glViewport(...);
	glScissor(...);
	glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

	// Render GL Scene
}, nullptr);
draw_list->AddCallback(ImDrawCallback_ResetRenderState, nullptr);
MarkOnDemandArea(ImGui::GetCurrentWindowRead()->InnerRect);

Finally, the patch for imgui_impl_opengl3.cpp as well as the custom on-demand rendering code is attached: imgui_onDemand.hpp.txt, imgui_onDemand.cpp.txt, imgui_impl_opengl3.cpp.diff.txt

Seneral avatar Apr 03 '24 14:04 Seneral