Backends: DX9: add programmable rendering pipeline implementation

Open Demonese opened this issue 4 years ago • 3 comments

Before

The old pull request #3844 became too messy, so I created a new PR for discussion.

I expect this pull request will not be merged for a long time:

  • The master branch is under active development, and I may not be able to merge its changes into this branch in time
  • Direct3D 9 is a 20-year-old API that is very outdated and rarely used now
  • The performance improvement is not obvious (see below)

Thanks to @ocornut for bringing us such an excellent GUI library 👍.

Why?

So why did I put so much effort into writing this implementation?

  • The current implementation needs to convert the vertices. Yes, we can #define IMGUI_OVERRIDE_DRAWVERT_STRUCT_LAYOUT struct { ImVec2 pos; float z; ImVec2 uv; ImU32 col; };, but what about z? Can we ignore it, whatever value it holds? We would also need to set up our own MemAlloc and MemFree, which is not very elegant.
  • My legacy project uses the d3d9 programmable rendering pipeline (it still runs on older devices).
  • Many online tutorials and blogs only cover the fixed rendering pipeline of d3d9, or still use the abandoned D3DX Effect library (from the DirectX SDK). I am sharing this implementation as an example to help others get rid of the legacy D3DX Effect library.

Features

  • Faster vertex copy without conversion (if IMGUI_USE_BGRA_PACKED_COLOR is defined), and the smaller vertex size results in faster uploads
  • It can automatically enable the programmable rendering pipeline without extra work from the user. (device compatible)
  • I didn't modify the imgui_impl_dx9.h file. (API compatible)
  • It can be compiled under C++98. (same as the current implementation, no build issues)
  • It does not require other libraries (same as the current implementation, it only links to d3d9.lib, so no MSVC link issues)
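The first feature can be illustrated with a small sketch, assuming the vertex shader's input layout matches the draw-vertex layout exactly and that colours are already packed in the order Direct3D 9 expects. In that case the whole vertex array can be copied in one call. PackedVert, PaddedVert and SketchUploadDirect are hypothetical names, and the destination pointer merely stands in for a locked vertex buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical layout matching the draw vertices as they are produced (no z).
struct PackedVert {
    float pos[2];
    float uv[2];
    std::uint32_t col;
};

// Hypothetical fixed-function layout with the extra z component, shown only for the size comparison.
struct PaddedVert {
    float pos[3];
    std::uint32_t col;
    float uv[2];
};

static_assert(sizeof(PackedVert) == 20, "expected a 20-byte vertex without z");
static_assert(sizeof(PaddedVert) == 24, "expected a 24-byte vertex with z");

// Copy all vertices in one memcpy. Returns the number of bytes written,
// or 0 if the destination is too small.
inline std::size_t SketchUploadDirect(const std::vector<PackedVert>& src,
                                      void* mapped, std::size_t capacity_bytes) {
    const std::size_t bytes = src.size() * sizeof(PackedVert);
    if (bytes > capacity_bytes) {
        return 0;
    }
    if (bytes > 0) {
        std::memcpy(mapped, src.data(), bytes);
    }
    return bytes;
}
```

Under these assumptions each vertex is 4 bytes smaller than the padded layout, and no per-vertex loop is needed.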

Test

Test code:

  • imconfig.h
#define IMGUI_USE_BGRA_PACKED_COLOR
  • imgui_impl_dx9.cpp
#include <stdio.h>
#include <Windows.h>
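LARGE_INTEGER freq, t1, t2; // timer frequency and start/end timestamps
int timer = 0;              // number of calls measured so far
double times = 0.0;         // accumulated time in microseconds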

// Render function.
void ImGui_ImplDX9_RenderDrawData(ImDrawData* draw_data)
{
    ::QueryPerformanceCounter(&t1);

    ...

    ::QueryPerformanceCounter(&t2);
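    // Accumulate the elapsed time of this call in microseconds.
    times += (double)(t2.QuadPart - t1.QuadPart) * 1000000.0 / (double)freq.QuadPart;
    timer += 1;
    // Every 60 calls, print the average and reset the accumulator.
    if ((timer % 60) == 0)
    {
        static char buffer[32];
        snprintf(buffer, 32, "%.3f\n", times / 60.0);
        OutputDebugStringA(buffer);
        times = 0.0;
    }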
}

bool ImGui_ImplDX9_Init(IDirect3DDevice9* device)
{
    ::QueryPerformanceFrequency(&freq);
    
    ...
}
  • main.cpp
...

ImDrawList* renderer = ImGui::GetBackgroundDrawList();
for (float x = 0.0f; x < 2048.0f; x += 8.0f)
{
    for (float y = 0.0f; y < 2048.0f; y += 8.0f)
    {
        renderer->AddRectFilled(ImVec2(x, y), ImVec2(x + 4.0f, y + 4.0f), IM_COL32(255, 255, 255, 128), 2.0f, 64);
    }
}

...

Built with the Release configuration, the results are:

  • Fixed rendering pipeline 31
  • Programmable rendering pipeline 32

Note that submitting the contents is not the same as rendering them. Present is the best way to guarantee that everything has been fully processed.
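For readers who want to repeat the measurement outside Windows, the windowed average used in the test code can be sketched with std::chrono instead of QueryPerformanceCounter. SketchFrameTimer is a hypothetical helper, not part of the PR, and like the original test it only measures CPU time spent inside the bracketed call, not when the GPU finishes the work.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: accumulates elapsed microseconds and reports the mean
// once every `window` samples, then starts over.
class SketchFrameTimer {
public:
    explicit SketchFrameTimer(int window) : window_(window), count_(0), total_us_(0.0) {}

    // Add one sample. Returns true and writes the mean when a window completes.
    bool AddSample(double microseconds, double* mean_out) {
        total_us_ += microseconds;
        if (++count_ < window_) {
            return false;
        }
        *mean_out = total_us_ / window_;
        count_ = 0;
        total_us_ = 0.0;
        return true;
    }

    // Time a single call of `fn` with a monotonic clock and record it as a sample.
    template <typename Fn>
    bool Measure(Fn&& fn, double* mean_out) {
        const std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
        fn();
        const std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();
        const double us = std::chrono::duration<double, std::micro>(t2 - t1).count();
        return AddSample(us, mean_out);
    }

private:
    int window_;
    int count_;
    double total_us_;
};
```

A caller would construct it with a window of 60 and wrap the render call in Measure, printing the mean whenever it returns true.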

For those who need this...

You are free to merge this PR into your own fork. This implementation is already working on our old devices, although I have not tested it extensively, so I cannot guarantee it is bug-free. If you have any questions, feel free to leave a comment.

Demonese, Mar 04 '21 06:03

Thanks for the cleaned up PR and explanation!

I am very surprised the fps is so low. Are you running on a very old computer? I would expect to get 1000+ fps on most machines. Printing to the system console is slow; you can remove it before checking the framerate.

ocornut, Mar 04 '21 07:03

> I am very surprised the fps is so low. Are you running on a very old computer? I would expect to get 1000+ fps on most machines. Printing to the system console is slow; you can remove it before checking the framerate.

The low fps is caused by drawing 65536 white rects on an Intel integrated graphics card (awesome). Our project uses ImDrawList even more heavily, so I have to pay attention to the performance penalty of vertex conversions.

This is the normal FPS:

image

Demonese, Mar 04 '21 07:03

I reorganized the code. The newly inserted code is now easier to recognize.

However, this also introduced some duplicated code, and I am still thinking about how to resolve that.

Demonese, Nov 14 '21 12:11