implot
Performance
Rendering big dataset of candlestick data drops FPS quite a lot.
A 50k-entry dataset brings my FPS down to 30, and doubling the size lowers it even more. I'm running GLFW + OpenGL 3.3 on Windows 10 with a 2080 Ti. MSAA and software anti-aliasing are disabled.
Candlestick rendering is taken from the implot_demos repository. The update function does nothing other than rendering. How can I improve my performance? Is there anything like instancing? Or might switching to a different ImGui backend improve my performance?
I assume you are trying to render all the candles, even those that are not within the viewport.
My first guess would be to render, in chunks, only the candles that are within the viewable area.
I will experiment with this in a few hours and will post my code if I can get it to render hundreds of thousands of candles :)
Well, I was actually hoping to have good performance when zoomed out on large datasets as well :)
But I'd like to see a snippet showing how to do proper culling on them as well.
OK, I am not sure how to limit the zoom. It seems I can set a starting zoom, but I can't limit it.
What I would do is:
- Limit the X axis to a fixed number of candles shown at once.
- If the user wishes to view a bigger interval, they need to switch the candle size, say from a minute to an hour/month/etc.
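The second point (switching to a coarser candle size) can be sketched as a simple aggregation step. This is only an illustration: the `Candle` struct and `AggregateCandles` are hypothetical names, the thread's own code uses parallel arrays, and the input is assumed to be sorted by time.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical candle record (not an ImPlot type).
struct Candle {
    double x;      // timestamp of the candle start
    double open;
    double close;
    double low;
    double high;
};

// Merge every `group` consecutive candles into one coarser candle.
// Assumes `src` is sorted by time. Returns an empty vector if `group` is 0.
std::vector<Candle> AggregateCandles(const std::vector<Candle>& src, std::size_t group)
{
    std::vector<Candle> out;
    if (group == 0 || src.empty()) return out;
    out.reserve((src.size() + group - 1) / group);
    for (std::size_t i = 0; i < src.size(); i += group) {
        const std::size_t end = std::min(i + group, src.size());
        Candle c = src[i];               // x and open come from the first candle
        c.close = src[end - 1].close;    // close comes from the last candle
        for (std::size_t j = i + 1; j < end; ++j) {
            c.low  = std::min(c.low,  src[j].low);
            c.high = std::max(c.high, src[j].high);
        }
        out.push_back(c);
    }
    return out;
}
```

For example, `AggregateCandles(minute_candles, 60)` would turn one-minute candles into hourly ones, so a zoomed-out view draws 60 times fewer primitives. A trailing partial group is kept as a shorter final candle.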
However, I did boost my FPS by rendering only the candles that are in the viewport. It seems the FitPoint calls also needed to be limited the same way, because they were dropping my FPS by a ton.
I was working with a dataset of 1 million candles, which wasn't impacting my performance by much.
I am not 100% confident I used the right variables, since I am fairly new to this library, but it looks like it gets the job done. Maybe @epezent can confirm that.
Here is the function I ended up with:
void PlotCandlestick(const char* label_id, const double* xs, const double* opens, const double* closes,
                     const double* lows, const double* highs, int count, bool tooltip,
                     float width_percent, ImVec4 bullCol, ImVec4 bearCol)
{
    // get ImGui window DrawList
    ImDrawList* draw_list = ImPlot::GetPlotDrawList();
    // calc real value width
    double half_width = count > 1 ? (xs[1] - xs[0]) * width_percent : width_percent;
    // custom tool
    if (ImPlot::IsPlotHovered() && tooltip)
    {
        ImPlotPoint mouse = ImPlot::GetPlotMousePos();
        mouse.x = ImPlot::RoundTime(ImPlotTime::FromDouble(mouse.x), ImPlotTimeUnit_Day).ToDouble();
        float tool_l = ImPlot::PlotToPixels(mouse.x - half_width * 1.5, mouse.y).x;
        float tool_r = ImPlot::PlotToPixels(mouse.x + half_width * 1.5, mouse.y).x;
        float tool_t = ImPlot::GetPlotPos().y;
        float tool_b = tool_t + ImPlot::GetPlotSize().y;
        ImPlot::PushPlotClipRect();
        draw_list->AddRectFilled(ImVec2(tool_l, tool_t), ImVec2(tool_r, tool_b), IM_COL32(128, 128, 128, 64));
        ImPlot::PopPlotClipRect();
        // find mouse location index
        int idx = BinarySearch(xs, 0, count - 1, mouse.x);
        // render tool tip (won't be affected by plot clip rect)
        if (idx != -1)
        {
            ImGui::BeginTooltip();
            char buff[32];
            ImPlot::FormatDate(ImPlotTime::FromDouble(xs[idx]), buff, 32, ImPlotDateFmt_DayMoYr, ImPlot::GetStyle().UseISO8601);
            ImGui::Text("Day: %s", buff);
            ImGui::Text("Open: $%.2f", opens[idx]);
            ImGui::Text("Close: $%.2f", closes[idx]);
            ImGui::Text("Low: $%.2f", lows[idx]);
            ImGui::Text("High: $%.2f", highs[idx]);
            ImGui::EndTooltip();
        }
    }
    // begin plot item
    if (ImPlot::BeginItem(label_id))
    {
        // override legend icon color
        ImPlot::GetCurrentItem()->Color = IM_COL32(64, 64, 64, 255);
        // X range currently visible in the plot, in plot units
        ImPlotContext& gp = *GImPlot;
        ImPlotPoint plot_start = ImPlot::PixelsToPlot(gp.CurrentPlot->AxesRect.Min.x, 0);
        ImPlotPoint plot_end = ImPlot::PixelsToPlot(gp.CurrentPlot->AxesRect.Max.x, 0);
        // fit data if requested (visible candles only)
        if (ImPlot::FitThisFrame())
        {
            for (int i = 0; i < count; ++i)
            {
                if (xs[i] >= plot_start.x && xs[i] <= plot_end.x)
                {
                    ImPlot::FitPoint(ImPlotPoint(xs[i], lows[i]));
                    ImPlot::FitPoint(ImPlotPoint(xs[i], highs[i]));
                }
            }
        }
        // render data (visible candles only)
        for (int i = 0; i < count; ++i)
        {
            if (xs[i] >= plot_start.x && xs[i] <= plot_end.x)
            {
                ImVec2 open_pos = ImPlot::PlotToPixels(xs[i] - half_width, opens[i]);
                ImVec2 close_pos = ImPlot::PlotToPixels(xs[i] + half_width, closes[i]);
                ImVec2 low_pos = ImPlot::PlotToPixels(xs[i], lows[i]);
                ImVec2 high_pos = ImPlot::PlotToPixels(xs[i], highs[i]);
                ImU32 color = ImGui::GetColorU32(opens[i] > closes[i] ? bearCol : bullCol);
                draw_list->AddLine(low_pos, high_pos, color);
                draw_list->AddRectFilled(open_pos, close_pos, color);
            }
        }
        // end plot item
        ImPlot::EndItem();
    }
}
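The function above still walks all `count` entries every frame just to test them against the viewport. If `xs` is sorted in ascending order (an assumption, though typical for time series), the visible index range can be found with a binary search instead. A minimal sketch, where `VisibleRange` is a hypothetical helper and not part of ImPlot:

```cpp
#include <algorithm>
#include <utility>

// Returns the half-open index range [first, last) of candles whose centre lies
// within [x_min - half_width, x_max + half_width].
// Assumes xs is sorted in ascending order.
// The half_width margin keeps candles that are only partly cut by the plot edge.
std::pair<int, int> VisibleRange(const double* xs, int count,
                                 double x_min, double x_max, double half_width)
{
    if (count <= 0) return {0, 0};
    // first element that is >= the left bound
    const double* first = std::lower_bound(xs, xs + count, x_min - half_width);
    // first element that is > the right bound
    const double* last = std::upper_bound(first, xs + count, x_max + half_width);
    return {static_cast<int>(first - xs), static_cast<int>(last - xs)};
}
```

With this, both loops could become `for (int i = range.first; i < range.second; ++i)` with `range = VisibleRange(xs, count, plot_start.x, plot_end.x, half_width)`, which makes the per-frame cost depend on the number of visible candles rather than on the dataset size. The visible X range can probably also be read through the public `ImPlot::GetPlotLimits()` instead of the internal context, but check that against the ImPlot version in use.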
NOTE: I rendered 7579 candles at full FPS before ImGui crashed with the assert: Too many vertices in ImDrawList using 16-bit indices. Read comment above
Maybe ImPlot should find a way to split draw lists when more vertices are queued than ImGui can render.
Too many vertices in ImDrawList using 16-bit indices. Read comment above
I think this can be dealt with. See ImGui's imconfig.h:
//---- Use 32-bit vertex indices (default is 16-bit) is one way to allow large meshes with more than 64K vertices.
// Your renderer backend will need to support it (most example renderer backends support both 16/32-bit indices).
// Another way to allow large meshes while keeping 16-bit indices is to handle ImDrawCmd::VtxOffset in your renderer.
// Read about ImGuiBackendFlags_RendererHasVtxOffset for details.
//#define ImDrawIdx unsigned int
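Going by that comment, the simplest route is to uncomment the define so that ImGui is built with 32-bit indices. A config fragment only, assuming the renderer backend in use accepts 32-bit indices (the comment says most example backends do) and that ImGui and every file including it are rebuilt with the same setting:

```cpp
// In imconfig.h: switch ImGui's index type from the 16-bit default to 32 bits,
// so a single ImDrawList may hold more than 64K vertices.
#define ImDrawIdx unsigned int
```

The alternative named in the comment keeps 16-bit indices and relies on the backend handling `ImDrawCmd::VtxOffset`, which it advertises through `ImGuiBackendFlags_RendererHasVtxOffset`. Whether a given backend sets that flag needs to be checked in its source.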
Sorry for hijacking this thread but I'm looking at getting more performance out of my app, too.
I will have multiple data streams coming in over TCP/IP at an update rate of ~14 Hz max. The individual data traces would contain 10k to 500k points. I'm drawing with basic PlotLine at the moment. I'm running an ImGui example with calls to ImPlot for the test; the default FPS is 60. On my machine, two traces of 100k points each still hold a good FPS of around 60, but one of the CPU cores is at ~100%. I would like to keep the FPS high.
Given that my data is likely to stay the same over ~4 iterations of the render loop (60/14), I was thinking of caching the computation results from RenderLineStrip() (at a glance, this is where the draw list is constructed) until new data arrives from the network. The idea comes from looking at perf output, which points to PlotLine and glibc memmove as the biggest consumers of CPU cycles.
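A minimal sketch of the caching idea, applied on the application side rather than inside ImPlot. ImGui rebuilds its draw lists every frame, so the draw list itself is not cached here; what gets reused is an expensive preprocessing result (for example a reduced copy of the trace) that only changes when new data arrives. `DerivedCache` and the generation counter are hypothetical, not part of ImGui or ImPlot.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper: caches a derived array and rebuilds it only when the
// generation counter of the source data changes.
class DerivedCache {
public:
    using Builder = std::function<std::vector<double>()>;

    // `generation` is assumed to be incremented by the receiving code each
    // time a new data block arrives from the network.
    const std::vector<double>& Get(std::uint64_t generation, const Builder& build)
    {
        if (!valid_ || generation != generation_) {
            data_ = build();           // expensive step, runs once per new data block
            generation_ = generation;
            valid_ = true;
            ++rebuilds_;
        }
        return data_;                  // cheap path for frames without new data
    }

    std::size_t rebuilds() const { return rebuilds_; }

private:
    std::vector<double> data_;
    std::uint64_t generation_ = 0;
    bool valid_ = false;
    std::size_t rebuilds_ = 0;
};
```

In the render loop, the cached array would then be what is passed to PlotLine each frame. At 60 FPS with 14 Hz updates, the builder would run on roughly one frame in four. If the generation counter is written from a network thread, it needs proper synchronization, which this sketch leaves out.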
Maybe using the other 7 cores of the CPU for the PlotLine computation would be an avenue to explore to keep the FPS high.
Also, rendering only the subset of points that are actually on screen is sometimes an option for me, depending on what the user is looking at.
Any comments on the above are welcome!
@hinxx When there are thousands of points to render, it makes sense to downsample before plotting, for example with this algorithm: https://github.com/sveinn-steinarsson/flot-downsample (demo here: https://www.base.is/flot/, and a C++ port: https://gist.github.com/gorbatschow/ce36c15d9265b61d12a1be1783bf0abf)
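For reference, the linked flot-downsample project implements the Largest-Triangle-Three-Buckets (LTTB) algorithm. The following is an independent minimal sketch of that technique, not a copy of the linked gist; the `Point` struct and the function name are made up for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

// Largest-Triangle-Three-Buckets downsampling.
// Returns `threshold` points, or a copy of the input when threshold < 3
// or the input already has no more than `threshold` points.
std::vector<Point> DownsampleLTTB(const std::vector<Point>& data, std::size_t threshold)
{
    const std::size_t n = data.size();
    if (threshold >= n || threshold < 3) return data;

    std::vector<Point> out;
    out.reserve(threshold);

    // The n-2 interior points are spread over threshold-2 buckets.
    const double every = static_cast<double>(n - 2) / static_cast<double>(threshold - 2);

    std::size_t a = 0;           // index of the previously selected point
    out.push_back(data[a]);      // always keep the first point

    for (std::size_t i = 0; i + 2 < threshold; ++i) {
        // Average of the next bucket, used as the third triangle corner.
        const std::size_t avg_start = static_cast<std::size_t>(std::floor((i + 1) * every)) + 1;
        std::size_t avg_end = static_cast<std::size_t>(std::floor((i + 2) * every)) + 1;
        if (avg_end > n) avg_end = n;
        double avg_x = 0.0, avg_y = 0.0;
        for (std::size_t j = avg_start; j < avg_end; ++j) {
            avg_x += data[j].x;
            avg_y += data[j].y;
        }
        const double avg_len = static_cast<double>(avg_end - avg_start);
        avg_x /= avg_len;
        avg_y /= avg_len;

        // Current bucket: keep the point that forms the largest triangle with
        // the previously selected point and the next bucket's average.
        const std::size_t range_start = static_cast<std::size_t>(std::floor(i * every)) + 1;
        const std::size_t range_end = static_cast<std::size_t>(std::floor((i + 1) * every)) + 1;
        double max_area = -1.0;
        std::size_t next_a = range_start;
        for (std::size_t j = range_start; j < range_end; ++j) {
            const double area = std::fabs(
                (data[a].x - avg_x) * (data[j].y - data[a].y) -
                (data[a].x - data[j].x) * (avg_y - data[a].y)) * 0.5;
            if (area > max_area) { max_area = area; next_a = j; }
        }
        out.push_back(data[next_a]);
        a = next_a;
    }

    out.push_back(data[n - 1]);  // always keep the last point
    return out;
}
```

The output keeps the first and last samples and one sample per bucket, so a 100k-point trace reduced with `threshold = 1000` hands PlotLine only 1000 points while preserving visually prominent peaks. The downsampling pass itself is still linear in the input size, so it is best run once per new data block rather than once per frame.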
That looks like a very good approach for me, @gorbatschow! Will test and report ASAP.
With a small threshold (e.g. 1000), I can shave off about half of the CPU cycles compared to plotting the original 100 000 points. It is interesting that a threshold of 100 or 10 000 gives a negligible change in CPU usage compared to 1000.