Word Wrapping in Mixed-Language Markdown Rendering And Quick Fix (CalcWordWrapPosition)

Open xuboying opened this issue 2 months ago • 13 comments

Version/Branch of Dear ImGui:

version 1f7f1f54, tag: v1.92.2b-docking

Back-ends:

imgui_impl_glfw imgui_impl_opengl3

Compiler, OS:

Windows 11 + MSVC 2022

Full config/build information:

set(APP_NAME basic_imgui)

file(GLOB IMGUI_SRC
    ${THIRDPP_DIR}/imgui/*.cpp
    ${THIRDPP_DIR}/imgui/backends/imgui_impl_glfw.cpp
    ${THIRDPP_DIR}/imgui/backends/imgui_impl_opengl3.cpp

)

file(
    GLOB "${APP_NAME}_SOURCES"
    ${CMAKE_CURRENT_SOURCE_DIR}/main.cpp
)

list(APPEND ${APP_NAME}_SOURCES
    ${APP_DIR}/dpi.manifest # can be omitted
)

add_executable(
    ${APP_NAME}
    ${${APP_NAME}_SOURCES}
    ${IMGUI_SRC}
)

target_compile_definitions(
    ${APP_NAME} PRIVATE
    IMGUI_ENABLE_FREETYPE_PLUTOSVG
    IMGUI_USE_WCHAR32
)

target_include_directories(
    ${APP_NAME} PRIVATE
    ${THIRDPP_DIR}/imgui
    ${THIRDPP_DIR}/imgui/backends
)
target_link_libraries(${APP_NAME} PRIVATE
    plutosvg
    glfw
    glad
)

Details:

I'm building a Markdown text renderer and using the RenderTextWrapped function from imgui_markdown. However, I'm encountering unsatisfactory wrapping behavior when rendering paragraphs that mix Chinese and English, or Emoji and English.

Problem Description

In mixed-language paragraphs, the renderer wraps too early—typically at spaces between English words—even when the line could visually accommodate more characters. This results in suboptimal layout, especially for Chinese or Emoji-rich content.

Example (before fix):

Before Fix

After debugging, I traced the issue to ImFont::CalcWordWrapPosition, which is internally used by RenderTextWrapped. The function uses an inside_word heuristic to determine wrap points, but it treats all characters—including Asian characters and emoji—as part of a word. This causes premature wrapping when these characters are adjacent to English text.
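To make the effect concrete, here is a minimal sketch (not Dear ImGui code) that measures the longest run of characters containing no blank. It assumes the text is already decoded into code points and that blanks are the only break opportunities; the real function also special-cases a few ASCII punctuation marks. `LongestUnbreakableRun` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Longest run of code points that contains no blank, i.e. the longest "word"
// when blanks are the only break opportunities.
// Assumption: input is decoded code points; punctuation special cases are ignored.
std::size_t LongestUnbreakableRun(const std::u32string& text)
{
    std::size_t longest = 0;
    std::size_t current = 0;
    for (char32_t c : text)
    {
        if (c == U' ' || c == U'\t')
            current = 0; // a blank ends the current word
        else
            longest = std::max(longest, ++current);
    }
    return longest;
}
```

For an English sentence the result is the length of its longest word. For a CJK or emoji run glued to an English word, the whole run counts as one "word", which is why the wrapper falls back to the nearest earlier space.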

Proposed Fix

To address this, I made a minimal patch with the following goals:

  1. Preserve existing behavior for all current use cases.
  2. Introduce a new concept: char_is_asian_or_emoj. These characters are treated similarly to space characters for wrapping logic. Specifically, if the previous character is Asian or Emoji, prev_word_end is updated to allow wrapping after it.
  3. Improve punctuation handling: The original logic doesn't handle punctuation before words well. I applied similar logic to ensure punctuation doesn't interfere with wrap decisions.
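The second goal can be sketched outside of Dear ImGui as a toy greedy wrapper. Everything here is an assumption made for illustration: `WrapPosition`, `IsWide` and `CellWidth` are hypothetical names, widths are counted in fixed "cells" (1 for narrow, 2 for wide characters) instead of real glyph advances, and punctuation handling is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical classifier: CJK Unified Ideographs plus a range spanning the main emoji blocks.
static bool IsWide(char32_t c)
{
    return (c >= 0x4E00 && c <= 0x9FFF) || (c >= 0x1F300 && c <= 0x1FAFF);
}

// Hypothetical metric: wide characters take two cells, everything else one.
static float CellWidth(char32_t c) { return IsWide(c) ? 2.0f : 1.0f; }

// Returns the index at which the first line should end.
// With wide_is_breakable == false only spaces are break opportunities;
// with true, a break is also allowed before or after any wide character.
std::size_t WrapPosition(const std::u32string& text, float wrap_width, bool wide_is_breakable)
{
    float line_width = 0.0f;
    std::size_t last_break = 0; // 0 means "no break opportunity seen yet"
    for (std::size_t i = 0; i < text.size(); i++)
    {
        const char32_t c = text[i];
        const bool wide_boundary = wide_is_breakable && i > 0 && (IsWide(c) || IsWide(text[i - 1]));
        if (i > 0 && (c == U' ' || wide_boundary))
            last_break = i;
        line_width += CellWidth(c);
        if (line_width > wrap_width)
            return last_break > 0 ? last_break : std::max<std::size_t>(i, 1); // always keep one character
    }
    return text.size(); // everything fits
}
```

With the input "word " followed by six ideographs and a width of 10 cells, the space-only rule ends the line right after "word" (index 4), while the relaxed rule keeps two ideographs on the first line (index 7).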

Example (after fix):

After Fix

Limitations

Chinese punctuation is not yet handled and may still cause awkward breaks.

Code is minimally tested and may contain redundant logic or edge cases not yet covered.


 git diff -- .\imgui_draw.cpp
diff --git a/imgui_draw.cpp b/imgui_draw.cpp
index 63f14a47..f0c1e8a3 100644
--- a/imgui_draw.cpp
+++ b/imgui_draw.cpp
@@ -5319,7 +5319,7 @@ ImFontBaked* ImFontAtlasBakedGetOrAdd(ImFontAtlas* atlas, ImFont* font, float fo
     ImFontBaked* baked = *p_baked_in_map;
     if (baked != NULL)
     {
-        IM_ASSERT(baked->Size == font_size && baked->ContainerFont == font && baked->BakedId == baked_id);
+        // IM_ASSERT(baked->Size == font_size && baked->ContainerFont == font && baked->BakedId == baked_id);
         return baked;
     }
 
@@ -5352,10 +5352,11 @@ static inline const char* CalcWordWrapNextLineStartA(const char* text, const cha
         text++;
     return text;
 }
-
+#define PATCHWRAP  // for testing purpose only
 // Simple word-wrapping for English, not full-featured. Please submit failing cases!
 // This will return the next location to wrap from. If no wrapping if necessary, this will fast-forward to e.g. text_end.
 // FIXME: Much possible improvements (don't cut things like "word !", "word!!!" but cut within "word,,,,", more sensible support for punctuations, support for Unicode punctuations, etc.)
+#ifdef PATCHWRAP  // for testing purpose only
 const char* ImFont::CalcWordWrapPosition(float size, const char* text, const char* text_end, float wrap_width)
 {
     // For references, possible wrap point marked with ^
@@ -5381,18 +5382,28 @@ const char* ImFont::CalcWordWrapPosition(float size, const char* text, const cha
     const char* word_end = text;
     const char* prev_word_end = NULL;
     bool inside_word = true;
-
     const char* s = text;
     IM_ASSERT(text_end != NULL);
+    bool prev_char_is_asian_or_emoj = false;
+    bool prev_char_is_not_punctuation = false;
     while (s < text_end)
     {
+        bool current_char_is_asian_or_emoj = false;
         unsigned int c = (unsigned int)*s;
         const char* next_s;
         if (c < 0x80)
+        {
             next_s = s + 1;
+            current_char_is_asian_or_emoj = false;
+        }
         else
+        {
             next_s = s + ImTextCharFromUtf8(&c, s, text_end);
-
+            if (c > 0x4e00) {
+                current_char_is_asian_or_emoj = true;
+                inside_word = false;
+            }
+        }
         if (c < 32)
         {
             if (c == '\n')
@@ -5408,7 +5419,7 @@ const char* ImFont::CalcWordWrapPosition(float size, const char* text, const cha
                 continue;
             }
         }
-
+        bool const current_char_is_not_punctuation = (c != '.' && c != ',' && c != ';' && c != '!' && c != '?' && c != '\"' && c != 0x3001 && c != 0x3002);     
         // Optimized inline version of 'float char_width = GetCharAdvance((ImWchar)c);'
         float char_width = (c < (unsigned int)baked->IndexAdvanceX.Size) ? baked->IndexAdvanceX.Data[c] : -1.0f;
         if (char_width < 0.0f)
@@ -5416,7 +5427,7 @@ const char* ImFont::CalcWordWrapPosition(float size, const char* text, const cha

         if (ImCharIsBlankW(c))
         {
-            if (inside_word)
+            if (inside_word || prev_char_is_asian_or_emoj)
             {
                 line_width += blank_width;
                 blank_width = 0.0f;
@@ -5432,15 +5443,23 @@ const char* ImFont::CalcWordWrapPosition(float size, const char* text, const cha
             {
                 word_end = next_s;
             }
-            else
-            {
-                prev_word_end = word_end;
+            else {
+                if (prev_char_is_asian_or_emoj || ! prev_char_is_not_punctuation)
+                {
+                    prev_word_end = word_end = s;
+                    // word_end=s;
+                }
+                else
+                {
+                    prev_word_end = word_end;
+                }
                 line_width += word_width + blank_width;
                 word_width = blank_width = 0.0f;
             }

             // Allow wrapping after punctuation.
-            inside_word = (c != '.' && c != ',' && c != ';' && c != '!' && c != '?' && c != '\"' && c != 0x3001 && c != 0x3002);
+            inside_word = current_char_is_not_punctuation;
+
         }

         // We ignore blank width at the end of the line (they can be skipped)
@@ -5453,8 +5472,9 @@ const char* ImFont::CalcWordWrapPosition(float size, const char* text, const cha
         }

         s = next_s;
+        prev_char_is_asian_or_emoj = current_char_is_asian_or_emoj;
+        prev_char_is_not_punctuation = current_char_is_not_punctuation;
     }
-
     // Wrap_width is too small to fit anything. Force displaying 1 character to minimize the height discontinuity.
     // +1 may not be a character start point in UTF-8 but it's ok because caller loops use (text >= word_wrap_eol).
     if (s == text && text < text_end)
@@ -5462,6 +5482,11 @@ const char* ImFont::CalcWordWrapPosition(float size, const char* text, const cha
     return s;
 }

+
+
+
+#endif
+
 ImVec2 ImFont::CalcTextSizeA(float size, float max_width, float wrap_width, const char* text_begin, const char* text_end, const char** remaining)
 {
     if (!text_end)

Full patched function, for review:

#define PATCHWRAP  // for testing purpose only
// Simple word-wrapping for English, not full-featured. Please submit failing cases!
// This will return the next location to wrap from. If no wrapping if necessary, this will fast-forward to e.g. text_end.
// FIXME: Much possible improvements (don't cut things like "word !", "word!!!" but cut within "word,,,,", more sensible support for punctuations, support for Unicode punctuations, etc.)
#ifdef PATCHWRAP  // for testing purpose only
const char* ImFont::CalcWordWrapPosition(float size, const char* text, const char* text_end, float wrap_width)
{
    // For references, possible wrap point marked with ^
    //  "aaa bbb, ccc,ddd. eee   fff. ggg!"
    //      ^    ^    ^   ^   ^__    ^    ^

    // List of hardcoded separators: .,;!?'"

    // Skip extra blanks after a line returns (that includes not counting them in width computation)
    // e.g. "Hello    world" --> "Hello" "World"

    // Cut words that cannot possibly fit within one line.
    // e.g.: "The tropical fish" with ~5 characters worth of width --> "The tr" "opical" "fish"

    ImFontBaked* baked = GetFontBaked(size);
    const float scale = size / baked->Size;

    float line_width = 0.0f;
    float word_width = 0.0f;
    float blank_width = 0.0f;
    wrap_width /= scale; // We work with unscaled widths to avoid scaling every characters

    const char* word_end = text;
    const char* prev_word_end = NULL;
    bool inside_word = true;
    const char* s = text;
    IM_ASSERT(text_end != NULL);
    bool prev_char_is_asian_or_emoj = false;
    bool prev_char_is_not_punctuation = false;
    while (s < text_end)
    {
        bool current_char_is_asian_or_emoj = false;
        unsigned int c = (unsigned int)*s;
        const char* next_s;
        if (c < 0x80)
        {
            next_s = s + 1;
            current_char_is_asian_or_emoj = false;
        }
        else
        {
            next_s = s + ImTextCharFromUtf8(&c, s, text_end);
            if (c > 0x4e00) {
                current_char_is_asian_or_emoj = true;
                inside_word = false;
            }
        }
        if (c < 32)
        {
            if (c == '\n')
            {
                line_width = word_width = blank_width = 0.0f;
                inside_word = true;
                s = next_s;
                continue;
            }
            if (c == '\r')
            {
                s = next_s;
                continue;
            }
        }
        bool const current_char_is_not_punctuation = (c != '.' && c != ',' && c != ';' && c != '!' && c != '?' && c != '\"' && c != 0x3001 && c != 0x3002);
        // Optimized inline version of 'float char_width = GetCharAdvance((ImWchar)c);'
        float char_width = (c < (unsigned int)baked->IndexAdvanceX.Size) ? baked->IndexAdvanceX.Data[c] : -1.0f;
        if (char_width < 0.0f)
            char_width = BuildLoadGlyphGetAdvanceOrFallback(baked, c);

        if (ImCharIsBlankW(c))
        {
            if (inside_word || prev_char_is_asian_or_emoj)
            {
                line_width += blank_width;
                blank_width = 0.0f;
                word_end = s;
            }
            blank_width += char_width;
            inside_word = false;
        }
        else
        {
            word_width += char_width;
            if (inside_word)
            {
                word_end = next_s;
            }
            else {
                if (prev_char_is_asian_or_emoj || ! prev_char_is_not_punctuation)
                {
                    prev_word_end = word_end = s;
                    // word_end=s;
                }
                else
                {
                    prev_word_end = word_end;
                }
                line_width += word_width + blank_width;
                word_width = blank_width = 0.0f;
            }

            // Allow wrapping after punctuation.
            inside_word = current_char_is_not_punctuation;

        }

        // We ignore blank width at the end of the line (they can be skipped)
        if (line_width + word_width > wrap_width)
        {
            // Words that cannot possibly fit within an entire line will be cut anywhere.
            if (word_width < wrap_width)
                s = prev_word_end ? prev_word_end : word_end;
            break;
        }

        s = next_s;
        prev_char_is_asian_or_emoj = current_char_is_asian_or_emoj;
        prev_char_is_not_punctuation = current_char_is_not_punctuation;
    }
    // Wrap_width is too small to fit anything. Force displaying 1 character to minimize the height discontinuity.
    // +1 may not be a character start point in UTF-8 but it's ok because caller loops use (text >= word_wrap_eol).
    if (s == text && text < text_end)
        return s + ImTextCountUtf8BytesFromChar(s, text_end);
    return s;
}




#endif

Screenshots/Video:

No response

Minimal, Complete and Verifiable Example code:

main.cpp

//
#include <string>
//
#include <GLFW/glfw3.h>
#include <imgui.h>
#include <imgui_impl_glfw.h>
#include <imgui_impl_opengl3.h>
#include <misc/freetype/imgui_freetype.h>

namespace
{

auto loadFont() -> ImFont *
{
    ImGuiIO &io = ImGui::GetIO();
    ImFont *font = nullptr;
    // Base Latin font.
    ImFontConfig fontCfg;
    font = io.Fonts->AddFontFromFileTTF(R"(C:\Windows\Fonts\arial.ttf)", 0.0F, &fontCfg);
    // Merge CJK and color-emoji glyphs into the base font; both merges share one config.
    ImFontConfig fontCfg1;
    fontCfg1.MergeMode = true;
    fontCfg1.FontLoaderFlags |= ImGuiFreeTypeLoaderFlags_LoadColor;
    font = io.Fonts->AddFontFromFileTTF(R"(C:\Windows\Fonts\SimHei.ttf)", 0.0F, &fontCfg1);
    font = io.Fonts->AddFontFromFileTTF(R"(C:\Windows\Fonts\seguiemj.ttf)", 0.0F, &fontCfg1);
    return font;
}

// based on https://github.com/enkisoftware/imgui_markdown/blob/main/imgui_markdown.h
void RenderTextWrapped(const char *text_, const char *text_end_)
{
#if IMGUI_VERSION_NUM >= 19197
    float fontSize = ImGui::GetFontSize();
#else
    float scale = ImGui::GetIO().FontGlobalScale;
#endif
    float widthLeft = ImGui::GetContentRegionAvail().x;
#if IMGUI_VERSION_NUM >= 19197
    const char *endLine = ImGui::GetFont()->CalcWordWrapPosition(fontSize, text_, text_end_, widthLeft);
#else
    const char *endLine = ImGui::GetFont()->CalcWordWrapPositionA(scale, text_, text_end_, widthLeft);
#endif
    ImGui::TextUnformatted(text_, endLine);
    widthLeft = ImGui::GetContentRegionAvail().x;
    while (endLine < text_end_)
    {
        text_ = endLine;
        if (*text_ == ' ')
        {
            ++text_;
        } // skip a space at start of line
#if IMGUI_VERSION_NUM >= 19197
        endLine = ImGui::GetFont()->CalcWordWrapPosition(fontSize, text_, text_end_, widthLeft);
#else
        endLine = ImGui::GetFont()->CalcWordWrapPositionA(scale, text_, text_end_, widthLeft);
#endif
        if (text_ == endLine)
        {
            endLine++;
        }
        ImGui::TextUnformatted(text_, endLine);
    }
}

std::string textdoc1 = R"(

..................................................word word

// ZH-CN and EN, no space in between

字文字文字字文字文字文字文字字文字文字字文字文字文字文字word word文字文字文字文字文字文字文字文字

// ZH-CN and EN, space in between

字文字文字字文字文字文字文字 word word 字文字文字文字文字文字文。

// Emoj and EN no space in between

😁😁😁😁😁😁😁word word😁😁😁😁😁😁😁😁😁

// Emoj and EN space in between

😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁 word word 😁😁😁😁😁😁😁😁😁😁😁😁😁

---

### Multilingual Test Phrases (Word-Based Languages)

All the following languages are **word-based**. When wrapping text, the line break **must occur at the spaces between words**, and **must not split a word across two lines**.

**English:**  

The quick brown fox jumps over the lazy dog.

**French:**  

Le renard brun rapide saute par-dessus le chien paresseux.

**German:**  

Die schnell braune Fuchs springt über den faulen Hund.

**Spanish:**  

El zorro marrón rápido salta sobre el perro perezoso.

**Russian:**  

Быстрая коричневая лиса прыгает через ленивую собаку.

**Greek:**  

Η γρήγορη καφέ αλεπού πηδάει πάνω από το τεμπέλικο σκυλί.

**Dutch:**  

De snelle bruine vos springt over de luie hond.

)";

} // namespace
auto main() -> int
{
    // Init GLFW
    glfwInit();
    const char *glsl_version = "#version 130";
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 0);
    GLFWwindow *window = glfwCreateWindow(800, 600, "Minimal ImGui", nullptr, nullptr);
    glfwMakeContextCurrent(window);
    glfwSwapInterval(1); // Enable vsync

    // Init ImGui
    IMGUI_CHECKVERSION();
    ImGui::CreateContext();
    ImGuiIO &io = ImGui::GetIO();
    (void) io;
    io.IniFilename = nullptr;
    // ImGui::StyleColorsDark(); // or Light()

    ImGui_ImplGlfw_InitForOpenGL(window, true);
    ImGui_ImplOpenGL3_Init(glsl_version);
    auto *font = loadFont();

    // Main loop
    while (glfwWindowShouldClose(window) == 0)
    {
        glfwPollEvents();
        ImGui_ImplOpenGL3_NewFrame();
        ImGui_ImplGlfw_NewFrame();
        ImGui::NewFrame();

        int display_w = 0;
        int display_h = 0;
        glfwGetFramebufferSize(window, &display_w, &display_h);
        ImGui::SetNextWindowPos(ImVec2(0, 0), ImGuiCond_Always);
        ImGui::SetNextWindowSize(ImVec2((float) display_w, (float) display_h), ImGuiCond_Once);

        ImGui::Begin("Full Window", nullptr);
        ImGui::PushFont(font, 20.0F);
        RenderTextWrapped(textdoc1.c_str(), textdoc1.c_str() + textdoc1.size());
        ImGui::PopFont();
        ImGui::End();
        // Render
        ImGui::Render();

        glViewport(0, 0, display_w, display_h);
        glClearColor(0.1f, 0.1f, 0.1f, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);
        ImGui_ImplOpenGL3_RenderDrawData(ImGui::GetDrawData());
        glfwSwapBuffers(window);
    }

    // Cleanup
    ImGui_ImplOpenGL3_Shutdown();
    ImGui_ImplGlfw_Shutdown();
    ImGui::DestroyContext();
    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}

xuboying · Nov 12 '25 18:11

-        IM_ASSERT(baked->Size == font_size && baked->ContainerFont == font && baked->BakedId == baked_id);
+        // IM_ASSERT(baked->Size == font_size && baked->ContainerFont == font && baked->BakedId == baked_id);

Why this? It seems unrelated, but if this assert triggers for you, please report it in a separate issue.

Thanks for this. Unfortunately, it is unlikely that I can merge modifications to CalcWordWrapPosition() in the short term. I am fully aware there are many word-wrapping requests: #9066, #8990, #8838, #8503, #8139. The reason is that CalcWordWrapPosition() is on a very hot path and is performance-critical in some cases, so I would need to devote more serious time to this, but I will at some point.

Apart from the code fix suggestion above (thanks), the best thing you can do is to provide an explicit list of examples where word-wrapping is not ideal and how it should behave, as I will later turn all of them into automated test cases (your comment already has some good examples).
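One possible shape for such a list is a plain table of cases that any wrap function can be run against. This is only a sketch under stated assumptions: `WrapCase`, `WrapFn` and `FailingCases` are hypothetical names, text is given as decoded code points, and the expected value is the index of the first wrap position. It is not how Dear ImGui's own test suite is organized.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One wrapping example: input, available width, and where the first line should end.
struct WrapCase
{
    std::string name;      // short label used when reporting a failure
    std::u32string text;   // input as code points
    float wrap_width;      // available width, in whatever unit the wrap function uses
    std::size_t expected;  // expected index of the first wrap position
};

// Any function that maps (text, width) to a wrap index can be tested.
using WrapFn = std::function<std::size_t(const std::u32string&, float)>;

// Runs every case and returns the names of the ones whose result differs.
std::vector<std::string> FailingCases(const WrapFn& wrap, const std::vector<WrapCase>& cases)
{
    std::vector<std::string> failures;
    for (const WrapCase& tc : cases)
        if (wrap(tc.text, tc.wrap_width) != tc.expected)
            failures.push_back(tc.name);
    return failures;
}
```

Keeping the cases as data makes it easy to add a new failing example without touching the wrapping code itself.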

Q: can you clarify the decision to use c > 0x4e00 with no upper bounds?

Thanks for your patience!

ocornut · Nov 13 '25 12:11

IM_ASSERT is unrelated to this issue — please ignore it.

Thanks for sharing [PR #8838]. I did a quick local review and it looks promising. The author also speaks Chinese, which helped me better understand the context of the issue.

From what I can see, the PR introduces two major improvements:

  1. It removes the inside_word state variable from the loop. That variable was not very reliable, and the new approach — using delimiters to segment words — makes the logic more transparent and easier to reason about.

  2. It introduces a 3-character lookahead/lookbehind to detect prohibited characters or punctuation marks at word boundaries. This is an essential change. I had considered a similar idea before but hadn’t spent enough time on the task to work out a clean implementation. This PR effectively addresses that limitation.

Performance-wise, I don’t think the PR introduces significant degradation. The logic remains a single-pass check. While some micro-optimizations could be explored by language experts, the structure is straightforward and readable.

Regarding Unicode range handling: the PR uses 0x3003 <= c && c <= 0xFFFF, while my fix uses c > 0x4e00. The PR misses some high-range emoji characters, and my version omits Japanese and Korean ranges. I think a better lower bound would be 0x3003, and emoji coverage should be explicitly included.
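As an illustration of what explicit, bounded ranges could look like, here is a hedged sketch. `IsCjkOrEmoji` is a hypothetical name and the range list is a deliberately small selection of Unicode blocks, not the set used by the PR or by my patch. Code points above 0xFFFF (most emoji) are only representable when Dear ImGui is built with IMGUI_USE_WCHAR32, as in the build configuration above.

```cpp
// Hypothetical classifier with explicit lower and upper bounds for each range.
// The selection is illustrative and incomplete (e.g. no CJK extension blocks beyond A).
bool IsCjkOrEmoji(char32_t c)
{
    return (c >= 0x3040 && c <= 0x30FF)     // Hiragana and Katakana
        || (c >= 0x3400 && c <= 0x4DBF)     // CJK Unified Ideographs Extension A
        || (c >= 0x4E00 && c <= 0x9FFF)     // CJK Unified Ideographs (includes U+4E00 itself)
        || (c >= 0xAC00 && c <= 0xD7A3)     // Hangul syllables
        || (c >= 0x1F300 && c <= 0x1FAFF);  // range spanning the main emoji blocks (also covers some non-emoji symbols)
}
```

Compared with an open-ended test such as c > 0x4e00, bounded ranges make it explicit which scripts are covered and avoid classifying unrelated code points as breakable.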

Regarding:

> Apart from the code fix suggestion above (thanks), the best thing you can do is to provide an explicit list of examples where word-wrapping is not ideal and how it should behave, as I will later turn all of them into automated test cases (your comment already has some good examples)

I will also collect the test data.

PS

I'll try integrating the PR into my project later — it looks like a solid baseline for further testing.

The PR's function might be renamed to CalcWordWrapPositionEx() in my project so that it doesn't affect other functions for now.

xuboying · Nov 13 '25 17:11

Hi again,

I’ve already integrated the code into my mini project, and so far it looks good. To help illustrate, I prepared a minimal working example. The code is a bit long since it combines several parts:

  • Main GUI code
  • Patched code from [PR #8838](https://github.com/ocornut/imgui/pull/8838)
    The PR is based on an older version, so I merged everything together and adjusted the interface so it can be used directly. One important note: the BuildLoadGlyphGetAdvanceOrFallback function must be changed to non-static, otherwise you’ll run into linking issues.
  • ImPlot ([285df95])
    Used in the demo to visualize performance. The unit is milliseconds, and implot_items.cpp needs to be built together.
  • Two test datasets
    One of them is taken from <https://zh.wikipedia.org/wiki/C%2B%2B>.
Image

Word-wrapping behavior

CJK characters don’t follow the same “word” concept as English — you can break after any character (e.g., “啊”, “文”, “字”). In the snapshot, the original function produces a jagged right edge, while the patched version correctly aligns the paragraph with a straight edge.

There is one important rule for CJK wrapping: certain punctuation marks cannot appear at the end of a line, and others cannot appear at the beginning. This is similar to rules in English typography. The PR already summarizes these characters in HeadProhibitedW and TailProhibitedW (btw, the "-W" suffix is likely derived from Windows API wide-char conventions).
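The rule can be expressed as a check on the two characters around a candidate break. The sketch below is an illustration only: `CanBreakBetween` and `Contains` are hypothetical names, and the two arrays are small subsets of the HeadProhibitedW and TailProhibitedW lists from the PR, not the full tables.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Illustrative subsets of the two tables quoted in this thread.
// Head-prohibited: must not start a line. Tail-prohibited: must not end a line.
static const char32_t kHeadProhibited[] = { 0x3001, 0x3002, 0x300D, 0xFF01, 0xFF09, 0xFF0C, 0xFF1F };
static const char32_t kTailProhibited[] = { 0x201C, 0x300C, 0xFF08 };

template <std::size_t N>
static bool Contains(const char32_t (&table)[N], char32_t c)
{
    return std::find(std::begin(table), std::end(table), c) != std::end(table);
}

// True if a line may end after `prev` and the next line may start with `next`.
// Assumes the position is otherwise a valid break opportunity (e.g. between two CJK characters).
bool CanBreakBetween(char32_t prev, char32_t next)
{
    if (Contains(kHeadProhibited, next))
        return false; // e.g. an ideographic full stop must stay with the preceding character
    if (Contains(kTailProhibited, prev))
        return false; // e.g. an opening corner bracket must stay with the following character
    return true;
}
```

For example, a break between two ideographs is allowed, but not between an ideograph and a following ideographic full stop (0x3002), nor right after an opening corner bracket (0x300C).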

Performance

The performance plot shows a slowdown when using the patched version (snapshot taken from a release build with -O2). I don’t currently have ideas for optimizing the patched logic itself, so I added a caching layer as a workaround. My cached version is included for reference.
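How the demo collects its timings is not spelled out in this text. One plausible way to obtain per-call milliseconds with the standard library is sketched here; `AverageMilliseconds` is a hypothetical helper, not part of the demo or of Dear ImGui.

```cpp
#include <chrono>

// Calls fn() `iterations` times and returns the average wall-clock time per call in milliseconds.
// Returns 0.0 when iterations is not positive.
template <typename Fn>
double AverageMilliseconds(Fn&& fn, int iterations)
{
    using Clock = std::chrono::steady_clock; // monotonic, suited to measuring intervals
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < iterations; i++)
        fn();
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return iterations > 0 ? elapsed.count() / iterations : 0.0;
}
```

Averaging over many iterations helps because a single wrap call on a short string is usually too fast to time reliably on its own.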


HeadProhibitedW (the list also includes some Japanese Katakana and Hiragana characters, which I'm not sure about):

| Code Point | Character | Description |
| --- | --- | --- |
| 0x00A2 | ¢ | Cent sign |
| 0x00B0 | ° | Degree sign |
| 0x2032 | ′ | Prime (minutes, feet) |
| 0x2033 | ″ | Double prime (seconds, inches) |
| 0x2030 | ‰ | Per mille sign |
| 0x2103 | ℃ | Degree Celsius |
| 0x3001 | 、 | Ideographic comma |
| 0x3002 | 。 | Ideographic full stop |
| 0xFF61 | ｡ | Halfwidth ideographic full stop |
| 0xFF64 | ､ | Halfwidth ideographic comma |
| 0xFFE0 | ￠ | Fullwidth cent sign |
| 0xFF0C | ， | Fullwidth comma |
| 0xFF0E | ． | Fullwidth full stop |
| 0xFF1A | ： | Fullwidth colon |
| 0xFF1B | ； | Fullwidth semicolon |
| 0xFF1F | ？ | Fullwidth question mark |
| 0xFF01 | ！ | Fullwidth exclamation mark |
| 0xFF05 | ％ | Fullwidth percent sign |
| 0x30FB | ・ | Katakana middle dot |
| 0xFF65 | ･ | Halfwidth Katakana middle dot |
| 0x309D | ゝ | Hiragana iteration mark |
| 0x309E | ゞ | Hiragana voiced iteration mark |
| 0x30FD | ヽ | Katakana iteration mark |
| 0x30FE | ヾ | Katakana voiced iteration mark |
| 0x30FC | ー | Katakana long sound mark |
| 0x30A1 | ァ | Katakana small A |
| 0x30A3 | ィ | Katakana small I |
| 0x30A5 | ゥ | Katakana small U |
| 0x30A7 | ェ | Katakana small E |
| 0x30A9 | ォ | Katakana small O |
| 0x30C3 | ッ | Katakana small Tsu |
| 0x30E3 | ャ | Katakana small Ya |
| 0x30E5 | ュ | Katakana small Yu |
| 0x30E7 | ョ | Katakana small Yo |
| 0x30EE | ヮ | Katakana small Wa |
| 0x30F5 | ヵ | Katakana small Ka |
| 0x30F6 | ヶ | Katakana small Ke |
| 0x3041 | ぁ | Hiragana small A |
| 0x3043 | ぃ | Hiragana small I |
| 0x3045 | ぅ | Hiragana small U |
| 0x3047 | ぇ | Hiragana small E |
| 0x3049 | ぉ | Hiragana small O |
| 0x3063 | っ | Hiragana small Tsu |
| 0x3083 | ゃ | Hiragana small Ya |
| 0x3085 | ゅ | Hiragana small Yu |
| 0x3087 | ょ | Hiragana small Yo |
| 0x308E | ゎ | Hiragana small Wa |
| 0x3095 | ゕ | Hiragana small Ka |
| 0x3096 | ゖ | Hiragana small Ke |
| 0x31F0 | ㇰ | Katakana small Ku (extended) |
| 0x31F1 | ㇱ | Katakana small Shi |
| 0x31F2 | ㇲ | Katakana small Su |
| 0x31F3 | ㇳ | Katakana small To |
| 0x31F4 | ㇴ | Katakana small Nu |
| 0x31F5 | ㇵ | Katakana small Ha |
| 0x31F6 | ㇶ | Katakana small Hi |
| 0x31F7 | ㇷ | Katakana small Fu |
| 0x31F8 | ㇸ | Katakana small He |
| 0x31F9 | ㇹ | Katakana small Ho |
| 0x31FA | ㇺ | Katakana small Mu |
| 0x31FB | ㇻ | Katakana small Ra |
| 0x31FC | ㇼ | Katakana small Ri |
| 0x31FD | ㇽ | Katakana small Ru |
| 0x31FE | ㇾ | Katakana small Re |
| 0x31FF | ㇿ | Katakana small Ro |
| 0x3005 | 々 | Ideographic iteration mark |
| 0x303B | 〻 | Vertical ideographic iteration mark |
| 0xFF67 | ｧ | Halfwidth Katakana small A |
| 0xFF68 | ｨ | Halfwidth Katakana small I |
| 0xFF69 | ｩ | Halfwidth Katakana small U |
| 0xFF6A | ｪ | Halfwidth Katakana small E |
| 0xFF6B | ｫ | Halfwidth Katakana small O |
| 0xFF6C | ｬ | Halfwidth Katakana small Ya |
| 0xFF6D | ｭ | Halfwidth Katakana small Yu |
| 0xFF6E | ｮ | Halfwidth Katakana small Yo |
| 0xFF6F | ｯ | Halfwidth Katakana small Tsu |
| 0xFF70 | ｰ | Halfwidth Katakana long sound mark |
| 0x201D | ” | Right double quotation mark |
| 0x3009 | 〉 | Right angle bracket |
| 0x300B | 》 | Right double angle bracket |
| 0x300D | 」 | Right corner bracket |
| 0x300F | 』 | Right white corner bracket |
| 0x3011 | 】 | Right black lenticular bracket |
| 0x3015 | 〕 | Right tortoise shell bracket |
| 0xFF09 | ） | Fullwidth right parenthesis |
| 0xFF3D | ］ | Fullwidth right square bracket |
| 0xFF5D | ｝ | Fullwidth right curly bracket |
| 0xFF63 | ｣ | Halfwidth right corner bracket |
TailProhibitedW

| Code Point | Character | Description |
| --- | --- | --- |
| 0x2018 | ‘ | Left single quotation mark |
| 0x201C | “ | Left double quotation mark |
| 0x3008 | 〈 | Left angle bracket |
| 0x300A | 《 | Left double angle bracket |
| 0x300C | 「 | Left corner bracket |
| 0x300E | 『 | Left white corner bracket |
| 0x3010 | 【 | Left black lenticular bracket |
| 0x3014 | 〔 | Left tortoise shell bracket |
| 0xFF08 | （ | Fullwidth left parenthesis |
| 0xFF3B | ［ | Fullwidth left square bracket |
| 0xFF5B | ｛ | Fullwidth left curly bracket |
| 0xFF62 | ｢ | Halfwidth left corner bracket |
| 0x00A3 | £ | Pound sign |
| 0x00A5 | ¥ | Yen sign |
| 0xFF04 | ＄ | Fullwidth dollar sign |
| 0xFFE1 | ￡ | Fullwidth pound sign |
| 0xFFE5 | ￥ | Fullwidth yen sign |
| 0xFF0B | ＋ | Fullwidth plus sign |

//
#include <chrono>
#include <string>

//
#include <GLFW/glfw3.h>
#include <imgui.h>
#include <imgui_impl_glfw.h>
#include <imgui_impl_opengl3.h>
#include <imgui_internal.h>
#include <implot.h>
#include <misc/freetype/imgui_freetype.h>

extern float BuildLoadGlyphGetAdvanceOrFallback(ImFontBaked *baked, unsigned int codepoint);
// clang-format off
namespace CalcWordWrapPatch {
    inline bool             ImCharIsHeadProhibitedA(char c) { return c == ' ' || c == '\t' || c == '}' || c == ')' || c == ']' || c == '?' || c == '!' || c == '|' || c == '/' || c == '&' || c == '.' || c == ',' || c == ':' || c == ';';}
    const unsigned int      HeadProhibitedW[] = { 0xa2, 0xb0, 0x2032, 0x2033, 0x2030, 0x2103, 0x3001, 0x3002, 0xff61, 0xff64, 0xffe0, 0xff0c, 0xff0e, 0xff1a, 0xff1b, 0xff1f, 0xff01, 0xff05, 0x30fb, 0xff65, 0x309d, 0x309e, 0x30fd, 0x30fe, 0x30fc, 0x30a1, 0x30a3, 0x30a5, 0x30a7, 0x30a9, 0x30c3, 0x30e3, 0x30e5, 0x30e7, 0x30ee, 0x30f5, 0x30f6, 0x3041, 0x3043, 0x3045, 0x3047, 0x3049, 0x3063, 0x3083, 0x3085, 0x3087, 0x308e, 0x3095, 0x3096, 0x31f0, 0x31f1, 0x31f2, 0x31f3, 0x31f4, 0x31f5, 0x31f6, 0x31f7, 0x31f8, 0x31f9, 0x31fa, 0x31fb, 0x31fc, 0x31fd, 0x31fe, 0x31ff, 0x3005, 0x303b, 0xff67, 0xff68, 0xff69, 0xff6a, 0xff6b, 0xff6c, 0xff6d, 0xff6e, 0xff6f, 0xff70, 0x201d, 0x3009, 0x300b, 0x300d, 0x300f, 0x3011, 0x3015, 0xff09, 0xff3d, 0xff5d, 0xff63};

    inline bool             ImCharIsHeadProhibitedW(unsigned int c) { for (int i = 0; i < IM_ARRAYSIZE(HeadProhibitedW); i++) if (c == HeadProhibitedW[i]) return true; return false;}

    inline bool             ImCharIsHeadProhibited(unsigned int c)  { return (c < 128 && ImCharIsHeadProhibitedA(c)) || ImCharIsHeadProhibitedW(c); }

    inline bool             ImCharIsTailProhibitedA(unsigned int c) { return c == '(' || c == '[' || c == '{' || c == '+'; }

    const unsigned int      TailProhibitedW[] = { 0x2018, 0x201c, 0x3008, 0x300a, 0x300c, 0x300e, 0x3010, 0x3014, 0xff08, 0xff3b, 0xff5b, 0xff62, 0xa3, 0xa5, 0xff04, 0xffe1, 0xffe5, 0xff0b };

    inline bool             ImCharIsTailProhibitedW(unsigned int c) { for (int i = 0; i < IM_ARRAYSIZE(TailProhibitedW); i++) if (c == TailProhibitedW[i]) return true; return false; }

    inline bool             ImCharIsTailProhibited(unsigned int c)  { return (c < 128 && ImCharIsTailProhibitedA(c)) || ImCharIsTailProhibitedW(c); }

    inline bool             ImCharIsLineBreakableW(unsigned int c)  { return (c >= 0x3040 && c <= 0x9fff) || (c >= 0x3400 && c <= 0x4dbf) || (c >= 0x20000 && c <= 0xdffff) || (c >= 0x3040 && c <= 0x30ff) || (c >= 0xac00 && c <= 0xd7ff) || (c >= 0xf900 && c <= 0xfaff) || (c >= 0x1100 && c <= 0x11ff) || (c >= 0x2e80 && c <= 0x2fff); }

// clang-format on

// Simple word-wrapping for English, not full-featured. Please submit failing cases!
// This will return the next location to wrap from. If no wrapping if necessary, this will fast-forward to e.g. text_end.
// FIXME: Much possible improvements (don't cut things like "word !", "word!!!" but cut within "word,,,,", more sensible support for punctuations, support for Unicode punctuations, etc.)
const char *CalcWordWrapPositionEx(ImFont *font, float size, const char *text, const char *text_end, float wrap_width)
{
    // Refactored word wrapping method detects wrapping points by looking for at most 3 consecutive characters.
    // (Currently only 2.)

    ImFontBaked *baked = font->GetFontBaked(size);
    const float scale = size / baked->Size;

    float line_width = 0.0f;
    float word_width = 0.0f;
    float blank_width = 0.0f;
    wrap_width /= scale; // We work with unscaled widths to avoid scaling every characters

    const char *word_end = text;

    const char *prev_s = NULL;
    const char *s = NULL;
    const char *next_s = text;
    unsigned int prev_c = 0;
    unsigned int c = 0;
    unsigned int next_c = 0; // Must be initialized: it is copied into c (and then prev_c) on the first iterations
#define IM_ADVANCE_WORD()                                                                                              \
    do                                                                                                                 \
    {                                                                                                                  \
        word_end = s;                                                                                                  \
        line_width += word_width + blank_width;                                                                        \
        word_width = blank_width = 0.0f;                                                                               \
    } while (0)
    IM_ASSERT(text_end != NULL);
    while (s < text_end)
    {
        // prev_s is the END of prev_c, which actually points to c
        // same for s and next_s.
        prev_s = s;
        s = next_s;
        prev_c = c;
        c = next_c;
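        if (next_s >= text_end)
        {
            next_c = 0; // End of input: never dereference text_end (the buffer may not be null-terminated)
            next_s = next_s + 1;
        }
        else
        {
            next_c = (unsigned int) (unsigned char) *next_s;
            next_s += (next_c < 0x80) ? 1 : ImTextCharFromUtf8(&next_c, next_s, text_end);
            if (next_s > text_end)
                next_c = 0;
        }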

        if (prev_s == NULL)
        {
            continue;
        }
        if (c < 0x20)
        {
            if (c == '\n')
            {
                line_width = word_width = blank_width = 0.0f;
                continue;
            }
            if (c == '\r')
                continue;
        }
        // Optimized inline version of 'float char_width = GetCharAdvance((ImWchar)c);'
        float char_width = (c < (unsigned int) baked->IndexAdvanceX.Size) ? baked->IndexAdvanceX.Data[c] : -1.0f;
        if (char_width < 0.0f)
            char_width = BuildLoadGlyphGetAdvanceOrFallback(baked, c);
        if (ImCharIsBlankW(c))
            blank_width += char_width;
        else
        {
            word_width += char_width + blank_width;
            blank_width = 0.0f;
        }
        // We ignore blank width at the end of the line (they can be skipped)
        if (line_width + word_width > wrap_width)
        {
            // Words that cannot possibly fit within an entire line will be cut anywhere.
            if (word_width < wrap_width)
                s = word_end;
            else
                s = prev_s;
            break;
        }

        if (!next_c)
        {
            IM_ADVANCE_WORD();
        }
        else if (c && next_c)
        {
            if (prev_c >= '0' && prev_c <= '9' && next_c >= '0' && next_c <= '9' && !ImCharIsLineBreakableW(c))
                continue;
            if (ImCharIsLineBreakableW(next_c) && !ImCharIsHeadProhibited(next_c) && !ImCharIsTailProhibited(c))
                IM_ADVANCE_WORD();
            if ((ImCharIsHeadProhibited(c) || ImCharIsLineBreakableW(c)) && !ImCharIsHeadProhibited(next_c))
                IM_ADVANCE_WORD();
        }
    }
#undef IM_ADVANCE_WORD
    // Wrap_width is too small to fit anything. Force displaying 1 character to minimize the height discontinuity.
    // +1 may not be a character start point in UTF-8 but it's ok because caller loops use (text >= word_wrap_eol).
    if (s == text && text < text_end)
        return s + ImTextCountUtf8BytesFromChar(s, text_end);
    return s > text_end ? text_end : s;
}

} // namespace CalcWordWrapPatch
namespace
{
using namespace CalcWordWrapPatch;
inline auto CalcWordEx(
    ImFont *font, float size, const char *text, const char *text_end, float wrap_width, bool useEx) noexcept
    -> const char *
{
    return useEx ? CalcWordWrapPatch::CalcWordWrapPositionEx(font, size, text, text_end, wrap_width)
                 : font->CalcWordWrapPosition(size, text, text_end, wrap_width);
}
} // namespace

namespace cache
{

struct WrapKey
{
    ImFont *font;
    const char *text;
    const char *text_end;
    float size;
    float wrap_width;

    auto matches(ImFont *f, float s, const char *t, const char *t_end, float w) const -> bool
    {
        return font == f && size == s && text == t && text_end == t_end && wrap_width == w;
    }
};

struct WrapEntry
{
    WrapKey key;
    const char *result;
};

class WordWrapCache
{
public:
    WordWrapCache(size_t capacity, size_t reset_interval)
        : capacity_(capacity), reset_interval_(reset_interval), call_count_(0)
    {
    }

    auto get(ImFont *font, float size, const char *text, const char *text_end, float wrap_width) -> const char *
    {
        if (++call_count_ >= reset_interval_)
        {
            entries_.clear();
            call_count_ = 0;
        }

        for (ptrdiff_t i = 0; i < entries_.size(); ++i)
        {
            if (entries_[i].key.matches(font, size, text, text_end, wrap_width))
            {
                if (i != 0)
                {
                    std::rotate(entries_.begin(), entries_.begin() + i, entries_.begin() + i + 1);
                }
                return entries_[0].result;
            }
        }

        const char *result = CalcWordEx(font, size, text, text_end, wrap_width, true);
        WrapEntry entry{
            .key = {.font = font, .text = text, .text_end = text_end, .size = size, .wrap_width = wrap_width},
            .result = result
        };

        if (entries_.size() == capacity_)
        {
            // std::cerr << "full" << rand() << std::endl;
            capacity_ *= 2;
            entries_.pop_back();
        }
        entries_.insert(entries_.begin(), std::move(entry));

        return result;
    }

private:
    size_t capacity_;
    size_t reset_interval_;
    size_t call_count_;
    std::vector<WrapEntry> entries_;
};

} // namespace cache

namespace
{

// based on https://github.com/enkisoftware/imgui_markdown/blob/main/imgui_markdown.h
void RenderTextWrapped(
    const char *text_, const char *text_end_, int version /* 0: original, 1: patched, 2: patched+cache */)
{
    static cache::WordWrapCache cache(/*capacity=*/10000, 100);
    auto *font = ImGui::GetFont();
    float fontSize = ImGui::GetFontSize();
    float widthLeft = ImGui::GetContentRegionAvail().x;
    const char *endLine = nullptr;
    if (version == 2)
    {
        endLine = cache.get(font, fontSize, text_, text_end_, widthLeft);
    }
    else
    {
        endLine = CalcWordEx(font, fontSize, text_, text_end_, widthLeft, bool(version));
    }
    ImGui::TextUnformatted(text_, endLine);
    widthLeft = ImGui::GetContentRegionAvail().x;
    while (endLine < text_end_)
    {
        text_ = endLine;
        if (*text_ == ' ')
        {
            ++text_;
        }
        if (version == 2)
        {
            endLine = cache.get(font, fontSize, text_, text_end_, widthLeft);
        }
        else
        {
            endLine = CalcWordEx(font, fontSize, text_, text_end_, widthLeft, bool(version));
        }
        if (text_ == endLine)
        {
            endLine++;
        }
        ImGui::TextUnformatted(text_, endLine);
    }
}

std::string textdoc1 = R"(


..................................................word word

// ZH-CN and EN, no space in between

字文字文字字文字文字文字文字字文字文字字文字文字文字文字word word文字文字文字文字文字文字文字文字

// ZH-CN and EN, space in between

字文字文字字文字文字文字文字 word word 字文字文字文字文字文字文。

// Emoj and EN no space in between

😁😁😁😁😁😁😁word word😁😁😁😁😁😁😁😁😁

// Emoj and EN space in between

😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁 word word 😁😁😁😁😁😁😁😁😁😁😁😁😁



### Multilingual Test Phrases (Word-Based Languages)

All the following languages are **word-based**. When wrapping text, the line break **must occur at the spaces between words**, and **must not split a word across two lines**.

**English:**  

The quick brown fox jumps over the lazy dog.

**French:**  

Le renard brun rapide saute par-dessus le chien paresseux.

**German:**  

Die schnell braune Fuchs springt über den faulen Hund.

**Spanish:**  

El zorro marrón rápido salta sobre el perro perezoso.

**Russian:**  

Быстрая коричневая лиса прыгает через ленивую собаку.

**Greek:**  

Η γρήγορη καφέ αλεπού πηδάει πάνω από το τεμπέλικο σκυλί.

**Dutch:**  

De snelle bruine vos springt over de luie hond.

)";

std::string textdoc2 = R"(
long Text:
https://zh.wikipedia.org/wiki/C%2B%2B
C++是一种被广泛使用的计算机程序设计语言。它是一种通用程序设计语言,支持多重编程范式,例如过程化程序设计、面向对象程序设计、泛型程序设计和函数式程序设计等。

比雅尼·斯特劳斯特鲁普博士在贝尔实验室工作期间在20世纪80年代发明并实现了C++。起初,这种语言被称作“C with Classes”(“包含‘类’的C语言”),作为C语言的增强版出现。随后,C++不断增加新特性。虚函数、运算符重载、多继承、标准模板库、异常处理、运行时类型信息、命名空间等概念逐渐纳入标准草案。1998年,国际标准组织颁布了C++程序设计语言的第一个国际标准ISO/IEC 14882:1998(C++98),目前最新标准为ISO/IEC 14882:2024(C++23)。ISO/IEC 14882通称ISO C++。ISO C++主要包含了核心语言和标准库的规则。尽管从核心语言到标准库都有显著不同,ISO C++直接正式(normative)引用了ISO/IEC 9899(通称ISO C),且ISO C++标准库的一部分和ISO C的标准库的API完全相同,另有很小一部分和C标准库略有差异(例如,strcat等函数提供对const类型的重载)。这使得C和C++的标准库实现常常被一并提供,在核心语言规则很大一部分兼容的情况下,进一步确保用户通常较容易把符合ISO C的源程序不经修改或经极少修改直接作为C++源程序使用,也是C++语言继C语言之后流行的一个重要原因。

作为广泛被使用的工业语言,C++存在多个流行的成熟实现:GCC、基于LLVM的Clang以及Visual C++等。这些实现同时也是成熟的C语言实现,但对C语言的支持程度不一(例如,VC++对ANSI C89之后的标准支持较不完善)。大多数流行的实现包含了编译器和C++部分标准库的实现。编译器直接提供核心语言规则的实现,而库提供ISO C++标准库的实现。这些实现中,库可能同时包含和ISO C标准库的共享实现(如VC++的msvcrt);而另一些实现的ISO C标准库则是单独于编译器项目之外提供的,如glibc和musl。C++标准库的实现也可能支持多种编译器,如GCC的libstdc++库支持GCC的g++和LLVM Clang的clang++。这些不同的丰富组合使市面上的C++环境具有许多细节上的实现差异,因而遵循ISO C++这样的权威标准对维持可移植性显得更加重要。现今讨论的C++语言,除非另行指明,通常均指ISO C++规则定义的C++语言(虽然因为实现的差异,可能不一定是最新的正式版本)。

值得注意,和流行的误解不同,ISO C和ISO C++都从未明确要求源程序被“编译”(compile),而仅要求“翻译”(translate),因此从理论上来讲,C和C++并不一定是编译型语言。技术上,实现C和C++程序的单位是翻译单元(translation unit)。作为对比,Java语言规范中就明确要求Java程序被编译为字节码,明确存在编译单元(compilation unit)。实际上C和C++也存在REPL形式的解释器实现,如CINT和Cling。但因为传统上C和C++多以编译器实现,习惯上仍有一些混用,例如ISO C++中的编译期整数序列(Compile-time integer sequences)[2]。

传统上,C++语言被视为和C语言实现性能相近的语言,强调运行时的高效。根据《C++编程思想》(Thinking in C++)一书,C++与C的代码执行效率往往相差在±5%之间[3]。 
)";

struct RollingBuffer
{
    float Span;
    ImVector<ImVec2> Data;
    RollingBuffer()
    {
        Span = 10.0f;
        Data.reserve(2000);
    }
    void AddPoint(float x, float y)
    {
        float xmod = fmodf(x, Span);
        if (!Data.empty() && xmod < Data.back().x)
            Data.shrink(0);
        Data.push_back(ImVec2(xmod, y));
    }
};

void Delay(float const fps)
{
    long const delay = static_cast<long>(1e+6F / fps);
    auto start = std::chrono::high_resolution_clock::now();
    while (true)
    {
        auto stop = std::chrono::high_resolution_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
        if (duration.count() >= delay)
        {
            return;
        }
    }
}

auto loadFont() -> ImFont *
{
    ImGuiIO &io = ImGui::GetIO();
    ImFont *font = nullptr;
    ImFontConfig fontCfg;
    font = io.Fonts->AddFontFromFileTTF(R"(C:\Windows\Fonts\arial.ttf)", 0.0F, &fontCfg);
    ImFontConfig fontCfg1;
    fontCfg1.MergeMode = true;
    fontCfg1.FontLoaderFlags |= ImGuiFreeTypeLoaderFlags_LoadColor;
    font = io.Fonts->AddFontFromFileTTF(R"(C:\Windows\Fonts\SimHei.ttf)", 0.0F, &fontCfg1);
    // The emoji font reuses fontCfg1 on purpose: it needs MergeMode and the LoadColor flag as well.
    font = io.Fonts->AddFontFromFileTTF(R"(C:\Windows\Fonts\seguiemj.ttf)", 0.0F, &fontCfg1);
    return font;
}

} // namespace
auto main() -> int
{
    // Init GLFW
    glfwInit();
    const char *glsl_version = "#version 130";
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 0);
    GLFWwindow *window = glfwCreateWindow(800, 600, "Minimal ImGui", nullptr, nullptr);
    glfwMakeContextCurrent(window);
    glfwSwapInterval(1); // Enable vsync

    // Init ImGui
    IMGUI_CHECKVERSION();
    ImGui::CreateContext();
    ImPlot::CreateContext();
    ImGuiIO &io = ImGui::GetIO();
    io.BackendFlags |= ImGuiBackendFlags_RendererHasVtxOffset;
    io.IniFilename = nullptr;
    // ImGui::StyleColorsDark(); // or Light()
    ImGui_ImplGlfw_InitForOpenGL(window, true);
    ImGui_ImplOpenGL3_Init(glsl_version);
    auto *font = loadFont();
    std::string *currentText = &textdoc1;
    // Main loop
    while (glfwWindowShouldClose(window) == 0)
    {
        glfwPollEvents();
        ImGui_ImplOpenGL3_NewFrame();
        ImGui_ImplGlfw_NewFrame();
        ImGui::NewFrame();

        int display_w = 0;
        int display_h = 0;
        glfwGetFramebufferSize(window, &display_w, &display_h);
        ImGui::Begin("Text Window", nullptr);
        if (ImGui::Button("Text1"))
        {
            currentText = &textdoc1;
        }
        ImGui::SameLine();
        if (ImGui::Button("Text2"))
        {
            currentText = &textdoc2;
        }
        ImGui::Text("Original");

        ImGui::PushFont(font, 10.0F);
        auto start = std::chrono::high_resolution_clock::now();
        RenderTextWrapped(currentText->c_str(), currentText->c_str() + currentText->size(), 0);
        auto duration1 =
            std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::high_resolution_clock::now() - start);
        ImGui::PopFont();
        ImGui::Text("Patched");
        ImGui::PushFont(font, 10.0F);
        start = std::chrono::high_resolution_clock::now();
        RenderTextWrapped(currentText->c_str(), currentText->c_str() + currentText->size(), 1);
        auto duration2 =
            std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::high_resolution_clock::now() - start);
        ImGui::PopFont();
        ImGui::Text("Cached");
        ImGui::PushFont(font, 10.0F);
        start = std::chrono::high_resolution_clock::now();
        RenderTextWrapped(currentText->c_str(), currentText->c_str() + currentText->size(), 2);
        auto duration3 =
            std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::high_resolution_clock::now() - start);
        ImGui::PopFont();
        ImGui::End();

        ImGui::Begin("realtime");
        static RollingBuffer rdata1, rdata2, rdata3;
        ImVec2 mouse = ImGui::GetMousePos();
        static float t = 0;
        t += ImGui::GetIO().DeltaTime;

        rdata1.AddPoint(t, float(duration1.count()));
        rdata2.AddPoint(t, float(duration2.count()));
        rdata3.AddPoint(t, float(duration3.count()));
        static float history = 10.0f;
        ImGui::SliderFloat("History", &history, 1, 30, "%.1f s");
        rdata1.Span = history;
        rdata2.Span = history;
        rdata3.Span = history;

        static ImPlotAxisFlags flags = 0;

        if (ImPlot::BeginPlot("##Rolling", ImVec2(-1, 600)))
        {
            ImPlot::SetupAxes(nullptr, nullptr, flags, flags);
            ImPlot::SetupAxisLimits(ImAxis_X1, 0, history, ImGuiCond_Always);
            ImPlot::SetupAxisLimits(ImAxis_Y1, 0, 1000);
            ImPlot::PlotLine(
                "Original", &rdata1.Data[0].x, &rdata1.Data[0].y, rdata1.Data.size(), 0, 0, 2 * sizeof(float));
            ImPlot::PlotLine(
                "Patched", &rdata2.Data[0].x, &rdata2.Data[0].y, rdata2.Data.size(), 0, 0, 2 * sizeof(float));
            ImPlot::PlotLine(
                "Cached", &rdata3.Data[0].x, &rdata3.Data[0].y, rdata3.Data.size(), 0, 0, 2 * sizeof(float));
            ImPlot::EndPlot();
        }

        ImGui::End();
        // Render
        ImGui::Render();

        glViewport(0, 0, display_w, display_h);
        glClearColor(0.1f, 0.1f, 0.1f, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);
        ImGui_ImplOpenGL3_RenderDrawData(ImGui::GetDrawData());
        glfwSwapBuffers(window);
        Delay(60.0);
    }

    // Cleanup
    ImGui_ImplOpenGL3_Shutdown();
    ImGui_ImplGlfw_Shutdown();
    ImPlot::DestroyContext();
    ImGui::DestroyContext();
    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}

CMake file, FYI:

set(APP_NAME basic_imgui)

file(GLOB IMGUI_SRC
    ${THIRDPP_DIR}/imgui/*.cpp
    ${THIRDPP_DIR}/imgui/backends/imgui_impl_glfw.cpp
    ${THIRDPP_DIR}/imgui/backends/imgui_impl_opengl3.cpp
)

file(
    GLOB "${APP_NAME}_SOURCES"
    ${CMAKE_CURRENT_SOURCE_DIR}/main.cpp
)

list(APPEND ${APP_NAME}_SOURCES
    ${APP_DIR}/dpi.manifest # can be omitted
    ${THIRDPP_DIR}/implot/implot.cpp
    ${THIRDPP_DIR}/implot/implot_items.cpp
)

add_executable(
    ${APP_NAME}
    ${${APP_NAME}_SOURCES}
    ${IMGUI_SRC}
)

target_compile_definitions(
    ${APP_NAME} PRIVATE
    IMGUI_ENABLE_FREETYPE_PLUTOSVG
    IMGUI_USE_WCHAR32
)

target_include_directories(
    ${APP_NAME} PRIVATE
    ${THIRDPP_DIR}/imgui
    ${THIRDPP_DIR}/imgui/backends
    ${THIRDPP_DIR}/implot
)
target_link_libraries(${APP_NAME} PRIVATE
    plutosvg
    glfw
    glad
)


xuboying commented on Nov 18 '25 16:11

Hi again,

I ran additional performance tests and identified that the bottleneck seems to be related to this section of code:

else if (c && next_c)
{
    if (prev_c >= '0' && prev_c <= '9' && next_c >= '0' && next_c <= '9' && !ImCharIsLineBreakableW(c))
        continue;
    if (ImCharIsLineBreakableW(next_c) && !ImCharIsHeadProhibited(next_c) && !ImCharIsTailProhibited(c))
        IM_ADVANCE_WORD();
    if ((ImCharIsHeadProhibited(c) || ImCharIsLineBreakableW(c)) && !ImCharIsHeadProhibited(next_c))
        IM_ADVANCE_WORD();
}

I noticed two clear performance issues:

  1. Redundant checks
    Although the scan is a single pass, ImCharIsLineBreakableW and ImCharIsHeadProhibited are each evaluated more than once per character: first while the character is next_c, then again once it has become c. To avoid this, I carry the results over in cached variables:

    • next_char_is_line_break_able
    • char_line_is_break_able
    • next_char_is_head_prohibited
    • char_is_head_prohibited
  2. Hot path inefficiency
    ImCharIsHeadProhibitedW and ImCharIsLineBreakableW both sit on an extremely hot path. ImCharIsHeadProhibitedW was a linear O(n) scan over the HeadProhibitedW array, and ImCharIsLineBreakableW was a chain of range comparisons, two of which were redundant. I used Copilot to help rewrite these functions, and the optimized version provides a noticeable speedup.
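The first point can be illustrated with a small self-contained sketch. It is not the patch itself: `IsBreakable`, `ScanNaive`, `ScanCached` and the simplified rule "only CJK Unified Ideographs are breakable" are assumptions made for this example. The sketch counts how often the classifier runs when each pair (c, next_c) is classified from scratch, compared with carrying the result for next_c over to the next iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct ScanResult
{
    std::size_t breaks;           // number of break opportunities found
    std::size_t classifier_calls; // how often IsBreakable() ran
};

// Simplified stand-in for a line-break classifier (assumption: CJK Unified Ideographs only).
inline bool IsBreakable(char32_t c, std::size_t &calls)
{
    ++calls;
    return c >= 0x4E00 && c <= 0x9FFF;
}

// Classifies both characters of every adjacent pair from scratch.
inline ScanResult ScanNaive(const std::u32string &text)
{
    ScanResult r{0, 0};
    for (std::size_t i = 0; i + 1 < text.size(); ++i)
    {
        const bool cur = IsBreakable(text[i], r.classifier_calls);
        const bool next = IsBreakable(text[i + 1], r.classifier_calls);
        if (cur || next)
            ++r.breaks;
    }
    return r;
}

// Classifies each character exactly once and carries the flag forward.
inline ScanResult ScanCached(const std::u32string &text)
{
    ScanResult r{0, 0};
    if (text.empty())
        return r;
    bool cur = IsBreakable(text[0], r.classifier_calls);
    for (std::size_t i = 0; i + 1 < text.size(); ++i)
    {
        // Classify next unconditionally, before any early exit, so the carried flag never goes stale.
        const bool next = IsBreakable(text[i + 1], r.classifier_calls);
        if (cur || next)
            ++r.breaks;
        cur = next;
    }
    return r;
}
```

For a string of n characters the naive scan calls the classifier 2(n-1) times and the cached scan n times, with identical results. The one thing to watch is that the carried flag has to be refreshed on every iteration, including iterations that `continue` early.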

With these changes, the performance looks significantly better.
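When a linear table scan is replaced by a faster lookup, it is cheap to check that both versions agree on every code point. The following sketch shows one way to do that. The table `kTable` is an illustrative subset, not the full head-prohibited list, and `IsInTableLinear`, `FastTable` and `ImplementationsAgree` are hypothetical names introduced for this example.

```cpp
#include <algorithm>
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>

// Illustrative subset only; NOT the complete table used by the patch.
static constexpr std::array<unsigned int, 8> kTable = {
    0x00B0, 0x2032, 0x3001, 0x3002, 0x300D, 0x30FC, 0xFF09, 0xFF63};

// Reference implementation: linear scan, O(n) per query.
inline bool IsInTableLinear(unsigned int c)
{
    for (unsigned int v : kTable)
        if (v == c)
            return true;
    return false;
}

// Fast lookup: a bitset for the dense block 0x3000..0x30FF, a sorted array for the rest.
struct FastTable
{
    std::bitset<0x100> dense;                          // one bit per code point in 0x3000..0x30FF
    std::array<unsigned int, kTable.size()> sparse{};  // first sparse_count entries are valid
    std::size_t sparse_count = 0;

    FastTable()
    {
        for (unsigned int v : kTable)
        {
            if (v >= 0x3000 && v <= 0x30FF)
                dense.set(v - 0x3000);
            else
                sparse[sparse_count++] = v;
        }
        std::sort(sparse.begin(), sparse.begin() + sparse_count);
    }

    bool contains(unsigned int c) const
    {
        // Unsigned wrap-around turns the two-sided range test into a single comparison.
        if (c - 0x3000u <= 0xFFu)
            return dense.test(c - 0x3000u);
        return std::binary_search(sparse.begin(), sparse.begin() + sparse_count, c);
    }
};

// Compares both implementations over every Unicode code point.
inline bool ImplementationsAgree()
{
    const FastTable fast;
    for (unsigned int c = 0; c <= 0x10FFFF; ++c)
        if (fast.contains(c) != IsInTableLinear(c))
            return false;
    return true;
}
```

Because the fast table is built from the same array the linear scan walks, the two cannot drift apart, and `ImplementationsAgree()` can run once in a debug build or a unit test. A hand-written replacement table benefits from the same exhaustive comparison against the original function.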

Image
//
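#include <algorithm> // std::rotate
#include <array>
#include <chrono>
#include <cmath>   // fmodf
#include <cstddef> // size_t, ptrdiff_t
#include <cstdint> // uint8_t
#include <string>
#include <unordered_set>
#include <vector>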

//
#include <GLFW/glfw3.h>
#include <imgui.h>
#include <imgui_impl_glfw.h>
#include <imgui_impl_opengl3.h>
#include <imgui_internal.h>
#include <implot.h>
#include <misc/freetype/imgui_freetype.h>

extern float BuildLoadGlyphGetAdvanceOrFallback(ImFontBaked *baked, unsigned int codepoint);
// clang-format off
namespace CalcWordWrapPatch {



//=====================
// --- Bitmap helpers ---
// One byte per code point in a dense range, giving O(1) lookup (by Copilot)
template <size_t N>
struct Bitmap {
    std::array<uint8_t, N> bits{};

    void set(size_t idx) {
        bits[idx] = 1;
    }

    bool test(size_t idx) const {
        return bits[idx] != 0;
    }
};

// --- Build bitmaps for dense ranges ---
static constexpr size_t RANGE_3000_30FF_SIZE = 0x100; // 256 code points
static Bitmap<RANGE_3000_30FF_SIZE> bitmap_3000_30FF{};

static constexpr size_t RANGE_31F0_31FF_SIZE = 0x10; // 16 code points
static Bitmap<RANGE_31F0_31FF_SIZE> bitmap_31F0_31FF{};

static constexpr size_t RANGE_FF61_FF70_SIZE = 0x10; // 16 code points
static Bitmap<RANGE_FF61_FF70_SIZE> bitmap_FF61_FF70{};

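// --- Hash set for scattered values (entries outside the three bitmap ranges) ---
// Values inside a bitmap range must go into the bitmap instead: the lookup returns
// from the range check and never reaches this set.
static const std::unordered_set<unsigned int> HeadProhibitedScatter = {
    0x00A2, 0x00B0, 0x2030, 0x2032, 0x2033, 0x2103, 0x201D,
    0xFF01, 0xFF05, 0xFF09, 0xFF0C, 0xFF0E, 0xFF1A, 0xFF1B, 0xFF1F,
    0xFF3D, 0xFF5D, 0xFFE0
};

// --- Initialization (fill bitmaps) ---
// The contents mirror the original HeadProhibitedW[] table entry for entry.
struct InitBitmaps {
    InitBitmaps() {
        // Fill 0x3000–0x30FF bitmap
        const unsigned int dense_3000_30FF[] = {
            0x3001, 0x3002, 0x3005, 0x3009, 0x300B, 0x300D, 0x300F, 0x3011, 0x3015, 0x303B,
            0x3041, 0x3043, 0x3045, 0x3047, 0x3049, 0x3063, 0x3083, 0x3085, 0x3087, 0x308E,
            0x3095, 0x3096, 0x309D, 0x309E,
            0x30A1, 0x30A3, 0x30A5, 0x30A7, 0x30A9, 0x30C3, 0x30E3, 0x30E5, 0x30E7, 0x30EE,
            0x30F5, 0x30F6, 0x30FB, 0x30FC, 0x30FD, 0x30FE
        };
        for (auto c : dense_3000_30FF) {
            bitmap_3000_30FF.set(c - 0x3000);
        }

        // Fill 0x31F0–0x31FF bitmap (the original table lists every code point in this range)
        for (unsigned int c = 0x31F0; c <= 0x31FF; ++c) {
            bitmap_31F0_31FF.set(c - 0x31F0);
        }

        // Fill 0xFF61–0xFF70 bitmap. 0xFF62 (tail-prohibited) and 0xFF66 are not in
        // the original table, so they stay unset.
        for (unsigned int c = 0xFF61; c <= 0xFF70; ++c) {
            if (c != 0xFF62 && c != 0xFF66) {
                bitmap_FF61_FF70.set(c - 0xFF61);
            }
        }
    }
};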

// Static initializer
static InitBitmaps initBitmaps;

// --- Lookup function ---
inline bool ImCharIsHeadProhibitedW(unsigned int c)
{
    if (c >= 0x3000 && c <= 0x30FF) {
        return bitmap_3000_30FF.test(c - 0x3000);
    }
    if (c >= 0x31F0 && c <= 0x31FF) {
        return bitmap_31F0_31FF.test(c - 0x31F0);
    }
    if (c >= 0xFF61 && c <= 0xFF70) {
        return bitmap_FF61_FF70.test(c - 0xFF61);
    }
    return HeadProhibitedScatter.find(c) != HeadProhibitedScatter.end();
}

//------------------------------------------------------
// Range checks via unsigned wrap-around: one comparison per range (by Copilot)
inline bool ImCharIsLineBreakableW(unsigned int c)
{
    return
        (unsigned)(c - 0x3040) <= (0x9FFF - 0x3040) ||
        (unsigned)(c - 0x20000) <= (0xDFFFF - 0x20000) ||
        (unsigned)(c - 0xAC00) <= (0xD7FF - 0xAC00) ||
        (unsigned)(c - 0xF900) <= (0xFAFF - 0xF900) ||
        (unsigned)(c - 0x1100) <= (0x11FF - 0x1100) ||
        (unsigned)(c - 0x2E80) <= (0x2FFF - 0x2E80);
}

//==================================================================



    inline bool             ImCharIsHeadProhibitedA(char c) { return c == ' ' || c == '\t' || c == '}' || c == ')' || c == ']' || c == '?' || c == '!' || c == '|' || c == '/' || c == '&' || c == '.' || c == ',' || c == ':' || c == ';';}
    const unsigned int      HeadProhibitedW[] = { 0xa2, 0xb0, 0x2032, 0x2033, 0x2030, 0x2103, 0x3001, 0x3002, 0xff61, 0xff64, 0xffe0, 0xff0c, 0xff0e, 0xff1a, 0xff1b, 0xff1f, 0xff01, 0xff05, 0x30fb, 0xff65, 0x309d, 0x309e, 0x30fd, 0x30fe, 0x30fc, 0x30a1, 0x30a3, 0x30a5, 0x30a7, 0x30a9, 0x30c3, 0x30e3, 0x30e5, 0x30e7, 0x30ee, 0x30f5, 0x30f6, 0x3041, 0x3043, 0x3045, 0x3047, 0x3049, 0x3063, 0x3083, 0x3085, 0x3087, 0x308e, 0x3095, 0x3096, 0x31f0, 0x31f1, 0x31f2, 0x31f3, 0x31f4, 0x31f5, 0x31f6, 0x31f7, 0x31f8, 0x31f9, 0x31fa, 0x31fb, 0x31fc, 0x31fd, 0x31fe, 0x31ff, 0x3005, 0x303b, 0xff67, 0xff68, 0xff69, 0xff6a, 0xff6b, 0xff6c, 0xff6d, 0xff6e, 0xff6f, 0xff70, 0x201d, 0x3009, 0x300b, 0x300d, 0x300f, 0x3011, 0x3015, 0xff09, 0xff3d, 0xff5d, 0xff63};

    inline bool             ImCharIsHeadProhibitedWOld(unsigned int c) { for (int i = 0; i < IM_ARRAYSIZE(HeadProhibitedW); i++) if (c == HeadProhibitedW[i]) return true; return false;}

    inline bool             ImCharIsHeadProhibited(unsigned int c)  { return (c < 128 && ImCharIsHeadProhibitedA(c)) || ImCharIsHeadProhibitedW(c); }

    inline bool             ImCharIsTailProhibitedA(unsigned int c) { return c == '(' || c == '[' || c == '{' || c == '+'; }

    const unsigned int      TailProhibitedW[] = { 0x2018, 0x201c, 0x3008, 0x300a, 0x300c, 0x300e, 0x3010, 0x3014, 0xff08, 0xff3b, 0xff5b, 0xff62, 0xa3, 0xa5, 0xff04, 0xffe1, 0xffe5, 0xff0b };

    inline bool             ImCharIsTailProhibitedW(unsigned int c) { for (int i = 0; i < IM_ARRAYSIZE(TailProhibitedW); i++) if (c == TailProhibitedW[i]) return true; return false; }

    inline bool             ImCharIsTailProhibited(unsigned int c)  { return (c < 128 && ImCharIsTailProhibitedA(c)) || ImCharIsTailProhibitedW(c); }

    inline bool             ImCharIsLineBreakableWOld(unsigned int c)  { return (c >= 0x3040 && c <= 0x9fff) || (c >= 0x3400 && c <= 0x4dbf) || (c >= 0x20000 && c <= 0xdffff) || (c >= 0x3040 && c <= 0x30ff) || (c >= 0xac00 && c <= 0xd7ff) || (c >= 0xf900 && c <= 0xfaff) || (c >= 0x1100 && c <= 0x11ff) || (c >= 0x2e80 && c <= 0x2fff); }




//=====================

// clang-format on

// Simple word-wrapping for English, not full-featured. Please submit failing cases!
// This will return the next location to wrap from. If no wrapping is necessary, this will fast-forward to e.g. text_end.
// FIXME: Many possible improvements (don't cut things like "word !", "word!!!" but cut within "word,,,,", more sensible support for punctuation, support for Unicode punctuation, etc.)
const char *CalcWordWrapPositionEx(ImFont *font, float size, const char *text, const char *text_end, float wrap_width)
{
    // Refactored word wrapping method detects wrapping points by looking for at most 3 consecutive characters.
    // (Currently only 2.)

    ImFontBaked *baked = font->GetFontBaked(size);
    const float scale = size / baked->Size;

    float line_width = 0.0f;
    float word_width = 0.0f;
    float blank_width = 0.0f;
    wrap_width /= scale; // We work with unscaled widths to avoid scaling every character

    const char *word_end = text;

    const char *prev_s = NULL;
    const char *s = NULL;
    const char *next_s = text;
    unsigned int prev_c = 0;
    unsigned int c = 0;
    unsigned int next_c = 0;

    bool next_char_is_line_break_able = false;
    bool char_line_is_break_able = false;

    bool next_char_is_head_prohibited = false;
    bool char_is_head_prohibited = false;

#define IM_ADVANCE_WORD()                                                                                              \
    do                                                                                                                 \
    {                                                                                                                  \
        word_end = s;                                                                                                  \
        line_width += word_width + blank_width;                                                                        \
        word_width = blank_width = 0.0f;                                                                               \
    } while (0)
    IM_ASSERT(text_end != NULL);
    while (s < text_end)
    {
        // prev_s is the END of prev_c, which actually points to c
        // same for s and next_s.
        prev_s = s;
        s = next_s;
        prev_c = c;
        c = next_c;
        // The flags computed for next_c on the previous iteration now describe c.
        char_line_is_break_able = next_char_is_line_break_able;
        char_is_head_prohibited = next_char_is_head_prohibited;

        next_c = (unsigned int) *next_s;
        if (next_c < 0x80)
            next_s = next_s + 1;
        else
            next_s = next_s + ImTextCharFromUtf8(&next_c, next_s, text_end);
        if (next_s > text_end)
            next_c = 0;

        // Classify next_c exactly once, before any 'continue' below. Otherwise the cached
        // flags go stale for the first character and for the character after '\n' or '\r'.
        next_char_is_line_break_able = ImCharIsLineBreakableW(next_c);
        next_char_is_head_prohibited = ImCharIsHeadProhibited(next_c);

        if (prev_s == NULL)
        {
            continue;
        }
        if (c < 0x20)
        {
            if (c == '\n')
            {
                line_width = word_width = blank_width = 0.0f;
                continue;
            }
            if (c == '\r')
                continue;
        }
        // Optimized inline version of 'float char_width = GetCharAdvance((ImWchar)c);'
        float char_width = (c < (unsigned int) baked->IndexAdvanceX.Size) ? baked->IndexAdvanceX.Data[c] : -1.0f;
        if (char_width < 0.0f)
            char_width = BuildLoadGlyphGetAdvanceOrFallback(baked, c);
        if (ImCharIsBlankW(c))
            blank_width += char_width;
        else
        {
            word_width += char_width + blank_width;
            blank_width = 0.0f;
        }
        // We ignore blank width at the end of the line (they can be skipped)
        if (line_width + word_width > wrap_width)
        {
            // Words that cannot possibly fit within an entire line will be cut anywhere.
            if (word_width < wrap_width)
                s = word_end;
            else
                s = prev_s;
            break;
        }

        if (!next_c)
        {
            IM_ADVANCE_WORD();
        }
        else if (c)
        {
            // Keep digit runs such as "1,000" or "3.14" together.
            if (prev_c >= '0' && prev_c <= '9' && next_c >= '0' && next_c <= '9' && !next_char_is_line_break_able)
                continue;
            // Break before a breakable character unless a prohibition rule forbids it.
            if (next_char_is_line_break_able && !next_char_is_head_prohibited && !ImCharIsTailProhibited(c))
                IM_ADVANCE_WORD();
            // Break after a breakable or head-prohibited character.
            if ((char_is_head_prohibited || char_line_is_break_able) && !next_char_is_head_prohibited)
                IM_ADVANCE_WORD();
        }
    }
#undef IM_ADVANCE_WORD
    // Wrap_width is too small to fit anything. Force displaying 1 character to minimize the height discontinuity.
    // +1 may not be a character start point in UTF-8 but it's ok because caller loops use (text >= word_wrap_eol).
    if (s == text && text < text_end)
        return s + ImTextCountUtf8BytesFromChar(s, text_end);
    return s > text_end ? text_end : s;
}

} // namespace CalcWordWrapPatch
namespace
{
using namespace CalcWordWrapPatch;
inline auto CalcWordEx(ImFont *font, float size, const char *text, const char *text_end, float wrap_width, bool useEx)
    -> const char *
{
    return useEx ? CalcWordWrapPatch::CalcWordWrapPositionEx(font, size, text, text_end, wrap_width)
                 : font->CalcWordWrapPosition(size, text, text_end, wrap_width);
}
} // namespace

namespace cache
{

struct WrapKey
{
    ImFont *font;
    const char *text;
    const char *text_end;
    float size;
    float wrap_width;

    auto matches(ImFont *f, float s, const char *t, const char *t_end, float w) const -> bool
    {
        return font == f && size == s && text == t && text_end == t_end && wrap_width == w;
    }
};

struct WrapEntry
{
    WrapKey key;
    const char *result;
};

class WordWrapCache
{
public:
    WordWrapCache(size_t capacity, size_t reset_interval)
        : capacity_(capacity), reset_interval_(reset_interval), call_count_(0)
    {
    }

    auto get(ImFont *font, float size, const char *text, const char *text_end, float wrap_width) -> const char *
    {
        if (++call_count_ >= reset_interval_)
        {
            entries_.clear();
            call_count_ = 0;
        }

        for (ptrdiff_t i = 0; i < entries_.size(); ++i)
        {
            if (entries_[i].key.matches(font, size, text, text_end, wrap_width))
            {
                if (i != 0)
                {
                    std::rotate(entries_.begin(), entries_.begin() + i, entries_.begin() + i + 1);
                }
                return entries_[0].result;
            }
        }

        const char *result = CalcWordEx(font, size, text, text_end, wrap_width, true);
        WrapEntry entry{
            .key = {.font = font, .text = text, .text_end = text_end, .size = size, .wrap_width = wrap_width},
            .result = result
        };

        if (entries_.size() == capacity_)
        {
            // std::cerr << "full" << rand() << std::endl;
            capacity_ *= 2;
            entries_.pop_back();
        }
        entries_.insert(entries_.begin(), std::move(entry));

        return result;
    }

private:
    size_t capacity_;
    size_t reset_interval_;
    size_t call_count_;
    std::vector<WrapEntry> entries_;
};

} // namespace cache

namespace
{

// based on https://github.com/enkisoftware/imgui_markdown/blob/main/imgui_markdown.h
void RenderTextWrapped(
    const char *text_, const char *text_end_, int version /* 0: original, 1: patched, 2: patched+cache */)
{
    static cache::WordWrapCache cache(/*capacity=*/10000, 100);
    auto *font = ImGui::GetFont();
    float fontSize = ImGui::GetFontSize();
    float widthLeft = ImGui::GetContentRegionAvail().x;
    const char *endLine = nullptr;
    if (version == 2)
    {
        endLine = cache.get(font, fontSize, text_, text_end_, widthLeft);
    }
    else
    {
        endLine = CalcWordEx(font, fontSize, text_, text_end_, widthLeft, bool(version));
    }
    ImGui::TextUnformatted(text_, endLine);
    widthLeft = ImGui::GetContentRegionAvail().x;
    while (endLine < text_end_)
    {
        text_ = endLine;
        if (*text_ == ' ')
        {
            ++text_;
        }
        if (version == 2)
        {
            endLine = cache.get(font, fontSize, text_, text_end_, widthLeft);
        }
        else
        {
            endLine = CalcWordEx(font, fontSize, text_, text_end_, widthLeft, bool(version));
        }
        if (text_ == endLine)
        {
            endLine++;
        }
        ImGui::TextUnformatted(text_, endLine);
    }
}

std::string textdoc1 = R"(


..................................................word word

// ZH-CN and EN, no space in between

字文字文字字文字文字文字文字字文字文字字文字文字文字文字word word文字文字文字文字文字文字文字文字

// ZH-CN and EN, space in between

字文字文字字文字文字文字文字 word word 字文字文字文字文字文字文。

// Emoj and EN no space in between

😁😁😁😁😁😁😁word word😁😁😁😁😁😁😁😁😁

// Emoj and EN space in between

😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁😁 word word 😁😁😁😁😁😁😁😁😁😁😁😁😁



### Multilingual Test Phrases (Word-Based Languages)

All the following languages are **word-based**. When wrapping text, the line break **must occur at the spaces between words**, and **must not split a word across two lines**.

**English:**  

The quick brown fox jumps over the lazy dog.

**French:**  

Le renard brun rapide saute par-dessus le chien paresseux.

**German:**  

Die schnell braune Fuchs springt über den faulen Hund.

**Spanish:**  

El zorro marrón rápido salta sobre el perro perezoso.

**Russian:**  

Быстрая коричневая лиса прыгает через ленивую собаку.

**Greek:**  

Η γρήγορη καφέ αλεπού πηδάει πάνω από το τεμπέλικο σκυλί.

**Dutch:**  

De snelle bruine vos springt over de luie hond.

)";

std::string textdoc2 = R"(
long Text:
https://zh.wikipedia.org/wiki/C%2B%2B
C++是一种被广泛使用的计算机程序设计语言。它是一种通用程序设计语言,支持多重编程范式,例如过程化程序设计、面向对象程序设计、泛型程序设计和函数式程序设计等。

比雅尼·斯特劳斯特鲁普博士在贝尔实验室工作期间在20世纪80年代发明并实现了C++。起初,这种语言被称作“C with Classes”(“包含‘类’的C语言”),作为C语言的增强版出现。随后,C++不断增加新特性。虚函数、运算符重载、多继承、标准模板库、异常处理、运行时类型信息、命名空间等概念逐渐纳入标准草案。1998年,国际标准组织颁布了C++程序设计语言的第一个国际标准ISO/IEC 14882:1998(C++98),目前最新标准为ISO/IEC 14882:2024(C++23)。ISO/IEC 14882通称ISO C++。ISO C++主要包含了核心语言和标准库的规则。尽管从核心语言到标准库都有显著不同,ISO C++直接正式(normative)引用了ISO/IEC 9899(通称ISO C),且ISO C++标准库的一部分和ISO C的标准库的API完全相同,另有很小一部分和C标准库略有差异(例如,strcat等函数提供对const类型的重载)。这使得C和C++的标准库实现常常被一并提供,在核心语言规则很大一部分兼容的情况下,进一步确保用户通常较容易把符合ISO C的源程序不经修改或经极少修改直接作为C++源程序使用,也是C++语言继C语言之后流行的一个重要原因。

作为广泛被使用的工业语言,C++存在多个流行的成熟实现:GCC、基于LLVM的Clang以及Visual C++等。这些实现同时也是成熟的C语言实现,但对C语言的支持程度不一(例如,VC++对ANSI C89之后的标准支持较不完善)。大多数流行的实现包含了编译器和C++部分标准库的实现。编译器直接提供核心语言规则的实现,而库提供ISO C++标准库的实现。这些实现中,库可能同时包含和ISO C标准库的共享实现(如VC++的msvcrt);而另一些实现的ISO C标准库则是单独于编译器项目之外提供的,如glibc和musl。C++标准库的实现也可能支持多种编译器,如GCC的libstdc++库支持GCC的g++和LLVM Clang的clang++。这些不同的丰富组合使市面上的C++环境具有许多细节上的实现差异,因而遵循ISO C++这样的权威标准对维持可移植性显得更加重要。现今讨论的C++语言,除非另行指明,通常均指ISO C++规则定义的C++语言(虽然因为实现的差异,可能不一定是最新的正式版本)。

值得注意,和流行的误解不同,ISO C和ISO C++都从未明确要求源程序被“编译”(compile),而仅要求“翻译”(translate),因此从理论上来讲,C和C++并不一定是编译型语言。技术上,实现C和C++程序的单位是翻译单元(translation unit)。作为对比,Java语言规范中就明确要求Java程序被编译为字节码,明确存在编译单元(compilation unit)。实际上C和C++也存在REPL形式的解释器实现,如CINT和Cling。但因为传统上C和C++多以编译器实现,习惯上仍有一些混用,例如ISO C++中的编译期整数序列(Compile-time integer sequences)[2]。

传统上,C++语言被视为和C语言实现性能相近的语言,强调运行时的高效。根据《C++编程思想》(Thinking in C++)一书,C++与C的代码执行效率往往相差在±5%之间[3]。 
)";

struct RollingBuffer
{
    float Span;
    ImVector<ImVec2> Data;
    RollingBuffer()
    {
        Span = 10.0f;
        Data.reserve(2000);
    }
    void AddPoint(float x, float y)
    {
        float xmod = fmodf(x, Span);
        if (!Data.empty() && xmod < Data.back().x)
            Data.shrink(0);
        Data.push_back(ImVec2(xmod, y));
    }
};

// Busy-waits for 1/fps seconds; a crude per-frame delay that keeps one CPU core spinning.
void Delay(float const fps)
{
    long const delay = static_cast<long>(1e+6F / fps);
    auto start = std::chrono::high_resolution_clock::now();
    while (true)
    {
        auto stop = std::chrono::high_resolution_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
        if (duration.count() >= delay)
        {
            return;
        }
    }
}

auto loadFont() -> ImFont *
{
    ImGuiIO &io = ImGui::GetIO();
    ImFont *font = nullptr;
    ImFontConfig fontCfg;
    font = io.Fonts->AddFontFromFileTTF(R"(C:\Windows\Fonts\arial.ttf)", 0.0F, &fontCfg);
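    // Merge CJK glyphs into the base font.
    ImFontConfig fontCfg1;
    fontCfg1.MergeMode = true;
    font = io.Fonts->AddFontFromFileTTF(R"(C:\Windows\Fonts\SimHei.ttf)", 0.0F, &fontCfg1);
    // Merge color emoji glyphs; the LoadColor flag is only needed for the emoji font.
    ImFontConfig fontCfgEmoj;
    fontCfgEmoj.MergeMode = true;
    fontCfgEmoj.FontLoaderFlags |= ImGuiFreeTypeLoaderFlags_LoadColor;
    font = io.Fonts->AddFontFromFileTTF(R"(C:\Windows\Fonts\seguiemj.ttf)", 0.0F, &fontCfgEmoj);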
    return font;
}

} // namespace
auto main() -> int
{
    // Init GLFW
    glfwInit();
    const char *glsl_version = "#version 130";
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 0);
    GLFWwindow *window = glfwCreateWindow(800, 600, "Minimal ImGui", nullptr, nullptr);
    glfwMakeContextCurrent(window);
    glfwSwapInterval(1); // Enable vsync

    // Init ImGui
    IMGUI_CHECKVERSION();
    ImGui::CreateContext();
    ImPlot::CreateContext();
    ImGuiIO &io = ImGui::GetIO();
    io.BackendFlags |= ImGuiBackendFlags_RendererHasVtxOffset;
    io.IniFilename = nullptr;
    // ImGui::StyleColorsDark(); // or Light()
    ImGui_ImplGlfw_InitForOpenGL(window, true);
    ImGui_ImplOpenGL3_Init(glsl_version);
    auto *font = loadFont();
    std::string *currentText = &textdoc1;
    // Main loop
    while (glfwWindowShouldClose(window) == 0)
    {
        glfwPollEvents();
        ImGui_ImplOpenGL3_NewFrame();
        ImGui_ImplGlfw_NewFrame();
        ImGui::NewFrame();

        int display_w = 0;
        int display_h = 0;
        glfwGetFramebufferSize(window, &display_w, &display_h);
        ImGui::Begin("Text Window", nullptr);
        if (ImGui::Button("Text1"))
        {
            currentText = &textdoc1;
        }
        ImGui::SameLine();
        if (ImGui::Button("Text2"))
        {
            currentText = &textdoc2;
        }
        ImGui::Text("Original");

        ImGui::PushFont(font, 10.0F);
        auto start = std::chrono::high_resolution_clock::now();
        RenderTextWrapped(currentText->c_str(), currentText->c_str() + currentText->size(), 0);
        auto duration1 =
            std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::high_resolution_clock::now() - start);
        ImGui::PopFont();
        ImGui::Text("Patched");
        ImGui::PushFont(font, 10.0F);
        start = std::chrono::high_resolution_clock::now();
        RenderTextWrapped(currentText->c_str(), currentText->c_str() + currentText->size(), 1);
        auto duration2 =
            std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::high_resolution_clock::now() - start);
        ImGui::PopFont();
        ImGui::Text("Cached");
        ImGui::PushFont(font, 10.0F);
        start = std::chrono::high_resolution_clock::now();
        RenderTextWrapped(currentText->c_str(), currentText->c_str() + currentText->size(), 2);
        auto duration3 =
            std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::high_resolution_clock::now() - start);
        ImGui::PopFont();
        ImGui::End();

        ImGui::Begin("realtime");
        static RollingBuffer rdata1, rdata2, rdata3;
        ImVec2 mouse = ImGui::GetMousePos();
        static float t = 0;
        t += ImGui::GetIO().DeltaTime;

        rdata1.AddPoint(t, float(duration1.count()));
        rdata2.AddPoint(t, float(duration2.count()));
        rdata3.AddPoint(t, float(duration3.count()));
        static float history = 10.0f;
        ImGui::SliderFloat("History", &history, 1, 30, "%.1f s");
        rdata1.Span = history;
        rdata2.Span = history;
        rdata3.Span = history;

        static ImPlotAxisFlags flags = 0;

        if (ImPlot::BeginPlot("##Rolling", ImVec2(-1, 600)))
        {
            ImPlot::SetupAxes(nullptr, nullptr, flags, flags);
            ImPlot::SetupAxisLimits(ImAxis_X1, 0, history, ImGuiCond_Always);
            ImPlot::SetupAxisLimits(ImAxis_Y1, 0, 1000);
            ImPlot::PlotLine(
                "Original", &rdata1.Data[0].x, &rdata1.Data[0].y, rdata1.Data.size(), 0, 0, 2 * sizeof(float));
            ImPlot::PlotLine(
                "Patched", &rdata2.Data[0].x, &rdata2.Data[0].y, rdata2.Data.size(), 0, 0, 2 * sizeof(float));
            ImPlot::PlotLine(
                "Cached", &rdata3.Data[0].x, &rdata3.Data[0].y, rdata3.Data.size(), 0, 0, 2 * sizeof(float));
            ImPlot::EndPlot();
        }

        ImGui::End();
        // Render
        ImGui::Render();

        glViewport(0, 0, display_w, display_h);
        glClearColor(0.1f, 0.1f, 0.1f, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);
        ImGui_ImplOpenGL3_RenderDrawData(ImGui::GetDrawData());
        glfwSwapBuffers(window);
        Delay(60.0);
    }

    // Cleanup
    ImGui_ImplOpenGL3_Shutdown();
    ImGui_ImplGlfw_Shutdown();
    ImPlot::DestroyContext();
    ImGui::DestroyContext();
    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}

xuboying avatar Nov 19 '25 07:11 xuboying

I made #8838 only as a proof of concept and did not bother to improve its poor performance (I'm aware of it, though). Thank you for reviewing my code, and nice job too!

cyx2015s avatar Nov 19 '25 08:11 cyx2015s

Adding some comments for anyone who continues to review or improve the code. Also, since IM_ADVANCE_WORD is idempotent, the two consecutive checks (if xxx IM_ADVANCE_WORD(); if yyy IM_ADVANCE_WORD();) should be changed to an if / else-if block.

        // prev_s is the END of prev_c, which actually points to c;
        // the same holds for s and next_s.

        next_char_is_line_break_able = ImCharIsLineBreakableW(next_c);
        next_char_is_head_prohibited = ImCharIsHeadProhibited(next_c);
        if (!next_char_is_line_break_able && prev_c >= '0' && prev_c <= '9' && next_c >= '0' && next_c <= '9')
            continue;
        if (next_char_is_line_break_able && !next_char_is_head_prohibited && !ImCharIsTailProhibited(c))
            IM_ADVANCE_WORD();   // should handle word-based text and mixed word-based/CJK text, e.g. "word,", "word.", "word文字"
        else if ((char_is_head_prohibited || char_line_is_break_able) && !next_char_is_head_prohibited)
            IM_ADVANCE_WORD();   // should handle the pure CJK part
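
To make the two branches easier to reason about, here is a minimal standalone sketch of the same decision written as a pure function. Everything in it is an assumption for illustration: `CanBreakAfter`, `IsLineBreakableWide`, `IsHeadProhibited` and `IsTailProhibited` are hypothetical names, and the character sets are tiny hard-coded samples (CJK ideographs, kana, and a few fullwidth punctuation marks), not the classification used by the real `ImCharIs...` helpers in the patch.

```cpp
// Hypothetical classifiers with small hard-coded sets, for illustration only.
static bool IsDigit(unsigned int c) { return c >= '0' && c <= '9'; }

// CJK Unified Ideographs plus Hiragana/Katakana: a break may occur next to each one.
static bool IsLineBreakableWide(unsigned int c)
{
    return (c >= 0x4E00 && c <= 0x9FFF) || (c >= 0x3040 && c <= 0x30FF);
}

// Characters that must not start a line (sample of CJK closing punctuation).
static bool IsHeadProhibited(unsigned int c)
{
    switch (c)
    {
    case 0x3001: case 0x3002: case 0xFF0C: case 0xFF01:
    case 0xFF1F: case 0xFF09: case 0x300D:
        return true;
    default:
        return false;
    }
}

// Characters that must not end a line (sample of opening punctuation).
static bool IsTailProhibited(unsigned int c)
{
    return c == '(' || c == 0xFF08 || c == 0x300C;
}

// Returns true when a line break is allowed between c and next_c.
static bool CanBreakAfter(unsigned int prev_c, unsigned int c, unsigned int next_c)
{
    const bool next_breakable = IsLineBreakableWide(next_c);
    const bool next_head_prohibited = IsHeadProhibited(next_c);

    // Keep digit groups together even when punctuation sits between the digits.
    if (!next_breakable && IsDigit(prev_c) && IsDigit(next_c))
        return false;
    // Anything (word-based or CJK) followed by a CJK character that may start a line.
    if (next_breakable && !next_head_prohibited && !IsTailProhibited(c))
        return true;
    // A CJK character or closing punctuation followed by something that may start a line.
    else if ((IsHeadProhibited(c) || IsLineBreakableWide(c)) && !next_head_prohibited)
        return true;
    return false;
}
```

With these sample sets, a break is allowed between `d` and `文` in "word文字" and after `。`, but not before `。`, not directly after `（`, and not inside a digit group. Because both branches lead to the same action, the if / else-if form only avoids evaluating the second condition when the first already matched.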

xuboying avatar Nov 19 '25 15:11 xuboying

This is already quite complex; please turn it into a pull request and, if you can, try to preserve history in the PR. Honestly, the only reasonable chance we have of moving this forward is if you start by developing tests for the imgui_test_suite. The closest test for reference is widgets_text_wrapped_2.

ocornut avatar Nov 19 '25 16:11 ocornut

Hi Ocornut,

I’ll work on preparing a PR and adding test cases. Let me summarize the current status:

  1. PR #8838 provides a good baseline, and its history should be preserved.
  2. Additional documentation and comments should be added to #8838 to make the code easier to understand.
  3. The performance issues already identified in #8838 need to be addressed.
  4. New test cases should be added to the imgui_test_suite.

For the test case, since the platform is Win32, can I assume that C:\Windows\Fonts\SimHei.ttf will be available (with the Chinese Language Pack installed)?

Br

xuboying avatar Nov 22 '25 08:11 xuboying

We can rework the test suite to load a Chinese font, which may be chosen based on e.g. the system. BUT the test case shouldn’t depend on a specific font or size. Even if the code points are missing from the font, we still process their values and should word-wrap based on the same logic.

The tests should, for example, submit a sequence such as ABCDEF with XX available width and verify the wrapping points. The tests can display the matching output, but the actual check would probably only be done by calling the Calc function.
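
As a rough illustration of that test shape, the sketch below checks wrap points for a table of (text, width, expected offset) cases. It is not imgui_test_suite code: `ToyWrapOffset` is a hypothetical stand-in for the function under test, treating every byte as one glyph with an advance of 1.0 and breaking only at spaces, so the expected values cannot depend on any font or size.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the function under test: one byte = one glyph of
// advance 1.0f, breaks only at spaces. Returns the byte offset at which the
// first line ends.
static size_t ToyWrapOffset(const char* text, size_t len, float wrap_width)
{
    float line_width = 0.0f;
    size_t last_space = 0;
    bool have_space = false;
    for (size_t i = 0; i < len; i++)
    {
        if (text[i] == ' ')
        {
            last_space = i;
            have_space = true;
        }
        line_width += 1.0f;
        if (line_width > wrap_width)
        {
            if (have_space)
                return last_space;  // break at the most recent space
            return i > 0 ? i : 1;   // overlong word: hard break, consuming at least one glyph
        }
    }
    return len; // everything fits on one line
}

struct WrapCase
{
    const char* Text;
    float       WrapWidth;   // available width, in glyph advances
    size_t      ExpectedEnd; // expected end of the first line, in bytes
};

static const WrapCase kWrapCases[] =
{
    { "ABC DEF GHI",  8.0f, 7 }, // "ABC DEF" fits, "GHI" wraps
    { "ABC DEF GHI",  5.0f, 3 }, // only "ABC" fits
    { "ABCDEF",       3.0f, 3 }, // no space available: hard break after 3 glyphs
    { "ABCDEF",      10.0f, 6 }, // whole string fits
};

// Returns the number of cases whose computed wrap point differs from the expected one.
static int CountWrapFailures()
{
    int failures = 0;
    for (size_t n = 0; n < sizeof(kWrapCases) / sizeof(kWrapCases[0]); n++)
    {
        const WrapCase& c = kWrapCases[n];
        if (ToyWrapOffset(c.Text, strlen(c.Text), c.WrapWidth) != c.ExpectedEnd)
            failures++;
    }
    return failures;
}
```

In a real test the stand-in would presumably be replaced by the actual Calc function, with widths expressed relative to measured text width rather than as fixed numbers, and the table would gain CJK and emoji rows.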

For me, the test cases are a much more important and lasting reference than the code. Please understand that there is a big chance that I may consider the code too slow or too complex. Only if we have thorough tests can I consider working from that base to find the right balance.

I must also state that I will likely not put energy into reviewing this if it uses std:: functions or if the code doesn’t align with Dear ImGui standards.

Please also note that it is unlikely that I can merge this soon. I think the higher-level API will probably need a way to select the wrapping mode first.

TL;DR: you can treat the code as a way to exercise and validate the tests, but the tests are potentially more important than the code. The knowledge encoded in the tests is the knowledge that is more difficult for me to obtain.

ocornut avatar Nov 22 '25 10:11 ocornut

You’re right — widgets_text_wrapped_2 is cleverly designed to use relative sizing, so it doesn’t rely on any specific font or size. In theory, it should still work even if certain code points are missing.

As an optional improvement, if we want test cases to reliably display characters on any system without depending on whether a CJK font is installed, one approach would be to embed a subset font from an existing CJK TTF font file. This reduces the font size from megabytes down to kilobytes. A good candidate is Source Han Sans, and tools like fonttools (https://gist.github.com/xuboying/ce26c975ed82f13e35dc4d4f0fc0af0e) can be used to generate such subsets.

I’m not pushing a code PR right now. The original issue is actually resolved by the merged code with a limited modification to the imgui framework (exposing the BuildLoadGlyphGetAdvanceOrFallback function). I’ll prioritize investigating the test cases first.

xuboying avatar Nov 22 '25 19:11 xuboying

Hi again,

I’ve completed the test cases — please take a look when you have time.
The new tests are named widgets_text_wrapped_cjk_XXX, and they exercise the function ImGui::GetFont()->CalcWordWrapPositionCJK.

Related commits:

  • https://github.com/ocornut/imgui/compare/master...xuboying:imgui:master
  • https://github.com/ocornut/imgui_test_engine/compare/main...xuboying:imgui_test_engine:main

Notes:

  1. A TTF font was added to ensure proper display in cjk_3, cjk_4, and cjk_5.
  2. The test case currently contains some redundant code for calculating UTF‑8 byte offsets. It may be worth considering whether this logic could be factored into a helper function.
  3. Code style follows the existing context — generally C++03 without STL usage.
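
Regarding note 2, one possible shape for such a helper is sketched below in the same C++03, no-STL style. The name `Utf8ByteOffset` is hypothetical and the function assumes well-formed, NUL-terminated UTF-8; it is not taken from the linked commits. Dear ImGui's internal `ImTextCharFromUtf8` could likely serve as a building block instead.

```cpp
#include <cstddef>

// Returns the byte offset of the code point with index cp_index in a
// NUL-terminated UTF-8 string. If the string holds fewer code points, the
// total length in bytes is returned. Assumes well-formed UTF-8.
static size_t Utf8ByteOffset(const char* s, size_t cp_index)
{
    size_t offset = 0;
    while (s[offset] != 0)
    {
        if (cp_index == 0)
            return offset;
        // Skip the lead byte, then any continuation bytes (bit pattern 10xxxxxx).
        offset++;
        while ((static_cast<unsigned char>(s[offset]) & 0xC0) == 0x80)
            offset++;
        cp_index--;
    }
    return offset;
}
```

For the string "a文b" (1 + 3 + 1 bytes), indices 0, 1 and 2 map to byte offsets 0, 1 and 4, so a test can express expected wrap points in code points and convert them once instead of repeating the arithmetic in every case.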

xuboying avatar Nov 24 '25 14:11 xuboying

Thanks @xuboying. I honestly don't know when I will be able to look at it at all, but it seems like good and careful work, and I appreciate it.

ocornut avatar Nov 24 '25 14:11 ocornut

No worries — we can keep this issue open and gather feedback from other users as it comes in. I appreciate your feedback, and I’ll continue refining things as time allows.

xuboying avatar Nov 24 '25 15:11 xuboying