imgui Feature: Kerning

Above image is an example of kerning adjustments, added randomly just for demonstration.

It has just been a pet peeve of mine when a font that features kerning information gets displayed without kerning adjustments applied, so I wanted to implement the feature.

Coding styles aren't normalized at the moment - I would like to hear your opinions on using binary search and sort to manage kerning distance information as soon as possible. Things will get better as I work on this more.

TODO: Read kerning information from TTF file.

Feb 28 '22 15:02 Soreepeong

Hello,

I may not have time to provide good feedback but I think this is generally eventually desirable. You'll need to profile things but right now it looks slower than necessary.

Random thoughts:

If the table was sorted by Right glyph or Left glyph, the corresponding glyph could store an offset+size into the sorted kerning table, reducing amount of data to touch/search through (only touch pairs relating to 1 glyph).
Inner loops shouldn't decode UTF-8 twice, so I believe the kerning offset should be done from the Right-side glyph.
GetDistanceForPair() probably don't need to be called if there's no kerning.
Testing !KerningPairsAreSorted + Sorting can probably be done in e.g. NewFrame() and not in GetDistanceForPair().
Special handling for blank chars (if required, but I doubt so and I believe this is a separate feature?) seemingly can be baked into glyph>AdvanceX.
If font->KerningPairs is in the hot path (inner loop) it needs to be positioned accordingly in the ImFont structure, but change proposed above (store kerning offset+size into in glyph) may change it so ->KerningPairs (the VECTOR struct itself) doesn't need to be touched by hot path.
Generally need to be mindful about hot/cold field concepts.

Feb 28 '22 16:02 ocornut

Thank you for the thoughts! I've applied your suggestions.

I've changed KerningPairs to be sorted by right character first, and added IndexKernPairCount as a hot field of ImFont.
- IndexKernPairOffset probably counts as a cold field, as it will only be accessed if the value from IndexKernPairCount is not zero.
- GetDistanceAdjustmentForPair will binary search in [IndexKernPairOffset[c], IndexKernPairOffset[c] + IndexKernPairCount[c]).
I've added prev_c to all functions dealing with iterative kerning distance adjustments.
GetDistanceAdjustmentForPair will be called only if corresponding item exists in IndexKernPairCount.
It has been moved to ImGui::NewFrame, and it will call ImFont::BuildKerningPairs (should be inlined on release builds) which will call ImFont::BuildKerningPair_Build if DirtyKerningPairs is nonzero.
- DirtyKerningPairs is not a boolean, but rather an int, because it would be more efficient to record all the pending changes, then take only the newest change, rather than checking a corresponding pair already exists, and to implement that without a stable sort function, we need a variable that keeps track of original order.
- Otherwise, DirtyKerningPairs can be changed to a boolean member, and add ImStableSort of some sort.
Some fonts, like Times New Roman, do have kerning pairs that involve ' '(space), so not giving it special treatment should work.
- To prevent kerning from being applied when that said space is being used as a line break point, prev_c will be reset to 0.
- If you meant something else, please let me know.
EDIT: I've forgot to mention that I've changed the behavior of text measurement. For the last character of each line, the width of said line will be sum(advance width of all characters except last) + (width of last character) instead of sum(advance width of all characters), since the current behavior was resulting in something like the following:

EDIT: Decisions to make

Should ImStableSort exist, or is having order of insertion stored in kerning struct acceptable?
Does the code fetching kerning information from ttf data deserve a separate .c file, since it will introduce a lot of structures that will only be used in one file?

Obtaining kerning information

For preview purposes, it will brute force all possible pairs of codepoints to figure out the kerning distances. This has to be changed.

I'm referring to the following to fetch the kerning distance information:

http://formats.kaitai.io/ttf/ttf.svg
https://docs.microsoft.com/en-us/typography/opentype/spec/kern
https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6kern.html

And this is going to introduce a lot of structs. Would it be okay to make a new .c file ultimately consisting of void AddKerningInformationToImFont(ImFont* target, const void* ttf_data, int ttf_data_size)?

Feb 28 '22 19:02 Soreepeong

I'm using briefly skimming through this, but why not storing the Offset/Count simply the same table as IndexAdvanceX[] ? (need a new structure name). I would probably suggest using a bit field e.g. Offset : 20, Count : 12 with an assert during build if Count doesn't fit for one given glyph.

The hot code ihmo should go from:

const float char_width = ((int)c < IndexAdvanceX.Size ? IndexAdvanceX.Data[c] : FallbackAdvanceX) * scale;

to

if (c < IndexAdvanceAndKerning.Size)
{
   ImFontGlyphHotData* data = &IndexAdvanceAndKerning.Data[c];
   char_width = data ->Width;
   if (data->KerningCount != 0)
        char_width += KerningLookup(prev_c, c, data);
}
else
{
    ..
}
char_width *= scale;

Any extra overhead would be undesirable.

(5) I meant given the first version of the code, the ImMax(glyph->X1 - glyph->X0, glyph->AdvanceX) could simply be baked/stored into glyph->AdvanceX ?
(5b) Not sure of the rationale behind the if (s == text_end .. part of the test.
In CalcTextSizeA() the * scale multiplier got lost in a recent commit.
Kerning should be easy to disable at runtime globally, to facilitate debugging.
Kerning pairs could be all built when the font is build ? Why is there a need to iterate on kerning data building?

Does the code fetching kerning information from ttf data deserve a separate .c file, since it will introduce a lot of structures that will only be used in one file?

No it should be in imgui_draw.cpp but I don't understand which structure you are talking about. Both stb_truetype and FreeType expose this information, so I am not sure what you are asking about?

Feb 28 '22 20:02 ocornut

I'll change that into the following:

struct ImFontGlyphHotData {
    float AdvanceX;
    unsigned int KerningPairOffset : 20;
    unsigned int KerningPairCount : 12;
}

For ImMax(glyph->X1 - glyph->X0, glyph->AdvanceX), it's the special case for the last character when measuring the width of a line. Last character may have an advance width shorter than its bounding width, and we do not want the measured width of the line to exclude a few last pixels. I believe that the width of a line should be calculated by sum(advance width for all characters in the line except the last character) + sum(all the kern adjustments) + max(bounding width of the last character, advance width of the last character).
- Some characters may feature an advance width that is shorter than its bounding width, like the above drawing. If there are multiples of that character, if we count in advance width only, then the measured width of the text will be shorter than how much width it'll actually take. The last character's bounding width has to be taken into consideration.
- BoundingWidth can be introducted into above structure, since this conditional codepath will be taken at least once for every line rendered.
I'll make sure to test with different scale configuration.
ImGuiConfigFlags_ looks like the best place to put this in. Add ImGuiConfigFlags_DisableKerning?
Users may call AddCustomRectFontGlyph, then they may decide that they want to set custom kerning distance between specific characters. Since it might be replacing the information from source font files, kerning distances have to be dealing with modifications.
- That function has to be called before building fonts anyway, so BuildLookupTable should be a good place to do kerning stuff, instead of NewFrame.
They both expose information, in the form of GetDistance(character1, character2) -> int, but they do not expose information in the form of GetAllKerningDistances() -> map<pair<char, char>, int>. If we start adding CJK characters into consideration, the enumeration loop may get quite lengthy, and it would be better to read the list, rather than querying for the existence of all possible pairs. Manually parsing TTF file and obtaining information would be trivial, but I am worried that it might be quite long, due to the sheer number of fields in structs that are unnecessary for our purposes, but still needs to be there for correct offsets.

Feb 28 '22 20:02 Soreepeong

ImGuiConfigFlags_NoKerning and corresponding entry in demo window has been added.
ImFontKerningPair was rid of DirtyIndex, and it will stable sort on BuildLookupTable instead.
- BuildKerningTable has been merged to BuildLookupTable.
ImFontGlyphHotData was added and AdvanceXAsLastCharacter was added, which will be referenced every time text measurement has reached the last character of the line.
- This replaces IndexAdvanceX.
- FallbackAdvanceX has been also removed, and FallbackChar has been promoted to hot member.

But then again, changes so far will result in 50% more time taken for rendering, when tested with 10000 lines from "Example: Long text display / not clipped (slow)" in Debug mode, and 85% more time taken when tested with 300k lines in Release mode. I'll try to find out what can be done to reduce render time taken further.

Feb 28 '22 22:02 Soreepeong

I have done some optimizations. It did greatly reduce the render time taken per frame from +85% to +46% (+13% with NoKerning set). Would you consider that this is an acceptable render loop overhead?

Also, adding kerning pairs from TTF file is now implemented in ImFontAtlasBuildWithStbTruetype. It will add all kerning pairs to ImFont as long as associated codepoint has been registered. It went simpler than expected, so any separate .c file won't be necessary.

Tests

Ran the following in Release mode with Visual Studio 2022 (DirectX 11, x64).

for (int i = 0; i < 100000; i++)
    ImGui::TextUnformatted("AC AG AT AV AW AY LT LV LW LY TA Ta Tc Td Te Tg To VA Va Vc Vd Ve Vg Vm Vo Vp Vq Vu WA We Wq YA Ya Yc Yd Ye Yg Ym Yn Yo Yp Yq Yr Yu eT oT");

Render time taken per frame (using FallbackChar+IndexedHotData as hot field)

Without this PR: 28.5ms
With NoKerning set: 32.1ms
Without NoKerning set: 41.5ms
Without making a frequently visited kern table for characters in ASCII range: 97.8ms
- Since non-English-alphabet characters are much less likely to have kerning information, allocating a ImVector of 128x128 to map two ASCII range characters to kern distance should be an acceptable memory overhead.

Mar 01 '22 07:03 Soreepeong

EDIT: It seems that my desktop's performance isn't really stable at the moment, as trying the above results in something not much different from the below, so maybe it's "good enough"? Still, if you see anything that can be optimized further, do let me know. Thanks!

~~Adapting the code for password_font required FallbackHotData, since it seemed that password font vectors needed to be empty. But this has brought some performance penalty, and this has to be fixed if possible. Sadly I'm out of ideas at the moment - any inputs would be appreciated.~~

Render time taken per frame (using FallbackHotData as hot field)

~~With NoKerning set: 41.5ms (46% more time taken than before-PR)~~
~~Without NoKerning set: 51.3ms (80% more time taken than before PR)~~

Mar 01 '22 14:03 Soreepeong

Thanks. I probably won't have the bandwidth to be looking at this for a long while unfortunately, but we have a good base.

Be mindful I am quickly skimming through things, suggestions:

Since GetDistanceAdjustmentForPairXXX() are now pointing to offset/count based on right-side character none of those functions need to check the ->Right value. In fact Right could even be omitted from the function parameter.
Calling GetDistanceAdjustmentForPair() with c_info instead of c_info->KerningPairOffset, c_info->KerningPairOffset + c_info->KerningPairCount - 1 would probably make things a little faster on both sides.
Given the likely values for KerningPairCount (some anecdotal stats may be welcome) I suspect once you do that, a dumb linear search may simply be faster and that code will be simpler too.

The ExtraForMaxOccupyWidth is the remaining most bothering thing but at this point I don't understand the problem well enough to comment (and I haven't read your posts carefully enough yet).

A few minor things:

use the g local variable instead of GImGui when already available.
Few occurrences of misplaced braces (ImFont_BuildLookupTable_ImFontKerningPairWithOrder)

Mar 01 '22 18:03 ocornut

I've applied those changes, and they do sound like great ideas! For linear search/bisect, I've made it into a flag, so that it will linear search if there are less than 32 associated pairs for the right char, and otherwise use bisect.

`ExtraForMaxOccupyWidth`

It can be helpful when you're trying to make faux italics. Faux italics won't change advance width, but it will change bounding width of a glyph, which will result in ExtraForMaxOccupyWidth value of above zero.

For expository purposes, I've added commit 75a23266. It must be removed when this pull request is about to be merged. It has some additional stuff in main.cpp and code that faux italicize the first font in imgui_impl_dx11.cpp.

Without `ExtraForMaxOccupyWidth`

You can see that CalcTextSizeA being done from AdvanceX only will result in actual text overflowing measured word wraps, and you can also see that button's text has been clipped.

With `ExtraForMaxOccupyWidth`

This will make up for the difference in bounding width and advance width of a glyph, for the last glyph of each line. Note that this implementation is assuming that second-from-last glyph's right edge is never past last glyph's right edge.

Mar 02 '22 02:03 Soreepeong

Hello @Soreepeong,

Sorry for my late answer. I've been currently at the state of prototyping new text functions and evaluating different paths/features for text rendering. Among them of course I would consider kerning as an option. It is unlikely I can merge the PR as-is because much code will change by the time I rewrite the functions, but I should be able to use this as a base reference.

As I was authorized to push to the branch I have taken the liberty to rebase it on latest now. I also added a simple KerningPair browser in DebugNodeFont() as an additional commit.

Some suggestions I have - if you have time and energy to push this to help:

Add some high-level helper functions/macros for external code needing to access that data. One prime example are the several changes related to InputText(): we should aim for those changes to less verbose and call helpers. <-- this is the most important for me as I can more easily plan adding opt-in kerning support in some of the new rendering loops.
You can try to implement this with the imgui_freetype builder. I think this may lead to a little bit of healthy tidying up of the kerning-related build functions so they can be used from a second spot.
Out of curiosity, what is the purpose of the test/demo code in imgui_impl_dx11 to slant characters? What that a separate experiment or is there some logic that it helps/facilitate the testing of kerning logic?

Thank you!

Jan 13 '23 10:01 ocornut

I'll try to check this out before this week passes. For the last bullet point, that's to exaggerate how the current imgui would measure the text width if the bitmap width exceeds the glyph's horizontal occupying space, like the image attached, and thus adding ExtraForMaxOccupyWidth.

Jan 16 '23 13:01 Soreepeong

Thanks for all this kerning stuff. It's what i need. Bump to persuade this pull to be accepted! So. I manually merged this changes with today's imgui master.

It works!

But there are a few problems.

Kern table

It cant read TTF GPOS kerning and relies on the old "kern" table. I'm trying with Roboto and it didn't work.

So I managed to regenerate the font using fontforge and selecting,

file -> generate -> ttf -> options, check "old style kern" and "Windows-compatible kern" and "prefer native kerning"

It will emit kern tables.

Kerning Values

Seem to way too high. This could well be my hack in (1). I'm seeing characters crashed together. I got sensible results by dividing all values by 3. Perhaps there's some other scaling factor not being used here.

Too Many Pairs

Currently the code just adds all the kern pairs it finds, but when i crate my imgui font, i give it a glyph range. It would be nice to only include those.

For now, i just include ASCII pairs, which is a horrible hack, but it's mostly ok.

Thanks.

Apr 13 '23 16:04 voidware

I actually have a code that will export kerning information from GPOS tables too at here, and usage.

Since I wrote this code, I'm fine with this code going into imgui, but not sure if we actually want to do that because TTF is a pretty complicated format and there is a high chance of buffer overflow happening if I didn't check boundaries properly. Do you want to add GPOS parsing code?

Apr 13 '23 16:04 Soreepeong

I would like to. It might fix my scaling problem and the fonts I'm using are from google and they all have GPOS kern data.

Apr 13 '23 16:04 voidware

imgui imgui copied to clipboard

Feature: Kerning

EDIT: Decisions to make

Obtaining kerning information

Tests

Render time taken per frame (using FallbackChar+IndexedHotData as hot field)

~~Render time taken per frame (using FallbackHotData as hot field)~~

ExtraForMaxOccupyWidth

Without ExtraForMaxOccupyWidth

With ExtraForMaxOccupyWidth

imgui
imgui copied to clipboard

Render time taken per frame (using FallbackHotData as hot field)

`ExtraForMaxOccupyWidth`

Without `ExtraForMaxOccupyWidth`

With `ExtraForMaxOccupyWidth`