emoji.wpf

Emoji processing is very slow

Open GLEB-M opened this issue 1 year ago • 11 comments

FlowDocumentExtensions.cs, SubstituteGlyphs method

        while (cur.CompareTo(range_end) < 0)
        {
            TextPointer next = cur.GetNextInsertionPosition(LogicalDirection.Forward);
            if (next == null)
                break;

            string replace_text = null;
            var replace_range = new TextRange(cur, next);
            if (replace_range.Text.Length > 0 && EmojiData.MatchStart.Contains(replace_range.Text[0]))  

This code is very slow! With as few as 100 characters of text, there is noticeable lag.

replace_range.Text: fetching the text of a range for every single character is not a fast operation for a rich edit control. Could this be optimized?

GLEB-M avatar Nov 25 '22 11:11 GLEB-M

I tried improving the substitution logic in febfcdde00c5aab8b3b782e83a8d8829f263b6c2. Can you maybe give it a try?

samhocevar avatar Jan 03 '23 15:01 samhocevar

Would it make sense to first check the entire string for a known pattern (if one exists) and only when found, then loop through the string?
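A minimal sketch of that pre-check idea, not code from the repository. It assumes EmojiData.MatchStart can be treated as a collection of chars (the original snippet only shows that it supports Contains on a char); SampleMatchStart and MightContainEmoji are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EmojiPrecheck
{
    // Hypothetical stand-in for EmojiData.MatchStart: a tiny set of UTF-16
    // code units that can begin an emoji sequence. The real set is larger.
    public static readonly HashSet<char> SampleMatchStart = new HashSet<char>
    {
        '\uD83D', // high surrogate shared by many emoji, e.g. U+1F600
        '\u2764', // HEAVY BLACK HEART
    };

    // Scans the plain string once and stops at the first candidate character.
    // A "true" result only means the slower per-position loop is worth running.
    public static bool MightContainEmoji(string text, ICollection<char> matchStart)
    {
        if (string.IsNullOrEmpty(text))
            return false;

        foreach (char c in text)
        {
            if (matchStart.Contains(c))
                return true;
        }
        return false;
    }
}
```

The caller would read range.Text once, and only walk the TextPointer positions when MightContainEmoji returns true. Text without any candidate character then costs a single string scan.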

mike-ward avatar Jan 03 '23 19:01 mike-ward

Another possible option would be to store a hash of strings encountered that have no emoji sequences.
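One way this could look, as a rough sketch under stated assumptions: EmojiFreeCache is a hypothetical class, the detector delegate stands in for whatever emoji check the library uses, and the eviction policy (clear everything when full) is deliberately crude.

```csharp
using System;
using System.Collections.Generic;

public sealed class EmojiFreeCache
{
    private readonly HashSet<string> seen = new HashSet<string>(StringComparer.Ordinal);
    private readonly Func<string, bool> containsEmoji;
    private readonly int capacity;

    // Number of times the (potentially expensive) detector actually ran.
    public int ScanCount { get; private set; }

    public EmojiFreeCache(Func<string, bool> containsEmoji, int capacity = 1024)
    {
        if (containsEmoji == null)
            throw new ArgumentNullException(nameof(containsEmoji));
        this.containsEmoji = containsEmoji;
        this.capacity = capacity;
    }

    // Returns false immediately for strings already known to be emoji-free.
    public bool NeedsSubstitution(string text)
    {
        if (seen.Contains(text))
            return false;

        ScanCount++;
        if (containsEmoji(text))
            return true;

        // Crude bound on memory use: start over once the set is full.
        if (seen.Count >= capacity)
            seen.Clear();
        seen.Add(text);
        return false;
    }
}
```

This helps most when the same strings are processed repeatedly, for example when displaying unchanged messages. In an editor, the text changes on every keystroke, so the hit rate is likely to be much lower.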

mike-ward avatar Jan 03 '23 19:01 mike-ward

I tried improving the substitution logic in febfcdd. Can you maybe give it a try?

Initially, I thought the problem was in rendering, but after debugging I realized it was not. Iterating through the rich edit control's characters and getting a range for each one is very slow! I spent some time thinking this could be fixed simply, but then I realized that it can't: the logic for replacing text with emoji needs to change radically. My own attempts were not successful because of the nuances of emoji processing.

GLEB-M avatar Jan 04 '23 07:01 GLEB-M

Yes, TextRange is really too high level for what I am doing. I did not realise the approach would be so slow. But I believe several optimisations are possible:

  • perform asynchronous Emoji replacements (won’t help with overall CPU usage, but will improve responsiveness)
  • do not call SubstituteGlyphsInRange on text runs that have already been processed
  • read more than one character at a time when scanning for Emoji in a Run or FlowDocument.
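The second point could be sketched as follows. This is an illustration only, not the library's implementation: ProcessedRunTracker and ShouldProcess are hypothetical names, and the run is typed as object so the sketch compiles without WPF (in practice it would be a Run or another TextElement).

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class ProcessedRunTracker
{
    // Weak keys: an entry disappears once its run is garbage collected,
    // so the tracker does not keep removed document content alive.
    private readonly ConditionalWeakTable<object, string> lastText =
        new ConditionalWeakTable<object, string>();

    // Returns true if the run is new or its text changed since the last call,
    // and records the current text as processed.
    public bool ShouldProcess(object run, string currentText)
    {
        string previous;
        if (lastText.TryGetValue(run, out previous) && previous == currentText)
            return false;

        lastText.Remove(run); // no-op if the run was never recorded
        lastText.Add(run, currentText);
        return true;
    }
}
```

With such a tracker, the substitution pass would only be called for runs where ShouldProcess returns true, so typing in one paragraph would not trigger a rescan of every other run in the document.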

samhocevar avatar Jan 04 '23 21:01 samhocevar

Hi, it's not perfect yet, but I modified the SubstituteGlyphsInRange method so that it no longer iterates through the entire string of characters. It only processes the emojis.

private static readonly Regex emojiRegex = new Regex(EmojiData.MatchOne.ToString(), RegexOptions.None);

internal static void SubstituteGlyphsInRange(TextRange range, double default_font_size, Brush default_foreground, DependencyObject parent, SubstituteOptions options)
{
	// Get the parent RichTextBox
	RichTextBox rtb = parent as RichTextBox;

	// Check if ColonSyntax option is enabled
	var colon_syntax = (options & SubstituteOptions.ColonSyntax) != 0;

	// Check if ColorBlend option is enabled
	var color_blend = (options & SubstituteOptions.ColorBlend) != 0;

	// Get the caret position in the RichTextBox
	TextPointer caret = rtb?.CaretPosition;

	// Get the text within the specified text range
	string text = range.Text;

	// Search for emoji matches in the text using the emojiRegex regular expression
	var matches = emojiRegex.Matches(text);

	// Iterate over each match found
	foreach (Match match in matches)
	{
		// Get the start and end position of the match
		var start = range.Start.GetPositionAtOffset(match.Index);
		var end = start.GetPositionAtOffset(match.Length);

		// Create a text range to replace with the emoji
		var replace_range = new TextRange(start, end);
		
		// Get the text of the emoji
		var replace_text = match.Value;

		// Check if the caret is after the start of the range to be replaced
		bool caret_was_next = caret != null && start.CompareTo(caret) < 0 && end.CompareTo(caret) >= 0;

		// Get the font size and foreground color of the text range to be replaced
		var font_size = replace_range.GetPropertyValue(TextElement.FontSizeProperty);
		var foreground = replace_range.GetPropertyValue(TextElement.ForegroundProperty);

		// Replace the text range with an EmojiInline
		replace_range.Text = "";
		Inline inline = new EmojiInline(start)
		{
			FontSize = (double)(font_size ?? default_font_size),
			Foreground = color_blend ? (Brush)(foreground ?? default_foreground) : Brushes.Black,
			Text = replace_text,
		};

		// If the caret was after the start of the range to be replaced, update its position after the emoji insertion
		if (caret_was_next)
			caret = inline.ContentEnd;
	}

	// If the parent RichTextBox is not null, update the caret position
	if (rtb != null)
		rtb.CaretPosition = caret;
}

The first search for an emoji is a bit slow. But after that, regardless of the size of the text, it remains fast.

Swindler95 avatar Feb 14 '23 14:02 Swindler95

Hi, Swindler95

var matches = emojiRegex.Matches(text); // where is object emojiRegex?

GLEB-M avatar Feb 14 '23 17:02 GLEB-M

Hi, @GLEB-M

Sorry, I forgot to add the variable: private static readonly Regex emojiRegex = new Regex(EmojiData.MatchOne.ToString(), RegexOptions.None);

Swindler95 avatar Feb 15 '23 09:02 Swindler95

Swindler95, I tried to use your code, but it doesn't work properly.

[image]

GLEB-M avatar Feb 15 '23 13:02 GLEB-M

@GLEB-M

I said it wasn't perfect 😄. In fact, I only use this new method for input, because I need to be able to type a large amount of text containing few emojis, and it works very well for that. With the old SubstituteGlyphsInRange, processing time grows with the amount of text until the control becomes really unusable. For display, however, I've developed a new class called "RichTextBlock" that uses the old SubstituteGlyphsInRange, which is more reliable at rendering all emojis correctly.

I'll continue working on improving this method, but if anyone else wants to improve it, I'm open to it. 😉

Swindler95 avatar Feb 15 '23 15:02 Swindler95

I did some profiling on this project and I see that SubstituteGlyphsInRange is called a huge number of times. One possible improvement: render a line, cache it, and re-render everything only when the container is resized horizontally (vertically for top-down languages?). I'm not sure what the HLSL NuGet package is doing, but if we could store those cached bitmaps in graphics card memory, that would be even faster and more memory efficient.

TrabacchinLuigi avatar Jun 26 '23 23:06 TrabacchinLuigi