telegramify-markdown Issue with escaping when there are kaomojis in the input

When the input mixes Markdown special characters with kaomoji or other unusual Unicode characters, the library sometimes fails to escape them as Telegram's MarkdownV2 requires, so the Telegram API rejects the resulting message.

Minimal reproducible example:

import telegramify_markdown

test = telegramify_markdown.markdownify(
    "But wait now **Im interested how bout YOURSELF BUDDY????????!!! Pls ༼ﾉ◕ヮ◕༽ﾉ*:·ﾟ✧*"
)
print(test)

Outputs: But wait now __Im interested how bout YOURSELF BUDDY????????\!\!\! Pls ༼ﾉ◕ヮ◕༽ﾉ_:·ﾟ✧_

Which is rejected by Telegram: Bad Request: can't parse entities: Can't find end of Underline entity at byte offset 13
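To show where the reported offset comes from, here is a rough diagnostic sketch. It is not part of telegramify-markdown; `unescaped_positions` is a hypothetical helper, and the backslash check is a simple heuristic that does not handle an escaped backslash directly before a marker. It lists the UTF-8 byte offsets of `__` markers that are not escaped. An odd count suggests an underline entity that is opened but never closed.

```python
import re

# Output string copied from the report above (raw string keeps the backslashes).
converted = r"But wait now __Im interested how bout YOURSELF BUDDY????????\!\!\! Pls ༼ﾉ◕ヮ◕༽ﾉ_:·ﾟ✧_"


def unescaped_positions(text: str, marker: str) -> list:
    """Return UTF-8 byte offsets of `marker` occurrences not preceded by a backslash.

    Hypothetical helper for illustration only; heuristic, not a MarkdownV2 parser.
    """
    pattern = r"(?<!\\)" + re.escape(marker)
    return [len(text[: m.start()].encode("utf-8")) for m in re.finditer(pattern, text)]


underline_marks = unescaped_positions(converted, "__")
print(underline_marks)                # [13]
print(len(underline_marks) % 2 == 1)  # True: one opener, no matching closer
```

The single `__` sits at byte offset 13, which matches the offset in Telegram's error message.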

This can be triggered by LLM output generated at a high temperature. It happens with both markdownify and telegramify.
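Until this is fixed, one possible workaround is to fall back to fully escaped plain text when Telegram rejects the converted message. The following is a minimal sketch, not the library's behaviour: `escape_markdown_v2` and `send_with_fallback` are hypothetical names, `send` stands for whatever function your bot uses to post a MarkdownV2 message, and catching a bare `Exception` is a simplification. The escaped character set is the one Telegram documents for MarkdownV2, plus the backslash itself.

```python
import re

# Characters that MarkdownV2 requires to be escaped in plain text, plus backslash.
_MDV2_SPECIALS = re.compile(r"([_*\[\]()~`>#+\-=|{}.!\\])")


def escape_markdown_v2(text: str) -> str:
    """Backslash-escape every MarkdownV2 special character (all formatting is lost)."""
    return _MDV2_SPECIALS.sub(r"\\\1", text)


def send_with_fallback(send, converted: str, raw: str) -> str:
    """Try the converted text first; on rejection, send the raw text fully escaped.

    `send` is a caller-supplied callable (an assumption for this sketch) that
    raises an exception when the Telegram API rejects the message.
    """
    try:
        send(converted)
        return converted
    except Exception:  # narrow this to your client's "bad request" error type
        plain = escape_markdown_v2(raw)
        send(plain)
        return plain
```

The fallback loses bold, italics and other formatting, but the message is delivered instead of being dropped.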

Apr 22 '25 10:04 Ziyann

This issue is being investigated.

Apr 22 '25 11:04 sudoskys

Unfortunately, this may be difficult to fix. The project has become so complex that it is effectively impossible to debug, and there is no way to bridge the small gap between the parsing library and the non-standard Markdown. We will have to rebuild the library using message entities.

Apr 22 '25 15:04 sudoskys

#55

Apr 22 '25 15:04 sudoskys