Telethon messing up text without any content changes

Open porocode11 opened this issue 3 years ago • 3 comments

Checklist

  • [x] The error is in the library's code, and not in my own.
  • [x] I have searched for this issue before posting it and there isn't a duplicate.
  • [x] I ran pip install -U https://github.com/LonamiWebs/Telethon/archive/master.zip and triggered the bug in the latest version.

Code that causes the issue

from telethon import TelegramClient, events

import asyncio

api_id = yyy
api_hash = "xxx"

client = TelegramClient("test", api_id, api_hash)

## Sending this message to source triggers the bug
"""
[Text within brackets]
💄(http://google.com/) Header Title
"""

async def received_message(event):
    ## To send the message
    channel_2 = await client.get_entity(xxx)

    event.message.text = event.message.text ## Needed to trigger the bug


    event.message.entities = []

    await client.send_message(channel_2, event.message)

async def main():

    await client.get_dialogs()

    ## First chat, where we receive a message so we can trigger the bug
    channel_1 = await client.get_entity(yyy)

    ## Setting up the event
    client.add_event_handler(
        received_message, events.NewMessage(chats=[channel_1])
    )

    await asyncio.sleep(5000)


with client:
    client.loop.run_until_complete(main())

What happens? Telethon messes up the "[Text within brackets]" text and outputs "Text within brackets][". This happens when the bracketed text comes right before an emoji with a link.

What should happen? Telethon should parse the brackets correctly when they are close to an emoji with a link.

porocode11, Jul 12 '21 16:07

I cannot reproduce the bug with the text you posted.

Lonami, Aug 22 '21 11:08

@Lonami Did you convert the link URL to an actual link on Telegram when sending it? The bug is only triggered if it's an actual link, not just text.

You need to send the message like this: [Text within brackets] 💄Header Title

Edit: Looking a bit deeper into this, it seems that the message entities somehow get messed up even with just an assignment. In the message above, the message entities go from MessageEntityTextUrl(offset=25, length=12, url='https://google.com/') to MessageEntityTextUrl(offset=0, length=37, url='https://google.com/')
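
To make those two entities easier to read, here is a rough sketch that maps an offset/length pair back onto the message text. It assumes that entity offsets count UTF-16 code units (so the emoji counts as two), that the two lines of the message are separated by a single newline, and that the emoji is a single code point outside the BMP. The helpers utf16_len and utf16_slice are made up for this illustration and are not part of Telethon.

```python
def utf16_len(s: str) -> int:
    """Length of s in UTF-16 code units."""
    return len(s.encode("utf-16-le")) // 2


def utf16_slice(s: str, offset: int, length: int) -> str:
    """Substring of s addressed by a UTF-16 offset and length."""
    raw = s.encode("utf-16-le")
    return raw[offset * 2:(offset + length) * 2].decode("utf-16-le")


# Assumed message text: bracketed line, newline, emoji, linked title.
text = "[Text within brackets]\n\U0001F484Header Title"

print(utf16_len(text))             # 37
print(utf16_slice(text, 25, 12))   # Header Title
```

Under these assumptions the original entity (offset=25, length=12) covers exactly "Header Title", while the entity after the assignment (offset=0, length=37) has the same length as the whole message.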

porocode11, Sep 24 '21 14:09

The following code:

from telethon.extensions import markdown

m = '''[Text within brackets]
💄[Header Title](http://google.com/)'''
print(len(m), repr(m))
t, e = markdown.parse(m)
print(len(t), repr(t), *e)
m = markdown.unparse(t, e)
print(len(m), repr(m))

Produces the following output:

58 '[Text within brackets]\n💄[Header Title](http://google.com/)'
36 'Text within brackets]\n💄[Header Title' MessageEntityTextUrl(offset=0, length=37, url='http://google.com/')
58 '[Text within brackets]\n💄[Header Title](http://google.com/)'

This is probably caused by the markdown parser being too greedy (so it thinks the link starts at [Text and ends at Title], after which the URL starts). This will be fixed when the library switches to commonmark.
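
The over-matching can be sketched with plain regular expressions. The two patterns below are assumptions written for this illustration only: they are not taken from Telethon's source, and the stricter one is not the planned commonmark fix. PERMISSIVE lets the link label contain any character, while STRICT forbids brackets inside the label.

```python
import re

# Hypothetical patterns, for illustration only.
PERMISSIVE = re.compile(r"\[([\s\S]+?)\]\((.+?)\)")       # label may contain ']' and '['
STRICT = re.compile(r"\[([^\[\]]+)\]\(([^()\s]+)\)")      # label may not contain brackets

m = "[Text within brackets]\n\U0001F484[Header Title](http://google.com/)"

loose = PERMISSIVE.search(m)
tight = STRICT.search(m)

# The permissive match starts at the very first '[' and swallows
# everything up to the first '](' pair it can find.
print(repr(loose.group(1)))
# The strict match only picks up the real link label.
print(repr(tight.group(1)), tight.group(2))

# Label length in UTF-16 code units (the emoji counts as two).
loose_utf16_len = len(loose.group(1).encode("utf-16-le")) // 2
print(loose_utf16_len)
```

With the permissive pattern the label comes out as 'Text within brackets]\n💄[Header Title', which is 37 UTF-16 code units long and lines up with the offset=0, length=37 entity in the output above. The stricter pattern yields 'Header Title' instead.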

Lonami, Sep 24 '21 17:09