Telethon
Telethon messing up text without any content changes
Checklist
- [x] The error is in the library's code, and not in my own.
- [x] I have searched for this issue before posting it and there isn't a duplicate.
- [x] I ran pip install -U https://github.com/LonamiWebs/Telethon/archive/master.zip and triggered the bug in the latest version.
Code that causes the issue
from telethon import TelegramClient, events
import asyncio

api_id = yyy
api_hash = "xxx"
client = TelegramClient("test", api_id, api_hash)

## Sending this message to source triggers the bug
"""
[Text within brackets]
š„(http://google.com/) Header Title
"""

async def received_message(event):
    ## To send the message
    channel_2 = await client.get_entity(xxx)
    event.message.text = event.message.text  ## Needed to trigger the bug
    event.message.entities = []
    await client.send_message(channel_2, event.message)

async def main():
    await client.get_dialogs()
    ## First chat, where we receive a message so we can trigger the bug
    channel_1 = await client.get_entity(yyy)
    ## Setting up the event
    client.add_event_handler(
        received_message, events.NewMessage(chats=[channel_1])
    )
    await asyncio.sleep(5000)

with client:
    client.loop.run_until_complete(main())
What happens? Telethon messes up the "[Text within brackets]" text and outputs "Text within brackets][". This happens when the brackets come before an emoji with a link.
What should happen? Telethon should parse brackets correctly when they appear close to an emoji with a link.
I cannot reproduce the bug with the text you posted.
@Lonami Did you convert the link URL to an actual link on Telegram when sending it? The bug is only triggered if it's an actual link, not just text.
You need to send the message like this: [Text within brackets] š„Header Title
Edit: Looking a bit deeper into this, it seems that the message entities somehow get messed up even with just an assignment. In the message above, the message entities go from MessageEntityTextUrl(offset=25, length=12, url='https://google.com/') to MessageEntityTextUrl(offset=0, length=37, url='https://google.com/')
The following code:
from telethon.extensions import markdown
m = '''[Text within brackets]
š„[Header Title](http://google.com/)'''
print(len(m), repr(m))
t, e = markdown.parse(m)
print(len(t), repr(t), *e)
m = markdown.unparse(t, e)
print(len(m), repr(m))
Produces the following output:
58 '[Text within brackets]\nš„[Header Title](http://google.com/)'
36 'Text within brackets]\nš„[Header Title' MessageEntityTextUrl(offset=0, length=37, url='http://google.com/')
58 '[Text within brackets]\nš„[Header Title](http://google.com/)'
This is probably caused by the markdown parser being too greedy (so it thinks the link starts at [Text and ends at Title], after which the URL starts). This will be fixed when the library switches to commonmark.
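To illustrate the kind of over-matching described above, here is a minimal sketch using plain regular expressions. Both patterns (`PERMISSIVE` and `STRICT`) and the helper `first_link` are made up for this example and are not taken from Telethon's parser. The emoji is left out of the sample text to keep the example ASCII-only.

```python
import re

# Sample text modelled on the report, without the emoji.
TEXT = "[Text within brackets]\n[Header Title](http://google.com/)"

# Hypothetical permissive pattern: the link text may contain any character,
# including brackets and newlines, so the match can begin at the first "[".
PERMISSIVE = re.compile(r"\[(.+?)\]\((.+?)\)", re.DOTALL)

# Hypothetical stricter pattern: the link text may not contain brackets,
# so the unrelated "[Text within brackets]" cannot be absorbed into the link.
STRICT = re.compile(r"\[([^\[\]]+)\]\(([^)]+)\)")


def first_link(pattern, text):
    """Return (start offset, link text, url) of the first match, or None."""
    match = pattern.search(text)
    if match is None:
        return None
    return match.start(), match.group(1), match.group(2)


permissive_result = first_link(PERMISSIVE, TEXT)
strict_result = first_link(STRICT, TEXT)

print(permissive_result)
print(strict_result)
```

With the permissive pattern the link text starts at offset 0 and swallows the first bracketed phrase, which resembles the offset=0 entity seen in the output above. The stricter pattern only matches the real link on the second line.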