
Event loop stalls even with an idle bot

Open mik111111 opened this issue 4 months ago • 7 comments

Summary

The event loop stalls and results in "heartbeat blocked" warnings even with an entirely idle bot

Reproduction Steps

  1. Make a bot that does literally anything (including nothing at all)
  2. Run it
  3. See warnings like:
Shard ID None heartbeat blocked for more than 10 seconds.
Loop thread traceback (most recent call last):
  File "/tmp/test.py", line 13, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
  File "/usr/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/usr/lib/python3.13/asyncio/base_events.py", line 712, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.13/asyncio/base_events.py", line 683, in run_forever
    self._run_once()
  File "/usr/lib/python3.13/asyncio/base_events.py", line 2012, in _run_once
    event_list = self._selector.select(timeout)
  File "/usr/lib/python3.13/selectors.py", line 452, in select
    fd_event_list = self._selector.poll(timeout, max_ev)

Minimal Reproducible Code

import discord, asyncio, logging

logging.basicConfig(
        format="%(asctime)s | %(levelname)s | %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
        level=logging.INFO)

bot = discord.Client()

async def main():
    await bot.start("TOKEN")

asyncio.run(main())

Expected Results

The bot runs correctly and without stall warnings

Actual Results

After about 30 seconds, the bot starts logging stall warnings roughly every 10 seconds.

Intents

None (it doesn't matter)

System Information

  • Python v3.13.7-final
  • py-cord v2.6.1-final
  • aiohttp v3.13.0
  • system info: Linux 6.17.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 12 Oct 2025 12:45:18 +0000

Checklist

  • [x] I have searched the open issues for duplicates.
  • [x] I have shown the entire traceback, if possible.
  • [x] I have removed my token from display, if visible.

Additional Context

The issue seems to be specific to running the bot using discord.Client.start() in a preexisting event loop. When running using discord.Client.run() it works as expected.
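To illustrate how such a stall could arise, here is a minimal sketch that uses only the standard library. It assumes (the thread does not confirm this) that the client captures an event loop when it is constructed and that a watchdog thread submits work to that stored loop. LoopHolder, ping and submit_from_thread are hypothetical names, not py-cord APIs.

```python
import asyncio
import concurrent.futures


class LoopHolder:
    """Hypothetical stand-in for an object that stores a loop when constructed."""

    def __init__(self):
        # Assumption: the loop is obtained at construction time,
        # independently of the loop that asyncio.run() will create.
        self.loop = asyncio.new_event_loop()


async def ping():
    return "pong"


def submit_from_thread(loop, timeout=0.5):
    """Mimic a watchdog thread: hand ping() to `loop`, wait up to `timeout` seconds."""
    coro = ping()
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        future.cancel()
        coro.close()  # the loop never picked it up; avoid a "never awaited" warning
        return "blocked"


async def main():
    holder = LoopHolder()
    same_loop = holder.loop is asyncio.get_running_loop()
    # Run the watchdog in a worker thread so the running loop stays idle and free.
    outcome = await asyncio.to_thread(submit_from_thread, holder.loop)
    holder.loop.close()
    return same_loop, outcome


result = asyncio.run(main())
print(result)  # (False, 'blocked')
```

The running loop is idle the whole time, yet the submission times out because it was queued on a loop that nothing is running. That would match a traceback showing the main thread simply waiting in select().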

mik111111 avatar Oct 17 '25 20:10 mik111111

Can repro

Test Code
import discord, asyncio, logging
from os import getenv
from dotenv import load_dotenv

load_dotenv()

logging.basicConfig(
        format="%(asctime)s | %(levelname)s | %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
        level=logging.INFO)

bot = discord.Client()

async def main():
    await bot.start(getenv("TOKEN_2"))

asyncio.run(main())
2025-10-18 14:20:19 | WARNING | PyNaCl is not installed, voice will NOT be supported
2025-10-18 14:20:19 | INFO | logging in using static token
2025-10-18 14:20:20 | INFO | Shard ID None has sent the IDENTIFY payload.
2025-10-18 14:20:21 | INFO | Shard ID None has connected to Gateway: ["gateway-prd-arm-us-east1-b-rgpb",{"micros":112440,"calls":["id_created",{"micros":652,"calls":[]},"session_lookup_time",{"micros":4112,"calls":[]},"session_lookup_finished",{"micros":8,"calls":[]},"discord-sessions-prd-2-37",{"micros":107246,"calls":["start_session",{"micros":96760,"calls":["discord-api-rpc-7447f9b866-prncs",{"micros":50727,"calls":["get_user",{"micros":5730},"get_guilds",{"micros":3739},"send_scheduled_deletion_message",{"micros":13},"guild_join_requests",{"micros":2150},"authorized_ip_coro",{"micros":11},"pending_payments",{"micros":37167},"apex_experiments",{"micros":303},"user_activities",{"micros":5},"played_application_ids",{"micros":3},"linked_users",{"micros":10}]}]},"starting_guild_connect",{"micros":39,"calls":[]},"presence_started",{"micros":8994,"calls":[]},"guilds_started",{"micros":84,"calls":[]},"lobbies_started",{"micros":1,"calls":[]},"guilds_connect",{"micros":2,"calls":[]},"presence_connect",{"micros":1338,"calls":[]},"connect_finished",{"micros":1344,"calls":[]},"build_ready",{"micros":11,"calls":[]},"clean_ready",{"micros":1,"calls":[]},"optimize_ready",{"micros":0,"calls":[]},"split_ready",{"micros":0,"calls":[]}]}]}] (Session ID: c75c1aed94aebfcd0e431a57d6fac079).
2025-10-18 14:21:10 | WARNING | Can't keep up, shard ID None websocket is 49.8s behind.
2025-10-18 14:21:12 | WARNING | Shard ID None heartbeat blocked for more than 10 seconds.
Loop thread traceback (most recent call last):
  File "C:\Users\jerem\Documents\pycord\pycord\test\test.py", line 17, in <module>
    asyncio.run(main())
  File "C:\Users\jerem\AppData\Roaming\uv\python\cpython-3.13.7-windows-x86_64-none\Lib\asyncio\runners.py", line 195, in run
    return runner.run(main)
  File "C:\Users\jerem\AppData\Roaming\uv\python\cpython-3.13.7-windows-x86_64-none\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "C:\Users\jerem\AppData\Roaming\uv\python\cpython-3.13.7-windows-x86_64-none\Lib\asyncio\base_events.py", line 712, in run_until_complete
    self.run_forever()
  File "C:\Users\jerem\AppData\Roaming\uv\python\cpython-3.13.7-windows-x86_64-none\Lib\asyncio\base_events.py", line 683, in run_forever
    self._run_once()
  File "C:\Users\jerem\AppData\Roaming\uv\python\cpython-3.13.7-windows-x86_64-none\Lib\asyncio\base_events.py", line 2012, in _run_once
    event_list = self._selector.select(timeout)
  File "C:\Users\jerem\AppData\Roaming\uv\python\cpython-3.13.7-windows-x86_64-none\Lib\asyncio\windows_events.py", line 446, in select
    self._poll(timeout)
  File "C:\Users\jerem\AppData\Roaming\uv\python\cpython-3.13.7-windows-x86_64-none\Lib\asyncio\windows_events.py", line 775, in _poll
    status = _overlapped.GetQueuedCompletionStatus(self._iocp, ms)

2025-10-18 14:21:22 | WARNING | Shard ID None heartbeat blocked for more than 20 seconds.

System:

  • Python v3.13.7-final
  • py-cord v2.6.1-final OR
  • py-cord v2.7.None-candidate
    • py-cord importlib.metadata: v2.7.0rc2.dev49+gacbe07bdb
  • aiohttp v3.13.1
  • system info: Windows 11 10.0.26200

Note: for some reason, this doesn't reproduce when run under the Python debugger.

Paillat-dev avatar Oct 18 '25 12:10 Paillat-dev

Hello! Thank you for your issue. This is currently fixed by #2645. In the meantime, I found that a workaround is to add bot.loop = get_running_loop() as follows, so that the bot uses the correct loop:

from asyncio import get_running_loop

# ...
async def main():
    bot.loop = get_running_loop()
    await bot.start(getenv("TOKEN_2"))

asyncio.run(main())

Please let me know if this workaround works for you
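The effect of that rebinding can be sketched without discord at all. The example below assumes a hypothetical LoopHolder object that captured a private loop at construction; once its loop attribute points at the running loop, a thread-safe submission completes. None of these names are py-cord APIs.

```python
import asyncio


class LoopHolder:
    """Hypothetical stand-in for a client object exposing a `loop` attribute."""

    def __init__(self):
        # Assumption: a private loop is captured at construction time.
        self.loop = asyncio.new_event_loop()


async def ping():
    return "pong"


def submit_from_thread(loop):
    # Thread-safe hand-off, the way a background thread would talk to the loop.
    return asyncio.run_coroutine_threadsafe(ping(), loop).result(timeout=5)


async def main(holder):
    stale = holder.loop
    holder.loop = asyncio.get_running_loop()  # same rebinding as the workaround
    stale.close()  # tidy-up for this sketch only
    return await asyncio.to_thread(submit_from_thread, holder.loop)


holder = LoopHolder()
reply = asyncio.run(main(holder))
print(reply)  # pong
```

The rebinding happens inside the coroutine because asyncio.get_running_loop() raises RuntimeError when no loop is running.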

Paillat-dev avatar Oct 18 '25 12:10 Paillat-dev

> Please let me know if this workaround works for you

Seems to work fine with this. Thanks

mik111111 avatar Oct 18 '25 20:10 mik111111

After some more testing I have run into an issue with this workaround. The typing context manager can break when processing takes a while: the bot stops typing. After terminating the program I sometimes get:

task: <Task cancelling name='Task-8' coro=<Typing.do_typing() running at /usr/lib/python3.13/site-packages/discord/context_managers.py:54> cb=[_typing_done_callback() at /usr/lib/python3.13/site-packages/discord/context_managers.py:41]>
<sys>:0: RuntimeWarning: coroutine 'Typing.do_typing' was never awaited

(this can happen even long after the context should have been closed)

This issue also appears to only happen when running Client.start()

mik111111 avatar Oct 18 '25 21:10 mik111111

I am going to take a look.

Paillat-dev avatar Oct 19 '25 11:10 Paillat-dev

@mik111111 Sorry for the late reply. Is this issue still happening with the workaround? If yes, could you share a minimal code snippet to reproduce it? Thanks!

Paillat-dev avatar Nov 27 '25 17:11 Paillat-dev

Yes, it still happens with the workaround. Some minimal code that causes it:

import discord, asyncio, logging

logging.basicConfig(
        format="%(asctime)s | %(levelname)s | %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
        level=logging.INFO)

bot = discord.Client()

@bot.event
async def on_message(message):
    async with message.channel.typing():
        for i in range(1, 30):
            print(i)
            await asyncio.sleep(1)

async def main():
    bot.loop = asyncio.get_running_loop()
    await bot.start('TOKEN')


asyncio.run(main())

The bot stops typing after around 10 seconds. I can't reproduce the specific error message I shared previously anymore, but the issue still happens. Once again, this seems to only happen when using bot.start() instead of bot.run().
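One possible explanation, which nobody in the thread has confirmed, is that some internal object copied the loop reference before bot.loop was rebound, so later helpers still schedule work on the old loop. The sketch below shows only the general Python behaviour behind that hypothesis. Holder and State are hypothetical classes and do not reflect py-cord's actual layout.

```python
import asyncio


class State:
    """Hypothetical internal object that copies the loop reference once."""

    def __init__(self, loop):
        self.loop = loop


class Holder:
    """Hypothetical client-like object; not py-cord's real class structure."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self.state = State(self.loop)  # snapshot of the loop taken here


async def main(holder):
    holder.loop = asyncio.get_running_loop()  # rebinds the outer attribute only
    snapshot = holder.state.loop
    return (
        holder.loop.is_running(),  # the rebound attribute is the live loop
        snapshot is holder.loop,   # the earlier copy was not updated
        snapshot.is_running(),     # anything scheduled on it would never run
    )


holder = Holder()
report = asyncio.run(main(holder))
holder.state.loop.close()  # tidy-up for this sketch only
print(report)  # (True, False, False)
```

If something like this applies, a background task created on the copied loop would never advance, which would be consistent with the typing indicator lapsing. Checking which loop the typing task is bound to would confirm or rule this out.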

mik111111 avatar Nov 27 '25 21:11 mik111111