
After Upgrading to 4.5.4 we're getting `Task was destroyed but it is pending!`

Open rohanerw opened this issue 2 years ago • 14 comments

Version: redis-py 4.5.4, Python 3.10.10

Description: We recently bumped redis-py to its latest version, 4.5.4. After that, we started seeing these exceptions on Sentry: `Task was destroyed but it is pending!`

There's nothing much in the stack trace, just this text:

Task was destroyed but it is pending!
task: <Task pending name='Task-634' coro=<Connection.disconnect() done, defined at /envs/django-proj/bin/python3.10/site-packages/redis/asyncio/connection.py:727> wait_for=<Future pending cb=[Task.task_wakeup()]>>

This is happening with tasks that use Redis somewhere in between; to me it appears that there's some leak here. Somehow I'm not able to reproduce this locally, otherwise I would have provided steps to reproduce.

Any help is appreciated, and big thanks for building this cool library.

rohanerw avatar May 07 '23 10:05 rohanerw

@dvora-h As you're getting 4.5.5 ready, I wonder if some of @kristjanvalur's recent contributions address this!

chayim avatar May 07 '23 10:05 chayim

It looks like a task created specifically to disconnect. From what I can tell, this happens only in cluster code. Are you using a redis cluster and asyncio.RedisCluster?

kristjanvalur avatar May 07 '23 14:05 kristjanvalur

@kristjanvalur Yes, we're using a redis cluster via sentinel.

And no, we're not using asyncio.RedisCluster.

rohanerw avatar May 07 '23 17:05 rohanerw

> And no, we're not using asyncio.RedisCluster.

Sorry, let me rephrase the question: What mechanism provided by redis.asyncio are you using to communicate with your cluster? Which redis classes do you use?

kristjanvalur avatar May 07 '23 21:05 kristjanvalur

> What mechanism provided by redis.asyncio are you using to communicate with your cluster?

We're not using anything related to redis.asyncio to communicate with the cluster.

> Which redis classes do you use?

redis.sentinel.Sentinel

Here's a simple example:

def get_redis(read_only: bool):
    from redis.sentinel import Sentinel

    sentinel = Sentinel([(URL, PORT)])
    if read_only:
        # Read-only traffic goes to a replica.
        return sentinel.slave_for(...)
    return sentinel.master_for(...)
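For illustration, here's roughly how those calls are typically parameterized; the service name and timeout below are placeholders, not our real values:

from redis.sentinel import Sentinel

# Placeholder values for illustration only.
sentinel = Sentinel([("sentinel-host", 26379)], socket_timeout=0.5)

# master_for/slave_for take the Sentinel service name and return
# regular redis.Redis clients bound to the current master/replica.
master = sentinel.master_for("mymaster", socket_timeout=0.5)
replica = sentinel.slave_for("mymaster", socket_timeout=0.5)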

@kristjanvalur I hope this clarifies my previous comment.

rohanerw avatar May 08 '23 04:05 rohanerw

Thanks. In that case, I can't see how that error is related to your activity. It is definitely from the async part of redis, and appears to stem from the RedisCluster class. Is it possible that you are using packages/middleware which do use async redis, and which are reporting this? Note that these are not "exceptions"; they are warnings which the Sentry integration is catching. Perhaps there is more information in Sentry offering clues?
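If it helps, here is a sketch (independent of your setup) for coaxing more detail out of asyncio locally; with debug mode enabled, the message includes a traceback showing where the leaked task was created:

import asyncio
import logging

# The "Task was destroyed but it is pending!" message is emitted
# through the "asyncio" logger at ERROR level, which is why Sentry's
# logging integration reports it alongside real exceptions.
logging.getLogger("asyncio").setLevel(logging.DEBUG)

# With debug mode on (or PYTHONASYNCIODEBUG=1 in the environment),
# the message also includes the source traceback of where the
# pending task was created.
loop = asyncio.new_event_loop()
loop.set_debug(True)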

kristjanvalur avatar May 08 '23 09:05 kristjanvalur

> Note that these are not "exceptions"; they are warnings which the Sentry integration is catching.

But these appear as exceptions and are being logged by the asyncio logger. If they appeared as warnings, I would not have cared much about them.

> Is it possible that you are using packages/middleware which do use async redis?

I'm using channels_redis, and it uses redis.asyncio.
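For reference, a minimal sketch of a typical channels_redis configuration (Django settings; host and port are placeholders, not our real setup):

# channels_redis opens redis.asyncio connections under the hood.
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "channels_redis.core.RedisChannelLayer",
        "CONFIG": {
            "hosts": [("redis-host", 6379)],  # placeholder address
        },
    },
}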

> Perhaps there is more information in Sentry offering clues?

Sadly no, it's just giving me the text I provided in the description.

rohanerw avatar May 08 '23 10:05 rohanerw

I see. Well, quickly browsing that repo shows that it is using the regular redis.asyncio.Redis class. In that case, I suspect this problem will disappear, as @chayim suggests, with the recently merged PR #2695.

kristjanvalur avatar May 08 '23 11:05 kristjanvalur

And... 4.5.5 was just released...

chayim avatar May 08 '23 13:05 chayim

@kristjanvalur The issue is still happening with 4.5.5. 😞

rohanerw avatar May 08 '23 15:05 rohanerw

Ok, I suspect the problem is this code from redis.asyncio.Connection:

def __del__(self):
    try:
        if self.is_connected:
            loop = asyncio.get_running_loop()
            coro = self.disconnect()
            if loop.is_running():
                loop.create_task(coro)
            else:
                loop.run_until_complete(coro)
    except Exception:
        pass

Now, this code is sort of legit, although it is not common to call socket.close() from a destructor. Usually it means that some other code wasn't run to completion (and I'd personally just leave it up to the underlying destructors to deal with it).

Regardless: a new task is created to disconnect, and shortly thereafter the event loop is stopped without waiting for it to drain, something which asyncio.run() will normally do.

This normally happens at application shutdown. I suspect there is something in the way you run (and terminate) your application which causes existing redis connections to be garbage collected at a time when the loop is being shut down, and the loop shutdown code does not wait for all existing tasks to finish.
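Here is a minimal, redis-free sketch of the mechanism I'm describing; closing a loop while a fire-and-forget task is still pending produces exactly this message:

import asyncio

async def slow_cleanup():
    await asyncio.sleep(10)

async def main():
    # Fire-and-forget: main() returns while this task is still pending,
    # analogous to the disconnect task spawned from Connection.__del__.
    asyncio.get_running_loop().create_task(slow_cleanup())

loop = asyncio.new_event_loop()
loop.run_until_complete(main())
loop.close()
# When the pending task is garbage collected, asyncio logs
# "Task was destroyed but it is pending!" just like in the report.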

kristjanvalur avatar May 08 '23 16:05 kristjanvalur

I guess a much simpler `self.disconnect(nowait=True)` would be in order here.
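Something along these lines; a sketch of the idea only, not an actual patch, and `_writer` is an assumption about the Connection internals:

def __del__(self):
    try:
        if self.is_connected:
            # Tear the transport down synchronously instead of
            # scheduling a disconnect() coroutine that may never run.
            # `_writer` is assumed to be the connection's StreamWriter.
            self._writer.transport.abort()
    except Exception:
        pass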

kristjanvalur avatar May 08 '23 16:05 kristjanvalur

If you can control your setup, you can try the change in this PR: https://github.com/redis/redis-py/pull/2755

kristjanvalur avatar May 08 '23 17:05 kristjanvalur

Can someone confirm that this is still relevant? I think it was fixed already.

dvora-h avatar Feb 11 '24 01:02 dvora-h

It looks like this issue has been resolved, so I’ll close it for now. Feel free to reopen it if you need further assistance.

petyaslavova avatar Feb 04 '25 11:02 petyaslavova