aioredis-py
ConnectionResetError: [Errno 104] Connection reset by peer
Hi, everybody!
I use tornado + aioredis, and recently I ran into this issue; the traceback is below.
My environment: aioredis==1.2.0, tornado==5.1.1.
I use `aioredis.create_redis_pool(**args)` to create the pool.
Can anybody help? Thanks a lot.
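For reference, a minimal sketch of how such a pool is typically created in aioredis 1.x (the URL and pool sizes below are placeholders, not the actual arguments used in this report):

```python
import aioredis

async def make_pool():
    # typical aioredis 1.x pool creation; the address and sizes
    # are illustrative assumptions, not the reporter's real config
    return await aioredis.create_redis_pool(
        "redis://localhost:6379", minsize=1, maxsize=10
    )
```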
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/tornado/web.py", line 1699, in _execute
    result = await result
  File "/views/notice.py", line 341, in get
    items, page_size, total_page, total_size = await Notice.cache_or_api_list(notice_id_list, page_count, page_size)
  File "models/notice.py", line 136, in cache_or_api_list
    items = await cls.query_list(page_list)
  File "models/notice.py", line 92, in query_list
    items = await asyncio.gather(*[Notice.cache_or_api(notice_id) for notice_id in notice_id_list])
  File "models/notice.py", line 37, in cache_or_api
    info = await redis.execute('get', redis_key)
  File "models/notice.py", line 37, in cache_or_api
    info = await redis.execute('get', redis_key)
  File "models/notice.py", line 37, in cache_or_api
    info = await redis.execute('get', redis_key)
  [Previous line repeated 11 more times]
  File "/usr/local/lib/python3.6/site-packages/aioredis/connection.py", line 183, in _read_data
    obj = await self._reader.readobj()
  File "/usr/local/lib/python3.6/site-packages/aioredis/stream.py", line 94, in readobj
    await self._wait_for_data('readobj')
  File "/usr/local/lib/python3.6/asyncio/streams.py", line 464, in _wait_for_data
    yield from self._waiter
  File "/usr/local/lib/python3.6/asyncio/selector_events.py", line 723, in _read_ready
    data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer
```
I had the same `ConnectionResetError`.
Can you please check out the latest master and test with that? Note that as of #891 the client has the same API as redis-py.
Getting a few thousand of these a day when using Django Channels; tested with Python 3.8.1 and Python 3.9.2, using aioredis 1.3.1 installed as a child dependency of channels.
@djstein Are you getting this in production? If you're experiencing this in development, please refer to #930 for migration needs, as the v1 major version will not get any fixes.
This error still exists in `aioredis==2.0.0`.
In production.
I had the same `ConnectionResetError`.
From my tests, when this occurred, aioredis / redis-py / asyncio-redis couldn't reconnect, but aredis could. What is the difference between them?
Hi all, the problem lies with the `ConnectionError` type. aioredis implements its own `ConnectionError(builtins.ConnectionError, RedisError)`, which causes problems because `ConnectionResetError` is no longer a subclass of `ConnectionError`. If one looks at line 37 of `client.py`, it overwrites `ConnectionError`, so that the exception can no longer be caught on line 1067...
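A minimal, self-contained sketch of that hierarchy problem (class names mirror aioredis, but this is an illustration, not its actual code):

```python
import builtins

class RedisError(Exception):
    pass

# aioredis shadows the builtin name with its own subclass:
class ConnectionError(builtins.ConnectionError, RedisError):
    pass

try:
    raise ConnectionResetError("[Errno 104] Connection reset by peer")
except ConnectionError:
    # never reached: ConnectionResetError subclasses
    # builtins.ConnectionError, not this shadowing class
    print("caught")
except builtins.ConnectionError:
    print("only the builtin catches it")  # this branch runs
```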
Any workaround found for this issue?
Same issue here in production. There's a firewall resetting connections after some time, and aioredis won't recover from this, which is a serious problem. Is there any known workaround?
Please consider this issue in #1225.
This issue can be closed; it should've been solved with #1129.
No, sorry, it doesn't solve this issue. Quick test:

- Run a Redis instance locally:

```
$ docker run --rm -d -p 6379:6379 --name redis redis
```

- Run a simple `test.py` program that continually reads something from Redis. For example (ugly, but easy to translate to `redis`):
```python
from aioredis import Redis
import asyncio, time

redis = Redis()
loop = asyncio.get_event_loop()
loop.run_until_complete(redis.set("something", "something"))
while True:
    print(loop.run_until_complete(redis.get("something")))
    time.sleep(5)
```
- Kill the connection (technically this is a connection abort, not a connection reset, but it has the same effect and unveils the same underlying issue):

```
$ sudo ss -K dst [::1] dport = redis
```
- The equivalent test program using `redis` instead of `aioredis` automagically recovers from the loss of connection and keeps going without error.
- `aioredis` 2.0.0 gives:
```
Traceback (most recent call last):
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 815, in send_packed_command
    await asyncio.wait_for(
  File "/usr/lib64/python3.9/asyncio/tasks.py", line 442, in wait_for
    return await fut
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 797, in _send_packed_command
    await self._writer.drain()
  File "/usr/lib64/python3.9/asyncio/streams.py", line 387, in drain
    await self._protocol._drain_helper()
  File "/usr/lib64/python3.9/asyncio/streams.py", line 190, in _drain_helper
    raise ConnectionResetError('Connection lost')
ConnectionResetError: Connection lost

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "***/test.py", line 9, in <module>
    print(loop.run_until_complete(redis.get("something")))
  File "/usr/lib64/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/client.py", line 1063, in execute_command
    await conn.send_command(*args)
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 840, in send_command
    await self.send_packed_command(
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 829, in send_packed_command
    raise ConnectionError(
aioredis.exceptions.ConnectionError: Error UNKNOWN while writing to socket. Connection lost.
```
- `aioredis` from master, i.e. with #1129:
```
Traceback (most recent call last):
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 762, in disconnect
    await self._writer.wait_closed()
  File "/usr/lib64/python3.9/asyncio/streams.py", line 359, in wait_closed
    await self._protocol._get_close_waiter(self)
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 815, in send_packed_command
    await asyncio.wait_for(
  File "/usr/lib64/python3.9/asyncio/tasks.py", line 442, in wait_for
    return await fut
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 797, in _send_packed_command
    await self._writer.drain()
  File "/usr/lib64/python3.9/asyncio/streams.py", line 375, in drain
    raise exc
  File "/usr/lib64/python3.9/asyncio/selector_events.py", line 856, in _read_ready__data_received
    data = self._sock.recv(self.max_size)
ConnectionAbortedError: [Errno 103] Software caused connection abort

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "***/test.py", line 9, in <module>
    print(loop.run_until_complete(redis.get("something")))
  File "/usr/lib64/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/client.py", line 1063, in execute_command
    await conn.send_command(*args)
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 840, in send_command
    await self.send_packed_command(
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 829, in send_packed_command
    raise ConnectionError(
aioredis.exceptions.ConnectionError: Error 103 while writing to socket. Software caused connection abort.
```
So we switched error messages, but the issue persists.
A workaround would be to call `redis.connection_pool.disconnect()` before performing any operation that is preceded or followed by a long pause in which a reset may happen.
@Enchufa2 thanks for running the test. What do you mean automagically? Is there code that they implement that we somehow missed or code that we did port that is not functioning properly? Does internal connection not handle this properly?
I mean that `redis` somehow figures out that the connection is broken, disposes of it, and opens a new one instead of failing. I'm not sure how `redis` does this, and therefore what's missing here, because unfortunately I'm not familiar with either codebase. But `redis`'s behaviour is what I would expect from a connection pool. Otherwise, one needs to think about whether there are connections and whether they are alive, which is contrary to the very abstraction of a connection pool, right?
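For illustration, the sync `redis` client (3.x) recovers roughly like this: on a `ConnectionError` it drops the dead socket and retries the command once on a fresh connection. A paraphrased sketch, not the actual redis-py code:

```python
# paraphrased shape of redis-py 3.x execute_command (not verbatim)
def execute_command(self, *args, **options):
    pool = self.connection_pool
    command_name = args[0]
    conn = pool.get_connection(command_name, **options)
    try:
        try:
            conn.send_command(*args)
            return self.parse_response(conn, command_name, **options)
        except (ConnectionError, TimeoutError):
            # drop the dead socket; send_command reconnects transparently
            conn.disconnect()
            conn.send_command(*args)
            return self.parse_response(conn, command_name, **options)
    finally:
        pool.release(conn)
```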
@Enchufa2 can you change this:
https://github.com/aio-libs/aioredis-py/blob/dbdd0add63f986f2ed2d56c9736303d133add23c/aioredis/connection.py#L850
to `if not self.is_connected:`? Redis is checking using `self._sock`, but we don't use `self._sock`. This could be the underlying reason, though I'm not sure how we didn't catch this early on or if a PR changed this somehow.
Nope, this doesn't help; same error. By the way, I noticed that, in my previous comment, I had swapped the errors coming from the current release and from the current master branch. Apologies; the comment is amended now.
This is interesting. It turns out I had `redis` v3.5.3 installed, and that is the version that recovers from connection aborts and resets. I just updated to v4.0.2, and it shows this same issue.
Reported in https://github.com/redis/redis-py/issues/1772
@Enchufa2 I'm unable to reproduce this error with both the redis main/master branch and `aioredis==2.0.0`: https://github.com/Andrew-Chen-Wang/aioredis-issue-778. Unlike `ss`, I'm using `CLIENT KILL` and `CLIENT KILL ADDR`, with no success. When doing `CLIENT LIST`, in both cases, a new connection is established, and this is shown via an incremented ID. I was using Gitpod, since I don't have a tool similar to `ss` on Mac for killing sockets. There may be a chance `CLIENT KILL` gave aioredis a warning beforehand, but if a `Connection reset by peer` is occurring, then I can't imagine functionality similar to `CLIENT KILL` not being performed.
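For reference, the kind of commands involved, via `redis-cli` (the address is illustrative and the `CLIENT LIST` output is truncated):

```
$ redis-cli CLIENT LIST
id=42 addr=172.17.0.1:54321 ... cmd=get
$ redis-cli CLIENT KILL ADDR 172.17.0.1:54321
```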
If I use `CLIENT KILL`, I see this with the current master branch:
```
Traceback (most recent call last):
  File "***/test.py", line 9, in <module>
    print(loop.run_until_complete(redis.get("something")))
  File "/usr/lib64/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/client.py", line 1064, in execute_command
    return await self.parse_response(conn, command_name, **options)
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/client.py", line 1080, in parse_response
    response = await connection.read_response()
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 854, in read_response
    response = await self._parser.read_response()
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 367, in read_response
    raw = await self._buffer.readline()
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 301, in readline
    await self._read_from_socket()
  File "/home/***/.local/lib/python3.9/site-packages/aioredis/connection.py", line 250, in _read_from_socket
    raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR)
aioredis.exceptions.ConnectionError: Connection closed by server.
```
We were also seeing this in production with `aioredis==2.0.0`. Our solution was to remove the connection pool and rely on single connections (although this isn't ideal).
@Andrew-Chen-Wang I see you were unable to reproduce; is there any current plan to look into this further? Or would you like someone to put together a PR with a potential fix?
@cjdsellers I still can't reproduce this issue, so a PR would be much appreciated, as we're down a maintainer now and I'm stuck with work.
I totally understand regarding the bandwidth. Maybe @Enchufa2 could chime in here with a solution strategy, as he seems to be very familiar with the issue?
Tracking https://github.com/redis/redis-py/issues/1789
@cjdsellers My bet is that a possible solution would be to catch `OSError` in https://github.com/aio-libs/aioredis-py/blob/224f843bd4b33d657770bded6f86ce33b881257c/aioredis/connection.py#L1423, but I haven't had the time to test this yet. I'm also stuck with work :(, and the workaround in https://github.com/aio-libs/aioredis-py/issues/778#issuecomment-984654896, although not ideal, works well.
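The idea, as a hedged sketch of the pool's connection health check at that location (paraphrased; the actual surrounding code may differ):

```python
# sketch: broaden the health-check except clause so that OSError
# (which covers ConnectionResetError and ConnectionAbortedError)
# replaces the dead socket instead of propagating
try:
    if await connection.can_read():
        raise ConnectionError("Connection has data")
except (ConnectionError, OSError):  # was: except ConnectionError
    await connection.disconnect()
    await connection.connect()
    if await connection.can_read():
        raise ConnectionError("Connection not ready")
```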
OK, I could put together a PR for you guys to look at. I'll see what I can do over this festive period.
@Enchufa2 In the meantime, I did try the workaround you suggested in your comment; however, maybe my implementation wasn't correct, because the issue remained. If you're successfully using this workaround, then some sort of code snippet would be much appreciated!
> OK, I could put together a PR for you guys to look at. I'll see what I can do over this festive period.
There are situations in which `aioredis` fails and `redis` does not (for instance, after a `CLIENT KILL`; see https://github.com/redis/redis-py/issues/1772). So keep in mind that `aioredis` may have additional issues not present in `redis`. In this regard, I would try a fix in `redis` in the first place, and then port it here.
> @Enchufa2 In the meantime, I did try the workaround you suggested in your comment; however, maybe my implementation wasn't correct, because the issue remained. If you're successfully using this workaround, then some sort of code snippet would be much appreciated!
Currently I'm using `aioredis` in a microservice that makes a snapshot of certain information every n minutes. The code is not public, but this is essentially what it does:
```python
async def set_snapshot():
    # ...
    try:
        s = snapshot.get()
        await redis.set("snapshot-identifier", s)
    finally:
        # workaround for https://github.com/aio-libs/aioredis-py/issues/778
        await redis.connection_pool.disconnect()
```
As I said, this fragment is currently called every n minutes, so destroying the pool on every call doesn't really have much impact in my case. My original issue started when the infra people set up a firewall to automatically abort long-lasting connections.
So this simple fix, https://github.com/redis/redis-py/pull/1832, solves the issues for `redis`. That patch is required here, but it is not sufficient, because the situation is more complex: `aioredis` did not catch up on how `redis` manages command execution. `redis` makes unconditional retries, e.g., here:

https://github.com/redis/redis-py/blob/04b8d34e212723974b9b1f484fe7cd9e93f0e315/redis/client.py#L1171-L1175

However, `aioredis` fails unless the error is a `TimeoutError`, here:

https://github.com/aio-libs/aioredis-py/blob/a28e0e9a21f6b645952d888e65f29b433218fba5/aioredis/client.py#L1088-L1089

Porting https://github.com/redis/redis-py/pull/1832 and commenting out those two lines above solves this issue. But that's not a solution, obviously. Therefore, `aioredis` would need to port the `Retry` class from `redis` first, or a similar mechanism, and then modify all the client functions to use `call_with_retry`.
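For concreteness, a hedged sketch of what an async port of that mechanism could look like (the names follow redis-py's `Retry`/`call_with_retry`, but this is an illustration, not the actual API):

```python
class Retry:
    """Retry a command a bounded number of times on connection errors."""

    def __init__(self, retries, supported_errors=(ConnectionError, OSError)):
        self._retries = retries
        self._supported_errors = supported_errors

    async def call_with_retry(self, do, fail):
        failures = 0
        while True:
            try:
                return await do()
            except self._supported_errors as error:
                failures += 1
                await fail(error)  # e.g. disconnect the broken connection
                if failures > self._retries:
                    raise

# hypothetical use inside the client's execute_command,
# where _send_command_and_read is an assumed helper:
# return await Retry(3).call_with_retry(
#     lambda: self._send_command_and_read(conn, *args),
#     lambda error: conn.disconnect(),
# )
```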