asgi_ipc
Cannot receive on channel after restarting. Bug?
I have an issue and I am not sure if that is by design. When my program restarts it does not receive any more messages on a channel.
I prepared an example, which you can find further down. At the 30th iteration I simulate the restart of the program by simply doing this:
channel_layer_receive = None
channel_layer_receive = asgi.IPCChannelLayer(prefix="my_prefix")
After that the script keeps printing (None, None), although it is still sending on the other channel layer.
Is that by design or a bug?
Example
import asgi_ipc as asgi

# Two layers with the same prefix: one used for sending, one for receiving.
channel_layer_receive = asgi.IPCChannelLayer(prefix="my_prefix")
channel_layer_send = asgi.IPCChannelLayer(prefix="my_prefix")

i = 0
while i < 10:
    i += 1
    msg = "Message %s" % i
    try:
        channel_layer_send.send("my_channel", {"text": msg})
        print("Sending %s" % msg)
    except asgi.BaseChannelLayer.ChannelFull:
        print("Dropped %s" % msg)
    print(channel_layer_receive.receive(["my_channel"]))
    if i == 5:
        # Simulate a restart by throwing away the receiving layer and
        # creating a new one with the same prefix.
        channel_layer_receive = None
        channel_layer_receive = asgi.IPCChannelLayer(prefix="my_prefix")
print("Done!")
Output
Sending Message 1
('my_channel', {'text': 'Message 1'})
Sending Message 2
('my_channel', {'text': 'Message 2'})
Sending Message 3
('my_channel', {'text': 'Message 3'})
Sending Message 4
('my_channel', {'text': 'Message 4'})
Sending Message 5
('my_channel', {'text': 'Message 5'})
Sending Message 6
(None, None)
Sending Message 7
(None, None)
Sending Message 8
(None, None)
Sending Message 9
(None, None)
Sending Message 10
(None, None)
Done!
Exception ignored in: <bound method MemoryDict.__del__ of <asgi_ipc.MemoryDict object at 0x7f59a875c390>>
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/asgi_ipc.py", line 311, in __del__
posix_ipc.ExistentialError: No shared memory exists with the specified name
Exception ignored in: <bound method MemoryDict.__del__ of <asgi_ipc.MemoryDict object at 0x7f59a88a6748>>
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/asgi_ipc.py", line 311, in __del__
posix_ipc.ExistentialError: No shared memory exists with the specified name
This is not by design, so it's probably a bug - while technically channel layers are allowed to drop messages, dropping 70 left in the queue is a bit much.
I would recommend using the Redis channel layer if you want something more reliable; it's much more proven.
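For reference, swapping the example over would look roughly like this, assuming a local Redis server and the asgi_redis package (the host URL and prefix are illustrative); it exposes the same send/receive channel layer API used above:

import asgi_redis

channel_layer = asgi_redis.RedisChannelLayer(
    hosts=["redis://localhost:6379"],
    prefix="my_prefix",
)

channel_layer.send("my_channel", {"text": "Message 1"})

# Because the queue lives in Redis rather than in this process's shared
# memory, recreating the layer object (or restarting the process) does not
# lose messages that are already queued.
channel_layer = asgi_redis.RedisChannelLayer(
    hosts=["redis://localhost:6379"],
    prefix="my_prefix",
)
print(channel_layer.receive(["my_channel"]))  # ('my_channel', {'text': 'Message 1'})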
I only have a small project on one machine, so Redis would be overhead I'd like to avoid. Thanks for the hint, though!
Oh, and I was not dropping 70 messages. I edited/condensed the code to make it more obvious what is happening. I also get an exception at the end. Not sure why, though. When I debug the code line by line I don't get one. Maybe an issue with execution speed or something?
The issue is caused by https://github.com/django/asgi_ipc/blob/f1d205c349c8e1175badd26ca6d91211f8b47399/asgi_ipc.py#L296. unlink() marks the shared memory for destruction once all processes have unmapped it. Source: http://semanchuk.com/philip/posix_ipc/
I'm not exactly sure how this works here, because within a single process we have two SharedMemory objects that map the same shared memory. However, if I unlink() one of the objects, the shared memory gets destroyed even though the second object still has it mapped.
Now this happens, minus the part where it says "after the last shm_unlink()":
"Even if the object continues to exist after the last shm_unlink(), reuse of the name shall subsequently cause shm_open() to behave as if no shared memory object of this name exists (that is, shm_open() will fail if O_CREAT is not set, or will create a new shared memory object if O_CREAT is set)." Source: http://www.opengroup.org/onlinepubs/009695399/functions/shm_unlink.html
The quick fix is to simply not call unlink(), but then the shared memory needs to be unlinked manually by calling unlink_shared_memory(name).
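Roughly, that manual cleanup could look like this (the segment names below are placeholders, not the exact names asgi_ipc derives from the prefix):

import posix_ipc

def cleanup_segments(names):
    # Remove the named segments once every process using them has exited,
    # e.g. from a deploy or shutdown script.
    for name in names:
        try:
            posix_ipc.unlink_shared_memory(name)
        except posix_ipc.ExistentialError:
            pass  # already gone

cleanup_segments(["/my_prefix_channels", "/my_prefix_groups"])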
Initially I had the issue with send() and receive() being in two separate scripts/processes. I will have to test whether the issue is actually the same.
Urgh, yes, that seems to be what this is; it behaves differently if it's two inside one process versus two in different processes.
I'm still very much tempted to try out a sqlite-based backend as a replacement for this shared memory stuff, given that the performance testing we did showed this was surprisingly slow.
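Just to sketch the idea (purely illustrative, not a real implementation or API), a file-backed queue along these lines is roughly what a sqlite-based backend could mean:

import json
import sqlite3

class SqliteChannelQueue:
    # Minimal illustration only: a persistent queue in a SQLite file, so a
    # process restart does not lose messages the way the shared memory does.
    def __init__(self, path="channels.sqlite3"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "channel TEXT NOT NULL, "
            "body TEXT NOT NULL)"
        )
        self.conn.commit()

    def send(self, channel, message):
        self.conn.execute(
            "INSERT INTO messages (channel, body) VALUES (?, ?)",
            (channel, json.dumps(message)),
        )
        self.conn.commit()

    def receive(self, channels):
        # Return the oldest queued message on any of the given channels,
        # or (None, None) if there is nothing waiting.
        placeholders = ", ".join("?" for _ in channels)
        row = self.conn.execute(
            "SELECT id, channel, body FROM messages "
            "WHERE channel IN (%s) ORDER BY id LIMIT 1" % placeholders,
            list(channels),
        ).fetchone()
        if row is None:
            return None, None
        self.conn.execute("DELETE FROM messages WHERE id = ?", (row[0],))
        self.conn.commit()
        return row[1], json.loads(row[2])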