`multiprocessing.Queue`: Exceeding a certain amount of bytes in the queue prevents proper exit
Bug report
Bug description:
This terminates properly:
```python
from multiprocessing import Queue
Queue().put(b"0" * 65514)
print("end")
```
while this prints 'end' and is then stuck:
```python
from multiprocessing import Queue
Queue().put(b"0" * 65515)
print("end")
```
CPython versions tested on:
Latest identified version without the issue: 3.9.6
Oldest identified version with the issue: 3.9.18
Operating systems tested on:
Linux, macOS
What is the most recent version with the issue? 65515 works for me in IDLE on Win10 with 3.12.8 and 3.14.0a1.
On Ubuntu (22.04) and macOS (15.1.1) I have not identified a version without the issue. The latest I have tested is 3.12.1 on Ubuntu and 3.14.0a0 on macOS, and both have the issue.
Interesting that you don't have it on Win10! Waiting for more linux/macos users to confirm that I'm not crazy 🙏🏻
Confirmed it on Linux, for both 3.12 and main. FWIW, the script does print out end for me, but it hangs upon interpreter finalization, so it doesn't really show up when using IDLE or the REPL (which I suspect is why @terryjreedy couldn't reproduce).
pstack is showing that we're getting stuck waiting on a semaphore somewhere while joining a thread. I'll investigate.
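For anyone bitten by this in the meantime, a workaround consistent with that diagnosis (a sketch, not a fix) is to tell the queue not to join its feeder thread at interpreter shutdown using the documented `Queue.cancel_join_thread()`. Note that this can silently drop data still sitting in the queue's internal buffer:

```python
from multiprocessing import Queue

q = Queue()
q.put(b"0" * 65515)  # payload larger than the pipe buffer headroom
# Don't block on the feeder thread at interpreter exit; buffered
# data that hasn't reached the pipe yet may be lost.
q.cancel_join_thread()
print("end")  # the process can now exit instead of hanging
```

This only sidesteps the hang at finalization; it does not make the oversized write itself succeed any faster.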
Makes sense @ZeroIntensity, for your investigation note that it appeared between 3.9.6 (still ok) and 3.9.18 (has the issue).
3.9 - 3.11 are security-only branches, and this bug wouldn't qualify as a security issue IMO (if you were talking about the labels; for bisecting commits, using the main branch is fine)
I do think this is possibly a security issue. It looks like this applies to any payload larger than 65514 bytes, so if user input were passed to Queue.put, this could be a DoS. I'm speculating, though. I'll submit a patch once I figure it out and we'll go from there.
(It could be a DOS but you should consider the possible attack vectors first IMO)
(dfb1b9da8a4becaeaed3d9cffcaac41bcaf746f4 looks in the right period and touches the queue's closing logic.)
Upon further investigation, this looks unrelated to multiprocessing and is just a nasty side effect of os.pipe. Apparently, the writable file descriptors returned by pipe() have an internal buffer limit of 65536 bytes (i.e., 2^16), so attempting to write past that point blocks.
For example:
```python
import os

read, write = os.pipe()
my_str = b"0" * 65536
os.write(write, my_str)  # Passes, but the buffer is now full
os.write(write, b"1")    # Stuck!
```
This happens in C as well, so I doubt there's anything we can do about it:
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main() {
    int fds[2];
    pipe(fds);
    char *str = malloc(65536);
    memset(str, '0', 65536);
    write(fds[1], str, 65536);
    puts("Filled the buffer");
    write(fds[1], "1", 1);
    puts("We'll never get here");
    return 0;
}
```
This is documented, though. From Wikipedia:
> If the buffer is filled, the sending program is stopped (blocked) until at least some data is removed from the buffer by the receiver. In Linux, the size of the buffer is 65,536 bytes (64 KiB).
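The quoted behaviour also means the oversized write isn't lost, just parked: it completes as soon as a reader drains the pipe. A minimal sketch (the 70000-byte payload assumes the 64 KiB default Linux pipe capacity):

```python
import os
import threading

# A blocked pipe write completes once a reader drains the buffer.
read_fd, write_fd = os.pipe()
payload = b"0" * 70000  # larger than the 65,536-byte default buffer

received = bytearray()

def drain():
    # Read until the whole payload has arrived, freeing buffer space
    # so the blocked writer can finish.
    while len(received) < len(payload):
        received.extend(os.read(read_fd, 4096))

reader = threading.Thread(target=drain)
reader.start()
written = os.write(write_fd, payload)  # blocks until the reader catches up
reader.join()
os.close(read_fd)
os.close(write_fd)
```

This is exactly what never happens in the multiprocessing case at interpreter exit: nothing is left to read the other end, so the feeder thread's write blocks forever.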
I guess there are three options:

1. Just document it in `os.pipe` and write this off as wontfix.
2. Use some nasty hacks to raise an exception if more than 64 KiB are in the pipe.
3. Switch to something more versatile, e.g. a socket (unless that has the same issue!).
@picnixz, what do you think the way to go would be?
(1) seems the most conservative, the least error-prone, and the easiest for us; we should document it in `os.pipe`, but also in `Queue` for future users. It's an implementation detail, but it could still be useful. Otherwise, (2) seems the second-best approach since it would help detect issues. I'm not sure which hacks you're thinking of, though (can't you first convert the input into a list and check how many elements it has? Or use `islice` and check whether you can take more than 64k elements?). I don't know about (3).
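For (2), one possibility that avoids counting elements is to flip the write end into non-blocking mode, so a full buffer raises `BlockingIOError` instead of hanging. This is only a sketch, not a proposed patch; `write_nowait` is a hypothetical helper, and partial-write handling is left out:

```python
import os

def write_nowait(fd, data):
    """Hypothetical helper: write without blocking. A full pipe raises
    BlockingIOError; a partial write returns the byte count written."""
    os.set_blocking(fd, False)
    try:
        return os.write(fd, data)
    finally:
        os.set_blocking(fd, True)

read_fd, write_fd = os.pipe()
written = write_nowait(write_fd, b"0" * 70000)  # partial write fills buffer
try:
    write_nowait(write_fd, b"1")
    buffer_full = False
except BlockingIOError:
    buffer_full = True  # surfaced as an error instead of a hang
os.close(read_fd)
os.close(write_fd)
```

A real fix would still have to decide what the queue should do with the bytes that didn't fit.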
An alternative would be to have multiple pipes: if you've filled one pipe, you create another (so you yourself have a queue of pipes... though this is only an idea; I'm not sure it would be efficient, or even have a use case).
We can ask @gpshead about that (sorry Gregory for all the mentions today, but today seems to be multiprocessing/threading/pipes day!)
> An alternative would be to have multiple pipes.
Yeah, that would be (3) in my comment. The question comes down to what kind of maintenance burden that will have.
Running the file in Command Prompt, I see a hard hang (must close the window) in 3.12 and 3.13, but not in 3.14.0a1 & 3 (I get the prompt back after running).
This is a duplicate of #85927.