repo2docker
repo2docker copied to clipboard
line buffering (buffering=1) warning on Python 3.8
Since the recent change made in #1014, I got a warning like below when launching images based on Python 3.8 (python=3.8 in environment.yml).
/srv/conda/envs/notebook/lib/python3.8/subprocess.py:848: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
I believe this issue is related to the change in Python 3.8 where it started complaining whereas previous versions silently ignored buffering=1 for binary mode. I guess the related code is below:
https://github.com/jupyterhub/repo2docker/blob/a5f5bbbb75a9945d1f8fe8f8ff4844dfd4481742/repo2docker/buildpacks/repo2docker-entrypoint#L40-L46
Not sure if related, I also noticed texts on the console (mostly) lost colors after this recent change; Jupyter log messages used to have colored headings for warning/info/etc, but they are now all monochrome. Yet some texts are still printed in colors (i.e. Julia banner lost colors, but its prompt still has a color).
@manics what do you think about this issue?
I think the buffering=1 in the comment is a typo and should be bufsize=1 (like in the linked code sample). If you agree that it is a typo in the comment: I think this is an easy thing to fix.
The fact that colours have stopped working is annoying and we should try and get that fixed. @tomyun do you have any idea why "colours stop working" when you read in and output the output off a process? Do you have time to investigate that a bit?
- Somehow I failed to include a complete warning message.
/srv/conda/envs/notebook/lib/python3.8/subprocess.py:848: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
self.stdout = io.open(c2pread, 'rb', bufsize)
The warning about buffering=1 is coming from io.open() accepts buffering which is set to bufsize when io.open() is called in subprocess.Popen(). AFAICT, it's not about a typo between buffering and bufsize, but more about the value itself.
- Regarding color output, I guess it's about the difference between pipe and tty. For example, Julia running on a regular terminal clearly shows it's on TTY.
julia> Sys.stdout
Base.TTY(RawFD(14) open, 0 bytes waiting)
On the other hand, Julia running on an image created by the latest repo2docker has stdout mapped to pipe which is as expected with the recent change.
julia> Sys.stdout
Base.PipeEndpoint(RawFD(12) open, 0 bytes waiting)
It turns out Julia initializes colored stdout for non-TTY only when it's given an explicit option (color=yes, not default auto) that explains my experience.
It's likely Jupyter (and IPython) works in a similar way that they ended up getting a non-color output here. Seems like this kind of issue is not uncommon as Python has a package like ptyprocess.
Before we fix the TTY issue we should decide what beahaviour we want.
If you're running repo2docker from the command line with default options it'll output the logs to the console, in which case colour logs makes sense, and doing this through a pty(?) makes sense since in principle someone could provide input via stdin. If it's run as a background container or a Kubernetes pod you wouldn't expect a terminal to be connected so non-colour logs makes sense.
Thanks for taking the time to explain @tomyun! I had thought this was much simpler than it is :-/
Is it easier to always support colour output? I think that is what repo2docker used to do and that detecting in which scenario we are sounds like a tricky thing to do (and get right).
And it sounds like there are two issues: (1) find a way to not trigger the warning and keep buffering small (I think this is on purpose to solve the problem of very short output. getting lost) and (2) colour output. Should we make one issue for each? (In practice: a new issue for colour and use this one for buffering)
I've created a new issue https://github.com/jupyterhub/repo2docker/issues/1033 for the log colourisation, let's keep this issue just for the line buffering problem.
I'm not familiar with history of the recent change, but is there any reason we can't use default buffer size (bufsize=-1 or leaving out = 8192 bytes) or maybe turning it off (bufsize=0)? bufsize=1 for binary mode is silently ignored and seems to fall back to the default buffer size anyways.
Is there any progress on this? In the meantime, how about reverting back to the previous code until we find a more solid solution on correctly handling log buffer and colored TTY output?
Is the line buffering problem just a warning, or is it breaking something?
I believe it's just a waring and if that's the case bufsize=1 should be removed as it's not used.