distributed
distributed copied to clipboard
`distributed.print()` breaks easily because of stringified kwargs
Describe the issue:
distributed.print()
is a great idea, but shouldn't it pickle its kwargs for proper deserialization by the client instead of stringifying them? My understanding (and the apparent intention from looking at the implementation in worker.py and client.py) is for distributed.print()
to be a drop-in replacement for builtins.print()
which workers can use to print stuff back to the client session, but by not truly serializing/deserializing the arguments, it breaks as a drop-in replacement.
Minimal Complete Verifiable Example:
from dask import distributed
from distributed import print as dask_print
client = distributed.Client()
# built-in print() works fine as expected
print("hello 1", file=None)
# dask_print works fine from the client session
dask_print("hello 2", file=None)
def do_print():
dask_print("hello 3", file=None)
# this demonstrates the bug.
# it rasises `AttributeError: 'str' object has no attribute 'write'`
# because `file=None` has become `file="None"`!
client.submit(do_print)
I'm sure you can think of other ways that this breaks. For example, print(..., end=None)
becomes print(..., end="None")
(hehe) and print(..., flush=False)
becomes print(..., flush="False")
(so it WILL flush).
Environment:
- Dask version: Tested with
2022.9.1
but master still looks to be affected. - Python version: 3.10.x
- Operating System: Linux
- Install method (conda, pip, source): pip
Thought I'd ping @fjetter here since I saw he added the remote print
method in PR #5217 and probably knows best why it works the way it does.
This looks like a genuine bug that the file and other keywords aren't respected well. My sense is that a fix would be welcome.
However, it's also worth noting that pickling things is often the wrong choice. For example, you couldn't pickle a file object. I think that we should special-case some of the print-specific keywords, but keep passing everything else as strings.
Max, if you're interested in working on this I'm sure that people would appreciate it.
On Sat, Oct 1, 2022 at 6:36 PM Max Bane @.***> wrote:
Thought I'd ping @fjetter https://github.com/fjetter here since I saw he added the remote print method in PR #5217 https://github.com/dask/distributed/pull/5217 and probably knows best why it works the way it does.
— Reply to this email directly, view it on GitHub https://github.com/dask/distributed/issues/7095#issuecomment-1264523989, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACKZTB4FOVBL7KBV4YH45LWBDRK7ANCNFSM6AAAAAAQ2T3AA4 . You are receiving this because you are subscribed to this thread.Message ID: @.***>
Cool, yeah, I'll take a crack at it.
Thanks @maxbane. Contributing docs are at https://docs.dask.org/en/stable/develop.html -- let us know if you have any questions
I've submitted PR #7129 -- we can continue the discussion over there and eventually close this issue if it looks good.