distributed icon indicating copy to clipboard operation
distributed copied to clipboard

`distributed.print()` breaks easily because of stringified kwargs

Open maxbane opened this issue 2 years ago • 1 comments

Describe the issue: distributed.print() is a great idea, but shouldn't it pickle its kwargs for proper deserialization by the client instead of stringifying them? My understanding (and the apparent intention from looking at the implementation in worker.py and client.py) is for distributed.print() to be a drop-in replacement for builtins.print() which workers can use to print stuff back to the client session, but by not truly serializing/deserializing the arguments, it breaks as a drop-in replacement.

Minimal Complete Verifiable Example:

from dask import distributed
from distributed import print as dask_print
client = distributed.Client()

# built-in print() works fine as expected
print("hello 1", file=None)

# dask_print works fine from the client session
dask_print("hello 2", file=None)

def do_print():
    dask_print("hello 3", file=None)
    
# this demonstrates the bug.
# it rasises `AttributeError: 'str' object has no attribute 'write'`
# because `file=None` has become `file="None"`!
client.submit(do_print)

I'm sure you can think of other ways that this breaks. For example, print(..., end=None) becomes print(..., end="None") (hehe) and print(..., flush=False) becomes print(..., flush="False") (so it WILL flush).

Environment:

  • Dask version: Tested with 2022.9.1 but master still looks to be affected.
  • Python version: 3.10.x
  • Operating System: Linux
  • Install method (conda, pip, source): pip

maxbane avatar Oct 01 '22 21:10 maxbane

Thought I'd ping @fjetter here since I saw he added the remote print method in PR #5217 and probably knows best why it works the way it does.

maxbane avatar Oct 02 '22 01:10 maxbane

This looks like a genuine bug that the file and other keywords aren't respected well. My sense is that a fix would be welcome.

However, it's also worth noting that pickling things is often the wrong choice. For example, you couldn't pickle a file object. I think that we should special-case some of the print-specific keywords, but keep passing everything else as strings.

Max, if you're interested in working on this I'm sure that people would appreciate it.

On Sat, Oct 1, 2022 at 6:36 PM Max Bane @.***> wrote:

Thought I'd ping @fjetter https://github.com/fjetter here since I saw he added the remote print method in PR #5217 https://github.com/dask/distributed/pull/5217 and probably knows best why it works the way it does.

— Reply to this email directly, view it on GitHub https://github.com/dask/distributed/issues/7095#issuecomment-1264523989, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACKZTB4FOVBL7KBV4YH45LWBDRK7ANCNFSM6AAAAAAQ2T3AA4 . You are receiving this because you are subscribed to this thread.Message ID: @.***>

mrocklin avatar Oct 05 '22 15:10 mrocklin

Cool, yeah, I'll take a crack at it.

maxbane avatar Oct 06 '22 21:10 maxbane

Thanks @maxbane. Contributing docs are at https://docs.dask.org/en/stable/develop.html -- let us know if you have any questions

jrbourbeau avatar Oct 07 '22 00:10 jrbourbeau

I've submitted PR #7129 -- we can continue the discussion over there and eventually close this issue if it looks good.

maxbane avatar Oct 07 '22 21:10 maxbane