Permit stdout redirection to avoid broken pipes

Open jlewi opened this issue 13 years ago • 1 comments

One frustrating problem I keep running into: if code called by my mapper/reducer contains print statements, their output goes to stdout and breaks the pipe used by my streaming job.

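It seems like a simple change to dumbo can fix this. In core.py, change

typedbytes.PairedOutput(sys.stdout).writes(outputs)

to

typedbytes.PairedOutput(sys.__stdout__).writes(outputs)

sys.__stdout__ always refers to the interpreter's original stdout, even after sys.stdout has been reassigned.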

This way all we have to do is redirect stdout to stderr and extraneous print statements will no longer cause problems.
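A minimal sketch of the idea, not dumbo's actual code: job output is written to the original stdout (sys.__stdout__ by default) while sys.stdout is temporarily pointed at stderr, so stray print calls end up in the task logs. The names run_with_stdout_guard and noisy_mapper are hypothetical, and tab-separated text stands in for the typedbytes output that dumbo really writes.

```python
import sys


def run_with_stdout_guard(mapper, records, data_out=None, log_out=None):
    """Run mapper over records, keeping stray prints out of the data stream.

    data_out defaults to sys.__stdout__ (the interpreter's original stdout);
    log_out defaults to sys.stderr. Both names are illustrative assumptions.
    """
    if data_out is None:
        data_out = sys.__stdout__
    if log_out is None:
        log_out = sys.stderr

    saved_stdout = sys.stdout
    sys.stdout = log_out  # print() now writes to the log stream
    try:
        for key, value in records:
            for out_key, out_value in mapper(key, value):
                # Real job output goes only to the protected stream.
                data_out.write("%s\t%s\n" % (out_key, out_value))
    finally:
        sys.stdout = saved_stdout  # always restore the caller's stdout
    data_out.flush()


def noisy_mapper(key, value):
    # A stray debug print that would otherwise corrupt the streaming pipe.
    print("debug: processing", key)
    yield key, len(value)
```

Calling run_with_stdout_guard(noisy_mapper, [("a", "xyz")]) should write the record to the real stdout and the debug line to stderr.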

I've tried this out and it seems to work for me.

jlewi, May 09 '11 16:05

I apologize for the cross-post, but this is how I fixed this problem in Hadoopy: http://bwhite.github.com/hadoopy/#pipe-hopping-using-stdout-stderr-in-hadoopy-jobs

bwhite, Jul 09 '11 06:07