dumbo

Python module that allows one to easily write and run Hadoop programs.

Results 29 dumbo issues

In some cases it could be useful to store all command-line -param args in a global variable like dumbo.params. Yes, I know that self.params provides such functionality. But if I want...

Currently MultiMapper has no access to the file paths of the input lines. This is because the current implementation of MultiMapper.__call__*key functions use file paths to distribute input lines between (sub)mappers and then implicitly...

I was running a job that outputted to 'twoo/flowanalysis/2012/09/*', but this gives issues because when dumbo runs the hdfs (re)move operations (on overwrite="yes" for instance), it doesn't escape it properly...
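As a rough illustration of the escaping concern in this report, the sketch below quotes each argument before joining it into a shell command string, so that the local shell does not expand the `*`. The helper name `build_remove_command` is hypothetical and is not part of dumbo, and the `fs -rmr` arguments are only an assumption about what such a command might look like. It also covers only local shell expansion; how Hadoop itself treats glob characters in paths is a separate matter.

```python
import shlex


def build_remove_command(hadoop_bin, path):
    # Hypothetical helper (not dumbo's API): quote every argument so the
    # local shell passes characters such as '*' through literally.
    args = [hadoop_bin, "fs", "-rmr", path]
    return " ".join(shlex.quote(arg) for arg in args)


# The output path from the report, containing a glob character.
cmd = build_remove_command("hadoop", "twoo/flowanalysis/2012/09/*")
print(cmd)
```

Passing an argument list to `subprocess.run` without `shell=True` would avoid the local shell altogether, which is often the simpler choice when the command does not need to be a single string.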

[zhouhh@Hadoop48 examples]$ dumbo start wordcount.py -hadoop /home/zhouhh/hadoop-1.0.3 -input input1 -output output1
zhh parse argv: ['/usr/local/bin/dumbo', 'start', 'wordcount.py', '-hadoop', '/home/zhouhh/hadoop-1.0.3', '-input', 'input1', '-output', 'output1']
zhh sysargv: ['wordcount.py', '-prog', 'wordcount.py', '-input', 'input1',...

Hey, I really love the job management stuff in dumbo. However, it seems like the inner-core of hadoopy is more highly optimized. (I get a factor of 2 better performance...

enhancement

(Used to be "parser attribute on a single mapper gets applied to others in same MultiMapper".)

enhancement

One of the frustrating problems I've been running into is that if I have "print statements" in code called by my mapper/reducer this will break the pipe used by my...

enhancement
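One common workaround for stray print output in streaming-style jobs, where stdout carries the records, is to send that output to stderr instead. The sketch below is only an illustration under stated assumptions: `noisy_helper` is a hypothetical function standing in for code that prints, and `contextlib.redirect_stdout` requires Python 3.4 or later, which may not match the Python version this issue was reported against. It is not a description of how dumbo handles this.

```python
import contextlib
import sys


def noisy_helper(value):
    # Hypothetical helper that writes debug output to stdout.
    print("debug:", value)
    return value * 2


def mapper(key, value):
    # Temporarily route anything printed to stdout to stderr instead,
    # so stdout stays reserved for the emitted records.
    with contextlib.redirect_stdout(sys.stderr):
        result = noisy_helper(value)
    yield key, result
```

The redirect only affects Python-level writes to `sys.stdout`; output written directly to file descriptor 1 by C extensions or subprocesses would need a different approach.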

_As originally [reported](http://dumbo.assembla.com/spaces/dumbo/tickets/61) by Elias Pampalk:_ The following scripts demonstrate a failure to fail when executed on a hadoop cluster (fails fine if executed locally): ``` import dumbo def mapper(k,...

bug

See the traceback from the logs below.
Traceback (most recent call last):
  File "/usr/lib/python2.6/runpy.py", line 122, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
    exec code...

bug