dumbo issues

Implement params access via global variable like os.environ

In some cases it could be useful to store all command-line -param arguments in a global variable such as dumbo.params. Yes, I know that self.params provides such functionality. But if I want...

a4tunado
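A minimal sketch of what the request above might look like, assuming parameters arrive on the command line as `-param name=value` pairs. The `parse_params` helper and the module-level `params` mapping are hypothetical names used for illustration, not part of dumbo's API.

```python
import sys


def parse_params(argv):
    """Collect every '-param name=value' pair found in argv into a dict."""
    params = {}
    # Walk over adjacent (flag, value) pairs so each '-param' is matched
    # with the argument that immediately follows it.
    for flag, value in zip(argv, argv[1:]):
        if flag == "-param" and "=" in value:
            name, _, val = value.partition("=")
            params[name] = val
    return params


# Hypothetical module-level mapping, filled once at import time so that any
# code can read it the way it would read os.environ.
params = parse_params(sys.argv)
```

Code that has no access to `self.params` could then read `params.get("name")` directly, at the cost of the usual drawbacks of global state.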

Add access to filepath in MultiMapper

Currently, MultiMapper has no access to the filepaths of the input lines. This is because the current implementation of the MultiMapper.__call__ key functions uses filepaths to distribute input lines between (sub)mappers and then implicitly...

mshevelev

The filenames don't get escaped in output

I was running a job that wrote its output to 'twoo/flowanalysis/2012/09/*', but this causes problems: when dumbo runs the HDFS (re)move operations (with overwrite="yes", for instance), it doesn't escape the path properly...

poison
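One way to approach the escaping problem described above is to quote every argument before a command string is handed to a shell. The sketch below uses the standard library's `shlex.quote`; the `quote_command` helper and the example command are assumptions for illustration and do not reflect how dumbo actually builds its commands. It only prevents expansion by the local shell, not glob handling inside Hadoop itself.

```python
import shlex


def quote_command(args):
    """Join arguments into one shell-safe command string."""
    # shlex.quote leaves plain words alone and single-quotes anything that
    # contains shell metacharacters such as '*'.
    return " ".join(shlex.quote(arg) for arg in args)


# Hypothetical remove command for the output path mentioned in the issue.
command = quote_command(["hadoop", "fs", "-rmr", "twoo/flowanalysis/2012/09/*"])
print(command)
```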

dumbo can't run well in hadoop 1.0.3?

[zhouhh@Hadoop48 examples]$ dumbo start wordcount.py -hadoop /home/zhouhh/hadoop-1.0.3 -input input1 -output output1
zhh parse argv: ['/usr/local/bin/dumbo', 'start', 'wordcount.py', '-hadoop', '/home/zhouhh/hadoop-1.0.3', '-input', 'input1', '-output', 'output1']
zhh sysargv: ['wordcount.py', '-prog', 'wordcount.py', '-input', 'input1',...

ablozhou

hadoopy backend?

4

Hey, I really love the job management stuff in dumbo. However, it seems like the inner-core of hadoopy is more highly optimized. (I get a factor of 2 better performance...

dgleich

enhancement

dedicated decorator for specifying parser for a single mapper

1

(Used to be "parser attribute on a single mapper gets applied to others in same MultiMapper".)

gamboviol

enhancement

Permit stdout redirection to avoid broken pipes

1

One of the frustrating problems I've been running into is that if code called by my mapper/reducer contains print statements, their output breaks the pipe used by my...

jlewi

enhancement
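As a rough illustration of the stdout redirection requested above, stray prints can be sent to stderr while the mapper runs, so that they never reach the pipe carrying the job's output. This sketch assumes Python 3.4 or later for `contextlib.redirect_stdout`; `noisy_helper` and `mapper` are hypothetical stand-ins, and this is not a feature dumbo is known to provide.

```python
import contextlib
import sys


def noisy_helper(value):
    """Stand-in for third-party code that prints while it works."""
    print("debug: got", value)
    return value * 2


def mapper(key, value):
    # While this block runs, anything printed to stdout goes to stderr
    # instead, leaving the real stdout free for the job's own output.
    with contextlib.redirect_stdout(sys.stderr):
        result = noisy_helper(value)
    yield key, result
```

The redirection is undone when the `with` block exits, so only the wrapped call is affected.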

no error reported by Hadoop in case of immediate failure

4

_As originally [reported](http://dumbo.assembla.com/spaces/dumbo/tickets/61) by Elias Pampalk:_ The following scripts demonstrate a failure to fail when executed on a hadoop cluster (fails fine if executed locally): ``` import dumbo def mapper(k,...

klbostee

bug

some issue when trying to use the new DAG functionality in the 0.21.29 release

3

See the traceback from the logs below.
Traceback (most recent call last):
  File "/usr/lib/python2.6/runpy.py", line 122, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.6/runpy.py", line 34, in _run_code
    exec code...

asaelm

bug
