dumbo
Fix #62 Optional path argument in JoinMapper
To get the source path inside a mapper routine, just add **kwargs to its argument list. Here are some examples:
@dumbo.decor.primary
def map_primary(key, value, **kwargs):
    key, value = value.strip().split('\t')
    print >> sys.stderr, key, value, kwargs['path']
    yield key, value
Or you can name the desired argument directly:
@dumbo.decor.primary
def map_primary(key, value, path, **kwargs):
    key, value = value.strip().split('\t')
    print >> sys.stderr, key, value, path
    yield key, value
Callable instances are also supported:
@dumbo.decor.secondary
class MapSecondary(object):
    def __call__(self, key, value, path, **kwargs):
        key, value = value.strip().split(' ')
        print >> sys.stderr, value, path
        yield key, value
The previous mapper interface still works as well:
@dumbo.decor.primary
def map_primary(key, value):
    key, value = value.strip().split('\t')
    yield key, value
This approach makes it easy to extend the interface to pass other arguments in the future.
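For illustration only, here is a minimal sketch of how a caller could support all of the mapper signatures above by inspecting the mapper before calling it. It is not the code in this patch: `call_mapper` and the sample mappers are hypothetical names, and the sketch assumes Python 3 for `inspect.signature`, whereas the examples above use Python 2 syntax.

```python
import inspect


def call_mapper(mapper, key, value, **extras):
    """Call mapper, passing only the extra keyword arguments it accepts.

    Hypothetical helper for illustration; not part of dumbo.
    """
    params = inspect.signature(mapper).parameters.values()
    # A **kwargs parameter means the mapper accepts any extra argument.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return mapper(key, value, **extras)
    # Otherwise pass only the extras the mapper names explicitly.
    names = {p.name for p in params}
    accepted = {k: v for k, v in extras.items() if k in names}
    return mapper(key, value, **accepted)


def old_style(key, value):
    # Previous interface: no path, no **kwargs.
    yield key, value


def with_path(key, value, path, **kwargs):
    # New interface: path named directly, other extras ignored.
    yield key, (value, path)


class CallableMapper(object):
    # Callable instance with the same new-style signature.
    def __call__(self, key, value, path, **kwargs):
        yield key, (value, path)
```

With this kind of dispatch, `old_style` is called with just `key` and `value`, while `with_path` and a `CallableMapper()` instance receive `path` plus any arguments added later, so new extras can be introduced without breaking existing mappers.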
Sounds good! Will try to find some time to review and merge this soonish.