
Ability to specify protocols for a specific step via MRStep

Open tarnfeld opened this issue 11 years ago • 3 comments

It would be great to be able to explicitly assign an input, output, or internal protocol to an individual step from within the Job.steps() method. For example:


def steps(self):
    return [MRStep(mapper=None, reducer=None, output_protocol=FooBar),
            MRStep(mapper=None, reducer=None, output_protocol=Baz)]

The input_protocol of the second step would be inferred from the output_protocol set on the previous step.
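To make the inference rule concrete, here is a minimal sketch in plain Python. It does not use mrjob at all: `StepSpec` and `resolve_protocols` are hypothetical names invented for illustration, protocols are represented as plain strings, and the fallback to job-level defaults (and the rule that an explicit input_protocol wins over the inferred one) are assumptions rather than existing behaviour.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StepSpec:
    """Hypothetical stand-in for MRStep; not part of mrjob."""
    name: str
    input_protocol: Optional[str] = None
    output_protocol: Optional[str] = None


def resolve_protocols(steps: List[StepSpec], job_input: str,
                      job_internal: str, job_output: str) -> List[Tuple[str, str]]:
    """Return an (input, output) protocol pair for every step.

    Assumed rules: a step's explicit setting wins; otherwise the first
    step reads with the job-level input protocol, later steps read with
    whatever the previous step wrote, the last step writes with the
    job-level output protocol, and other steps write with the job-level
    internal protocol.
    """
    resolved = []
    last = len(steps) - 1
    previous_output = None
    for i, step in enumerate(steps):
        default_in = job_input if i == 0 else previous_output
        default_out = job_output if i == last else job_internal
        in_proto = step.input_protocol or default_in
        out_proto = step.output_protocol or default_out
        resolved.append((in_proto, out_proto))
        previous_output = out_proto  # feeds the next step's inferred input
    return resolved


# Mirrors the example above: only output protocols are given per step.
steps = [StepSpec("first", output_protocol="FooBar"),
         StepSpec("second", output_protocol="Baz")]
resolved = resolve_protocols(steps, job_input="RawValue",
                             job_internal="JSON", job_output="JSON")
```

With these inputs the second step ends up reading with "FooBar" even though its input_protocol was never set, which is the behaviour being requested.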

tarnfeld, Jul 01 '13 14:07

This especially makes sense if some of your steps are JarSteps.

coyotemarin, Nov 08 '13 18:11

Updated to use MRStep rather than self.mr(), for consistency with #815.

coyotemarin, Nov 15 '13 21:11

Also pretty topical now that we're working on Spark.

coyotemarin, Jul 26 '16 18:07