redo README

Open coyotemarin opened this issue 5 years ago • 0 comments

The mrjob README is pretty dated; it tries to sell mrjob as "the Python Hadoop streaming library" and doesn't talk about Spark features at all.

We should highlight things like:

  • mrjob spark-submit
    • archives supported across Spark installations
  • setup scripts
  • EMR cluster setup
  • mix and match Spark and Hadoop Streaming
  • Spark runner

coyotemarin, Dec 21 '19 23:12