mrjob
drop support for Spark 1?
We may want to consider eventually dropping support for Spark 1.
At the very least, the Spark harness, which allows running MRJobs on the Spark runner (see #1838), will be much harder to write with Spark 1's poor method serialization. Also, it would be nice to be able to use configuration properties to specify the Python binary, rather than environment variables.
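To illustrate the second point, here is a rough sketch of what choosing between the two mechanisms might look like. The helper `python_bin_spark_opts` and its arguments are hypothetical and not part of mrjob. It assumes the `spark.pyspark.python` property, which as far as I know is only honored from Spark 2.1 on, and falls back to the `PYSPARK_PYTHON` environment variable for older versions.

```python
def python_bin_spark_opts(python_bin, spark_version):
    """Hypothetical helper (not part of mrjob).

    Return a pair (spark_submit_args, env) telling Spark which Python
    binary to use. *spark_version* is a tuple such as (1, 6) or (2, 1).
    """
    if tuple(spark_version) >= (2, 1):
        # Assumption: spark.pyspark.python is available (Spark 2.1+),
        # so no environment variables are needed.
        return (['--conf', 'spark.pyspark.python=' + python_bin], {})
    else:
        # Older Spark: fall back to the PYSPARK_PYTHON environment variable.
        return ([], {'PYSPARK_PYTHON': python_bin})


if __name__ == '__main__':
    print(python_bin_spark_opts('python3', (2, 1)))
    print(python_bin_spark_opts('python3', (1, 6)))
```

If Spark 1 support were dropped, the `else` branch could go away and the Python binary could be handled like any other Spark configuration property.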
Spark 1 is available on EMR's 4.x AMIs, which are not yet considered deprecated.