Coyote Codornices Marin
These would save the user from having to import `pyspark`, and could also set up `SparkConf` for you. Probably mostly matters for the inline runner (see #1965).
For complete support of `MRJob`, it would be helpful for the Spark harness to be able to implement e.g. `mapper_cmd()`, `mapper_pre_filter()`. This isn't actually that difficult to...
We may want to consider eventually dropping support for Spark 1. At the very least, the Spark harness, which allows running MRJobs on the Spark runner (see #1838), will be...
mockhadoop tests are slow and hard to debug. mockhadoop doesn't support generic options (e.g. `-D`, `-jobconf`).
It looks like Spark on YARN puts the following files/directories into a Spark container's working directory on EMR AMI 5.16.10 (which runs Spark 2.3.1 on Hadoop 2.8.4): ``` .container_tokens.crc .default_container_executor.sh.crc...
Apparently it's possible for a bootstrap action to run without aws-cli's credentials being set up: ``` TERMINATED_WITH_ERRORS Looking for bootstrap logs in s3://mrjob-35cdec11663cb1cb/tmp/logs/j-20DP5HLE2OHLR/node/i-074db8689d97a6a7f/bootstrap-actions/1... Parsing boostrap stderr log: s3://mrjob-35cdec11663cb1cb/tmp/logs/j-20DP5HLE2OHLR/node/i-074db8689d97a6a7f/bootstrap-actions/1/stderr.gz Probable cause...
Currently, `mrjob.wrap._ls_logs()` just swallows IOErrors. This could be a problem; say you successfully fetch some logs via SSH, and then the cluster shuts down. The right thing to do is...
Sometimes mrjob is being run in an environment where only the `StepFailedException` gets through. It might be helpful to be able to tag `StepFailedException` with arbitrary information (e.g. `cluster_id`).
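As a rough illustration of the tagging idea, here is a minimal sketch of an exception that carries arbitrary key/value tags. The class name `TaggedStepFailedException`, its `tags` attribute, and the `j-EXAMPLE` cluster ID are all hypothetical; this is a standalone stand-in, not mrjob's actual `StepFailedException`.

```python
class TaggedStepFailedException(Exception):
    """Hypothetical stand-in for StepFailedException (not mrjob's class).

    Any extra keyword arguments are kept as tags, so a caller that only
    sees the exception can still recover context such as a cluster ID.
    """

    def __init__(self, reason=None, **tags):
        super().__init__(reason)
        self.reason = reason
        # copy the tags so later changes by the caller don't leak in
        self.tags = dict(tags)

    def __str__(self):
        msg = self.reason or 'Step failed'
        if self.tags:
            # sort keys so the message is stable across runs
            tag_str = ', '.join(
                '%s=%s' % (k, v) for k, v in sorted(self.tags.items()))
            msg = '%s (%s)' % (msg, tag_str)
        return msg


try:
    raise TaggedStepFailedException(
        'Step 1 of 2 failed', cluster_id='j-EXAMPLE')
except TaggedStepFailedException as ex:
    caught = ex

print(caught.tags['cluster_id'])
print(str(caught))
```

Putting the tags in both an attribute and the message means they survive whether the surrounding environment inspects the exception object or only logs its string form.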
While updating mrjob to support custom AMIs (#1805) is a good start, it's still significant work to roll your own AMI and keep it up-to-date. Instead, mrjob should look to...
mrjob has historically mocked out various AWS services. Currently this code lives in `tests/mock_boto3`. The [moto](https://github.com/spulec/moto) library does basically the same thing. mrjob should probably try to move to moto,...