Coyote Codornices Marin

Results 50 issues of Coyote Codornices Marin

If you pass Hadoop a directory as input, it reads all non-"hidden" files (files whose names don't start with `_` or `.`) in that directory, but doesn't recurse into subdirectories...

Feature
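
A minimal sketch of the directory-input rule described in the issue above: non-"hidden" files directly inside the directory are picked up, and subdirectories are not recursed into. The function name `hadoop_style_input_files` is hypothetical and this runs against the local filesystem only; it illustrates the selection rule and is not Hadoop's own code.

```python
import os


def hadoop_style_input_files(dir_path):
    """Return non-hidden regular files directly inside dir_path.

    Hypothetical helper: mimics the rule described above on the
    local filesystem. It does not call Hadoop.
    """
    paths = []
    for name in sorted(os.listdir(dir_path)):
        # Names starting with "_" or "." count as hidden and are skipped.
        if name.startswith(('_', '.')):
            continue
        full_path = os.path.join(dir_path, name)
        # Subdirectories are skipped rather than recursed into.
        if os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

Files inside a subdirectory are ignored under this rule, so each subdirectory would have to be passed as its own input path.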

When people use the same job flow for several jobs, they like to be able to just leave the same SSH tunnel open. Currently, SSH tunnels are tied to runners,...

Feature

Would be nice to have a way to run a script on the master node before running our job. Example applications: - copying jars to the local filesystem to support...

Feature

Looks like we should be able to automatically [create key pairs through the EC2 API](http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateKeyPair.html) so that SSH will always work. Some things to consider: - should be a way...

Feature

EMR's 3.x AMIs include Mahout 0.8. Would be great to have an awesome demo that uses Mahout, that anyone can run on EMR.

Feature

It would be nice to be able to use some of the built-in mappers/reducers from Java for efficiency reasons (e.g. org.apache.hadoop.mapred.lib.aggregate.ValueAggregatorReducer). Probably would look something like this: ``` def steps(self):...

Feature

Currently there seem to be no tests of what happens when a job run by the sim runners throws an exception. We need to test: - [ ] in inline...

Testing

We should enable the ability to use a custom machine image (see #1805) on Dataproc.

Feature

Now that Amazon bills by the second rather than the full hour, cluster pooling is not usually a good way to save money. However, it does save you from having...

Feature

This seems to be an issue specific to jobs and clusters that are currently running. Possibly we're using different values of "now", and the script is running a long time?

Bug
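
If the cause really is differing values of "now" (the issue above only suspects this), one common remedy is to capture the current time once and pass it to every calculation. The sketch below assumes that diagnosis; `running_time`, `summarize`, and the `clusters` mapping are hypothetical names, not part of any existing script.

```python
from datetime import datetime, timezone


def running_time(started, ended=None, now=None):
    """Return how long a cluster has run, as a timedelta.

    Hypothetical helper: a cluster with no end time is treated as
    still running and is measured against the shared `now`.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    end = ended if ended is not None else now
    return end - started


def summarize(clusters, now=None):
    """Map cluster ID to running time, using one `now` for all clusters.

    `clusters` is an assumed mapping of cluster ID to (started, ended).
    """
    if now is None:
        # Captured once, so a slow script cannot drift between clusters.
        now = datetime.now(timezone.utc)
    return {
        cluster_id: running_time(started, ended, now=now)
        for cluster_id, (started, ended) in clusters.items()
    }
```

Accepting `now` as a parameter also lets tests supply a fixed timestamp.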