spark-ec2
s3a filesystem added to core-site for ephemeral-hdfs and persistent-hdfs
s3a is the successor to the s3n file system; it offers higher performance and supports larger files. For more details: https://wiki.apache.org/hadoop/AmazonS3
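For readers unfamiliar with the setting, here is a minimal sketch of what s3a entries in `core-site.xml` might look like. It is not the exact diff of this PR: the property names are the ones Hadoop 2.6+ uses for s3a, and the two credential values are placeholders assumed for illustration.

```xml
<!-- Sketch only: placeholder values, not the actual change in this PR. -->
<configuration>
  <!-- Maps the s3a:// scheme to the S3A implementation class
       (shipped in the separate hadoop-aws module, not in core HDFS). -->
  <property>
    <name>fs.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
  <!-- Credentials; YOUR_ACCESS_KEY / YOUR_SECRET_KEY are hypothetical placeholders. -->
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```

These entries only take effect if the S3A classes and the AWS SDK they depend on are on the Hadoop classpath.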
@dashcode Thanks for the PR. However, I think that to use S3a we need a hadoop-aws-sdk package, which has to be installed separately from HDFS. In other words, did you test whether S3a works with spark-ec2 and this change?
Hello @shivaram :)! I tested with a custom AMI with Hadoop 2.7.2 and it worked. But you're right: I have just tested with the original AMI and it doesn't work. We need the hadoop-aws-sdk package, as you said.
Might it be useful for the 2.0 branch with an updated AMI?