
Using spark_version='1.6.2' results in partial installation (?)

Open arokem opened this issue 8 years ago • 10 comments

Specifically, I get these messages during launch of the cluster, and these files are indeed not in place once the cluster starts up:

./spark-ec2/spark-standalone/setup.sh: line 22: /root/spark/bin/stop-all.sh: No such file or directory
./spark-ec2/spark-standalone/setup.sh: line 27: /root/spark/bin/start-master.sh: No such file or directory    

Accordingly, there is no Spark web interface on port 8080 either.

arokem avatar Oct 04 '16 03:10 arokem

Is this from branch-2.0? I think the problem is that we didn't backport the change that added 1.6.1 and 1.6.2 to branch-2.0, as seen in [1]. Can you check whether adding 1.6.2 there fixes the problem?

[1] https://github.com/amplab/spark-ec2/blob/06f5d2bc7c222aecb56e2f7bb8b8e160bc501104/spark_ec2.py#L78
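For illustration, the whitelist check being referred to can be sketched roughly as follows. The set name and its contents here are assumptions, not copied from spark_ec2.py; the point is only that a release string absent from the set would be rejected or mishandled at launch time:

```python
# Hypothetical sketch of a version whitelist like the one in spark_ec2.py.
# On branch-2.0 the set reportedly stopped before 1.6.1/1.6.2, so those
# entries would need to be added for the launch to proceed correctly.
VALID_SPARK_VERSIONS = {
    "1.5.0", "1.5.1", "1.5.2",
    "1.6.0", "1.6.1", "1.6.2",  # 1.6.1 / 1.6.2 are the backported entries
}

def validate_spark_version(version):
    """Return True if this sketch considers the release installable."""
    return version in VALID_SPARK_VERSIONS
```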

shivaram avatar Oct 04 '16 04:10 shivaram

This is from branch-1.6

arokem avatar Oct 04 '16 04:10 arokem

Hmm, that means SPARK_VERSION isn't being parsed correctly somehow, because as [2] shows, the default should be sbin, not bin.

[2] https://github.com/amplab/spark-ec2/blob/4b57900a24e25accd9c3f14c867920730813bf11/spark-standalone/setup.sh#L5
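The logic in question lives in a shell script, but the decision it makes can be sketched in Python. The exact cutoff list below is an assumption for illustration (daemon scripts moved from bin/ to sbin/ around Spark 1.0); the relevant symptom is that an empty or unparsed SPARK_VERSION could fall through to the wrong directory:

```python
# Sketch of the bin-vs-sbin choice made in spark-standalone/setup.sh.
# Assumption: only pre-1.0 releases kept start-master.sh / stop-all.sh
# under bin/; everything newer uses sbin/.
OLD_SCRIPTS_IN_BIN = {"0.7.3", "0.8.0", "0.8.1", "0.9.0", "0.9.1", "0.9.2"}

def daemon_script_dir(spark_version):
    """Directory holding the standalone daemon scripts for a release."""
    return "bin" if spark_version in OLD_SCRIPTS_IN_BIN else "sbin"
```

The error messages in the original report reference /root/spark/bin/stop-all.sh, i.e. the bin/ branch, which for 1.6.2 would be the wrong one.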

shivaram avatar Oct 04 '16 04:10 shivaram

As far as I can tell, there's neither a bin nor an sbin directory under /root/spark. The only thing under /root/spark is /root/spark/conf/spark-env.sh.

arokem avatar Oct 04 '16 16:10 arokem

That means Spark wasn't downloaded properly. My guess is this has to do with the tar.gz files for hadoop1 not being found on S3. You could try --hadoop-version=yarn as a workaround.
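To see why changing the flag helps, it's enough to look at how the hadoop version selects the package suffix that gets downloaded. The mapping below is an assumption inferred from the artifact names visible in the S3 bucket, not code copied from spark-ec2:

```python
# Assumed mapping from the --hadoop-version flag to the tarball suffix
# that spark-ec2's init script fetches from S3.
HADOOP_SUFFIX = {
    "1": "hadoop1",       # missing from S3 for 1.6.2 -> broken install
    "2": "cdh4",
    "yarn": "hadoop2.4",  # present for 1.6.2 -> suggested workaround
}

def spark_package_name(spark_version, hadoop_version):
    """Tarball name this sketch expects spark-ec2 to request."""
    return "spark-%s-bin-%s.tgz" % (
        spark_version, HADOOP_SUFFIX[hadoop_version])
```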

shivaram avatar Oct 04 '16 17:10 shivaram

Thanks! For now, I have resorted to setting spark_version to 1.6.0, which seems to work, but I'll try that too. Feel free to close this issue, unless you want to keep it open for tracking. And thanks again.

arokem avatar Oct 04 '16 17:10 arokem

I ran into something similar recently.

It appears that for spark 1.6.2 only a subset of the binaries were uploaded to s3:

$ s3cmd ls s3://spark-related-packages/spark-1.6.2*
2016-06-27 23:47 241425242   s3://spark-related-packages/spark-1.6.2-bin-cdh4.tgz
2016-06-27 23:47 230444067   s3://spark-related-packages/spark-1.6.2-bin-hadoop1-scala2.11.tgz
2016-06-27 23:48 271799224   s3://spark-related-packages/spark-1.6.2-bin-hadoop2.3.tgz
2016-06-27 23:49 273797124   s3://spark-related-packages/spark-1.6.2-bin-hadoop2.4.tgz
2016-06-27 23:50 278057117   s3://spark-related-packages/spark-1.6.2-bin-hadoop2.6.tgz
2016-06-27 23:50 196142809   s3://spark-related-packages/spark-1.6.2-bin-without-hadoop.tgz
2016-06-27 23:51  12276956   s3://spark-related-packages/spark-1.6.2.tgz

Whereas for 1.6.0 the full set is present:

$ s3cmd ls s3://spark-related-packages/spark-1.6.0*
2015-12-27 23:07 252549861   s3://spark-related-packages/spark-1.6.0-bin-cdh4.tgz
2015-12-27 23:15 241526957   s3://spark-related-packages/spark-1.6.0-bin-hadoop1-scala2.11.tgz
2015-12-27 23:23 243448482   s3://spark-related-packages/spark-1.6.0-bin-hadoop1.tgz
2015-12-27 23:31 282904569   s3://spark-related-packages/spark-1.6.0-bin-hadoop2.3.tgz
2015-12-27 23:41 244381359   s3://spark-related-packages/spark-1.6.0-bin-hadoop2.4-without-hive.tgz
2015-12-27 23:48 284903527   s3://spark-related-packages/spark-1.6.0-bin-hadoop2.4.tgz
2015-12-28 00:00 289160984   s3://spark-related-packages/spark-1.6.0-bin-hadoop2.6.tgz
2015-12-28 00:08 201549664   s3://spark-related-packages/spark-1.6.0-bin-without-hadoop.tgz
2015-12-28 00:16  12204380   s3://spark-related-packages/spark-1.6.0.tgz
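The gap between the two listings can be checked mechanically. The variant suffixes below are copied from the s3cmd output above:

```python
# Hadoop-variant suffixes present for each release, taken from the
# s3cmd listings above.
present_160 = {"cdh4", "hadoop1-scala2.11", "hadoop1", "hadoop2.3",
               "hadoop2.4-without-hive", "hadoop2.4", "hadoop2.6",
               "without-hadoop"}
present_162 = {"cdh4", "hadoop1-scala2.11", "hadoop2.3", "hadoop2.4",
               "hadoop2.6", "without-hadoop"}

# Variants available for 1.6.0 but absent for 1.6.2; hadoop1 is the
# one spark-ec2 tries to fetch by default, hence the partial install.
missing = sorted(present_160 - present_162)
print(missing)
```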

etrain avatar Oct 04 '16 21:10 etrain

I think the problem here is that the artifacts are missing from the release itself, not just from S3: http://www-us.apache.org/dist/spark/spark-1.6.2/spark-1.6.2-bin-hadoop1.tgz gives me a 404.

shivaram avatar Oct 20 '16 22:10 shivaram

The same goes for http://s3.amazonaws.com/spark-related-packages/spark-1.6.3-bin-hadoop1.tgz and http://s3.amazonaws.com/spark-related-packages/spark-1.6.2-bin-hadoop1.tgz.

sabman avatar Dec 02 '16 19:12 sabman

This seems to be the same issue as #43.

RudyLu avatar Jan 06 '17 09:01 RudyLu