training
Training materials for Strata, AMP Camp, etc.
Hello, on the "Introduction to Scala" page, the command "Source.fromFile(xxx).getLines.toArray" threw the error "java.nio.charset.MalformedInputException: Input length = 1". It is an easy fix: set the encoding to UTF-8. Please...
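A minimal sketch of the fix described in the report above, assuming the input file really is UTF-8 encoded. `readLines` is a hypothetical helper name used only for illustration and is not part of the tutorial; the two-argument `Source.fromFile(name, enc)` overload comes from the Scala standard library.

```scala
import scala.io.Source

// Hypothetical helper: read every line of a file with an explicit encoding,
// rather than relying on the platform default charset.
// Assumption: the file at `path` is encoded as UTF-8.
def readLines(path: String): Array[String] = {
  val source = Source.fromFile(path, "UTF-8")
  try source.getLines().toArray // materialize the lines before closing
  finally source.close()        // release the underlying file handle
}
```

Passing the encoding explicitly keeps the result independent of the JVM's default charset, which is the likely reason the one-argument call fails on some machines and not others.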
I followed all the instructions up to step 4 (http://ampcamp.berkeley.edu/3/exercises/launching-a-bdas-cluster-on-ec2.html) and I already have my key pair on my AWS console as instructed. First I ran this: `./spark-ec2 -i...
Following the steps at http://ampcamp.berkeley.edu/3/exercises/realtime-processing-with-spark-streaming.html for Java, I edited the Tutorial.java file along with twitter.txt (setting the corresponding credentials). However, while running the command ``` sbt/sbt package run...
Steps followed: 1. Generated the Access Key ID and Secret Access Key for AWS. 2. Set the environment variable. 3. Cloned the Git repository. 4. ``` training-scripts]# ./spark-ec2 -i /home/chai/Downloads/td-key-pair-virgina.pem -k td-key-pair-virginio --copy...
Hi, I implemented the exercise provided at [spark mllib training](http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html). But in the final part, it is recommended to implement matrix factorization to improve the algorithm. I could not find...
I'm following the recent amplab tutorial using my own AWS account. Cluster launch finishes with an error "ERROR: Cluster health check failed for spark_ec2". I'd be grateful for pointers on...
Hi, I would like to run the training exercises on a Google Compute Engine cluster, as I don't have an Amazon AWS account. I was able to copy the...
Running through the exercise code, here are some issues I found: Data Exploration using Spark SQL page: 1) "parquetFile" has been deprecated and the resulting code should be changed to...
Hello, on https://www.cs.berkeley.edu/~jey/ampcamp6/training/data-exploration-using-spark.html, the pyspark link in section 5 seems to be broken (the Java/Scala one doesn't load or is very slow as well). Thanks, N