
[Experimental] Sky Service

Open michaelzhiluo opened this issue 2 years ago • 0 comments

Description

Adds Sky broker support for Sky services, including provisioning and termination of those services, specifically for data analytics (EMR on AWS and Dataproc on GCP).

Experiments

Runs TPC-DS (SF 1/100) on a 3-node Spark cluster launched by Sky.

Task YAML

service:
  type: data-analytics
  dependencies:
    spark: 3.1.2

resources:
  cloud: aws

num_nodes: 3

setup: |
  cd ~/
  source ~/.bashrc
  (sudo yum -y install gcc make flex bison byacc git htop tmux) || (sudo apt-get -y install gcc make flex bison byacc git htop tmux)
  git clone https://github.com/michaelzhiluo/spark-sql-perf.git
  git clone https://github.com/databricks/tpcds-kit.git
  cd tpcds-kit/tools
  make OS=LINUX
  mkdir -p ~/spark-warehouse/tpcds.db/

run: |
  cd ~/
  # Only the rank-0 node drives data generation and the benchmark run.
  if [ "${SKY_NODE_RANK}" = "0" ]
  then
    echo :quit | spark-shell --jars ~/spark-sql-perf/target/scala-2.12/spark-sql-perf_2.12-0.5.1-SNAPSHOT.jar -i ~/spark-sql-perf/scripts/gendata.scala
    echo :quit | spark-shell --jars ~/spark-sql-perf/target/scala-2.12/spark-sql-perf_2.12-0.5.1-SNAPSHOT.jar -i ~/spark-sql-perf/scripts/run.scala
  else
    echo "Worker node"
  fi
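
The `run` section above gates the benchmark on the node rank so that only one node drives `spark-shell`. A minimal shell sketch of that pattern, assuming `SKY_NODE_RANK` is exported on every node as in the task above. The helper names `is_head_node` and `node_role` are hypothetical, and treating an unset rank as a worker is a choice made here for illustration, not something this issue specifies:

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only on the node whose rank is 0.
is_head_node() {
  # Fall back to a non-zero rank when the variable is unset, so a
  # misconfigured node never runs the driver-only steps by accident.
  [ "${SKY_NODE_RANK:-1}" = "0" ]
}

# Hypothetical helper: prints the role of the current node.
node_role() {
  if is_head_node; then
    echo "head"
  else
    echo "worker"
  fi
}
```

Quoting the variable and using `=` inside `[ ... ]` keeps the test valid even when the rank is empty, which an unquoted `[ $SKY_NODE_RANK == "0" ]` would not be.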

michaelzhiluo, Oct 07 '22 17:10