spark-sql-perf

setup the benchmark

idragus opened this issue 7 years ago · 3 comments

Hi,

I'm using Spark 1.6.0 and I want to run the benchmark. However, I first need to set up the benchmark (I guess).

The tutorial says to execute these lines:

    import com.databricks.spark.sql.perf.tpcds.Tables
    val tables = new Tables(sqlContext, dsdgenDir, scaleFactor)
    tables.genData(location, format, overwrite, partitionTables, useDoubleForDecimal, clusterByPartitionColumns, filterOutNullPartitionValues)

    // Create metastore tables in a specified database for your data.
    // Once tables are created, the current database will be switched to the specified database.
    tables.createExternalTables(location, format, databaseName, overwrite)

    // Or, if you want to create temporary tables
    tables.createTemporaryTables(location, format)

    // Setup TPC-DS experiment
    import com.databricks.spark.sql.perf.tpcds.TPCDS
    val tpcds = new TPCDS (sqlContext = sqlContext)
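If I understand correctly, the placeholders stand for values like the following (these are my guesses for a small test run, not values from the tutorial):

    // Illustrative values only -- adjust the paths and scale factor to your setup
    val dsdgenDir    = "/opt/tpcds-kit/tools" // directory containing the dsdgen binary
    val scaleFactor  = 1                      // size of the generated data set, in GB
    val location     = "hdfs:///tpcds/sf1"    // where the generated data will be written
    val format       = "parquet"
    val databaseName = "tpcds_sf1"
    val overwrite    = true
    val partitionTables = false
    val useDoubleForDecimal = false
    val clusterByPartitionColumns = false
    val filterOutNullPartitionValues = false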

I understood that I have to run "spark-shell" first in order to run those lines, but the problem is that when I do "import com.databricks.spark.sql.perf.tpcds.Tables" I get the error "error: object sql is not a member of package com.databricks.spark". In "com.databricks.spark" there is only the "avro" package (I don't really know what it is).

Could you help me, please? Maybe I misunderstood something.

Thanks

idragus avatar Apr 03 '17 14:04 idragus

Make sure you build a jar of spark-sql-perf (using sbt; typically sbt package from the repo root). When starting spark-shell, use the --jars option and point it to that jar, e.g., ./bin/spark-shell --jars /Users/xxx/yyy/zzz/spark-sql-perf/target/scala-2.11/spark-sql-perf_2.11-0.5.0-SNAPSHOT.jar
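Once the shell is up, a quick sanity check is to try the import that was failing -- if the jar made it onto the classpath, it should now resolve:

    // inside a spark-shell started with --jars pointing at the spark-sql-perf jar
    import com.databricks.spark.sql.perf.tpcds.Tables // no longer fails with "object sql is not a member ..."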

jeevanks avatar Oct 23 '17 11:10 jeevanks

I solved this problem with :require /path/to/file.jar in spark-shell
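For reference, that looks like this inside the shell (the jar path is a placeholder):

    // spark-shell REPL command; point it at your built spark-sql-perf jar
    :require /path/to/file.jar
    import com.databricks.spark.sql.perf.tpcds.Tables // should now resolve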

gdchaochao avatar Mar 04 '19 16:03 gdchaochao

@gdchaochao maybe using an absolute path in --jars would also solve it? In your previous comment you wrote that your command was spark-shell --conf spark.executor.cores=3 --conf spark.executor.memory=8g --conf spark.executor.memoryOverhead=2g --jars ./spark-perf/spark-sql-perf/target/scala-2.11/spark-sql-perf_2.11-0.5.1-SNAPSHOT.jar, i.e., with a relative path to the jar.

juliuszsompolski avatar Mar 05 '19 10:03 juliuszsompolski