spark-sql-perf

TPC-DS.. dataGen.. format

Open kgebaly opened this issue 8 years ago • 2 comments

In table.genData(tableLocation, format, overwrite, clusterByPartitionColumns, ...), what value does format take when generating the TPC-DS benchmark data?

kgebaly, Jun 18 '16 09:06

format is the type (file format) of the generated data, so it has to be passed as a string. That is what I found in Tables.scala:

def genData(location: String, format: String, overwrite: Boolean, clusterByPartitionColumns: Boolean, filterOutNullPartitionValues: Boolean, numPartitions: Int)

For example, the value can be "text", so you can call something like tables.genData("/path/to_Data", "text", true, true, true, true, true)

npaluskar, Jun 21 '16 17:06

We can also use parquet, avro, etc. I tried it with parquet.

sridharpothamsetti, Jun 24 '16 18:06
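
To illustrate the answers above: the format argument appears to be an ordinary Spark data source name. The sketch below does not call spark-sql-perf at all. It assumes (this is not confirmed in the thread) that genData passes the string on to Spark's DataFrameWriter, and it assumes Spark 2.x or later for SparkSession. The object name FormatStringSketch and the helper roundTrip are hypothetical names used only for this illustration.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{SaveMode, SparkSession}

object FormatStringSketch {
  // Writes `rows` sample rows with the given data source name,
  // reads them back with the same name, and returns the row count.
  def roundTrip(spark: SparkSession, format: String, rows: Long): Long = {
    // Fresh temporary location so repeated runs do not collide.
    val dir = Files.createTempDirectory("format-sketch").resolve(format).toString
    spark.range(rows).toDF("id")
      .write
      .format(format)            // the same kind of string discussed in this thread
      .mode(SaveMode.Overwrite)
      .save(dir)
    spark.read.format(format).load(dir).count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("format-sketch")
      .getOrCreate()
    try {
      // Built-in sources only; "avro" would need an extra package on the classpath.
      Seq("parquet", "json").foreach { f =>
        println(s"$f -> ${roundTrip(spark, f, 10L)} rows")
      }
    } finally {
      spark.stop()
    }
  }
}
```

If the assumption holds, any name that works with write.format here ("parquet", "json", and so on) should also be a candidate value for the format argument of genData, though that would need to be checked against the version of Tables.scala in use.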