spark-parquet-thrift-example

Could this support Thrift lists?

Open: tonglin0325 opened this issue on Jan 04, 2019 • 0 comments
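For reference, the printed records below suggest a Thrift struct along these lines. This is a reconstruction, not the repository's actual definition: only the field names appear in the log, so the types and the explicit field IDs are my assumptions.

    // Hypothetical reconstruction of the struct behind the log output.
    // Types and field IDs are guesses; only the field names are from the issue.
    struct ExampleTable {
      1: required i32 role_id;
      2: required string role_name;
      3: required list<i32> friends_id;   // the list field this issue asks about
    }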

Creating sample Thrift data.

  • ExampleTable(role_id:1, role_name:test, friends_id:[1, 2, 3])
  • ExampleTable(role_id:1, role_name:test, friends_id:[1, 2, 3])
  • ExampleTable(role_id:1, role_name:test, friends_id:[1, 2, 3])
  • ExampleTable(role_id:1, role_name:test, friends_id:[1, 2, 3])
  • ExampleTable(role_id:1, role_name:test, friends_id:[1, 2, 3])
  • ExampleTable(role_id:1, role_name:test, friends_id:[1, 2, 3])
  • ExampleTable(role_id:1, role_name:test, friends_id:[1, 2, 3])
  • ExampleTable(role_id:1, role_name:test, friends_id:[1, 2, 3])
  • ExampleTable(role_id:1, role_name:test, friends_id:[1, 2, 3])

Writing sample data to Parquet.
  • ParquetStore: file:///home/lintong/下载/hive_table/test

14:06:40.191 ERROR org.apache.spark.executor.Executor:91 - Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.ArrayIndexOutOfBoundsException: -1
    at org.apache.parquet.thrift.struct.ThriftType$StructType.<init>(ThriftType.java:242)
    at org.apache.parquet.thrift.ThriftSchemaConverter.toStructType(ThriftSchemaConverter.java:110)
    at org.apache.parquet.thrift.ThriftSchemaConverter.toStructType(ThriftSchemaConverter.java:97)
    at org.apache.parquet.hadoop.thrift.TBaseWriteSupport.getThriftStruct(TBaseWriteSupport.java:55)
    at org.apache.parquet.hadoop.thrift.AbstractThriftWriteSupport.init(AbstractThriftWriteSupport.java:85)
    at org.apache.parquet.hadoop.thrift.AbstractThriftWriteSupport.init(AbstractThriftWriteSupport.java:112)
    at org.apache.parquet.hadoop.thrift.ThriftWriteSupport.init(ThriftWriteSupport.java:68)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:341)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:302)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.initWriter(SparkHadoopWriter.scala:344)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:118)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:79)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
14:06:40.210 ERROR org.apache.spark.scheduler.TaskSetManager:70 - Task 0 in stage 0.0 failed 1 times; aborting job
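For anyone trying to reproduce this: the stack trace shows parquet-thrift's write support being initialized from Spark's Hadoop output API, so the write path looks roughly like the sketch below. This is my own sketch under stated assumptions, not the repository's exact code: `ExampleTable` is the Thrift-generated class from the struct above, and the output path is illustrative.

    import org.apache.hadoop.mapreduce.Job
    import org.apache.parquet.hadoop.thrift.ParquetThriftOutputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    object WriteExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("parquet-thrift-list"))

        // Sample records matching the log output above.
        val rows = (1 to 9).map { _ =>
          val friends = new java.util.ArrayList[Integer]()
          friends.add(1); friends.add(2); friends.add(3)
          val t = new ExampleTable()
          t.setRole_id(1)
          t.setRole_name("test")
          t.setFriends_id(friends)   // the Thrift list field in question
          t
        }

        val job = Job.getInstance()
        // Registers the Thrift class so TBaseWriteSupport can derive the Parquet
        // schema; this is where the ThriftSchemaConverter in the trace is invoked.
        ParquetThriftOutputFormat.setThriftClass(job, classOf[ExampleTable])

        sc.parallelize(rows)
          .map(row => (null: Void, row))
          .saveAsNewAPIHadoopFile(
            "file:///tmp/parquet_thrift_test",   // illustrative path
            classOf[Void],
            classOf[ExampleTable],
            classOf[ParquetThriftOutputFormat[ExampleTable]],
            job.getConfiguration)
      }
    }

On the exception itself: ThriftType.java:242 sits in the StructType constructor, which indexes a struct's children by their Thrift field ID. An ArrayIndexOutOfBoundsException of -1 there is consistent with a field whose ID resolved to -1, for example a field declared without an explicit positive ID in the .thrift file (the Thrift compiler assigns negative IDs in that case). That is a hypothesis to check against the actual struct definition, not a confirmed diagnosis.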
