stream-reactor
Hive Sink: MatchError
Issue Guidelines
Please review these questions before submitting any issue.
What version of the Stream Reactor are you reporting this issue for?
Are you running the correct version of Kafka/Confluent for the Stream Reactor release?
yes
Do you have a supported version of the data source/sink, i.e. Cassandra 3.0.9?
No
Have you read the docs?
Yes
What is the expected behaviour?
What was observed?
MatchError: the exception appears to be caused by an empty value in a JSON field ("field_value": "")
What is your Connect cluster configuration (connect-avro-distributed.properties)?
What is your connector properties configuration (my-connector.properties)?
connector.class=com.landoop.streamreactor.connect.hive.sink.HiveSinkConnector
connect.hive.database.name=test
connect.hive.hive.metastore.uris=thrift://xxxxx:9083
tasks.max=1
topics=testjson7
connect.hive.fs.defaultFS=hdfs://xXXXXX:8020
key.converter.schemas.enable=false
value.converter.schemas.enable=false
connect.hive.kcql=insert into test select * from testjson7 AUTOCREATE PARTITION_BY id WITH_FLUSH_INTERVAL = 10
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
connect.hive.hive.metastore=thrift
Please provide full log files (redact any sensitive information)
org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
	at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:560)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:321)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: scala.MatchError: (xxx,{field_id=54, field_value=, field_name=am}) (of class scala.Tuple2)
	at com.landoop.streamreactor.connect.hive.sink.MapValueConverter$$anonfun$convert$1.apply(ValueConverter.scala:29)
	at com.landoop.streamreactor.connect.hive.sink.MapValueConverter$$anonfun$convert$1.apply(ValueConverter.scala:29)
	at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:221)
	at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428)
	at com.landoop.streamreactor.connect.hive.sink.MapValueConverter$.convert(ValueConverter.scala:29)
	at com.landoop.streamreactor.connect.hive.sink.ValueConverter$.apply(ValueConverter.scala:12)
	at com.landoop.streamreactor.connect.hive.sink.HiveSinkTask$$anonfun$put$1.apply(HiveSinkTask.scala:61)
	at com.landoop.streamreactor.connect.hive.sink.HiveSinkTask$$anonfun$put$1.apply(HiveSinkTask.scala:60)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at com.landoop.streamreactor.connect.hive.sink.HiveSinkTask.put(HiveSinkTask.scala:60)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:538)
	... 10 more
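For context on the failure mode: a `scala.MatchError` is thrown when a pattern match has no case covering the runtime value. The trace points at `MapValueConverter.convert` (ValueConverter.scala:29) matching a `(key, value)` tuple where the value contains an empty string. Below is a minimal, hypothetical sketch (names and cases are illustrative, not the connector's actual code) of how a match with a non-empty-string guard blows up on `""`:

```scala
// Hypothetical sketch: a value-to-schema-name match that forgets the
// empty-string case, mirroring the MatchError seen in the stack trace.
object MatchErrorSketch {
  def schemaNameOf(value: Any): String = value match {
    case _: java.lang.Integer    => "int32"
    case _: java.lang.Long       => "int64"
    case s: String if s.nonEmpty => "string"
    // No case matches "" -> scala.MatchError at runtime
  }

  def main(args: Array[String]): Unit = {
    println(schemaNameOf(42)) // prints "int32"
    try {
      schemaNameOf("")        // empty field_value, like {field_value=}
    } catch {
      case e: MatchError => println("caught: " + e.getClass.getName)
    }
  }
}
```

Adding a catch-all case (or an explicit case for empty/null values) in the converter would avoid the crash; this is only a sketch of the mechanism, not a patch.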