kafka-connect-hdfs

Adding Hive partition threw unexpected error

Open · Kimakjun opened this issue on Sep 26, 2022 · 0 comments

We currently use kafka-connect-hdfs3:1.1.9 to consume data from Kafka and load it into HDFS. Hive tables are also created from the consumed data, and partitioning is done by time unit (hourly). Occasionally the following error occurs, and I can't figure out the cause, so I'm leaving a question.
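
For reference, the connector configuration looks roughly like the following. This is only a sketch: the connector name, metastore URI, flush size, database name, locale and timezone below are illustrative placeholders, while hdfs.url, topics.dir, the topic name and path.format are inferred from the partition path that appears in the error.

{
  "name": "gfd-event-hdfs3-sink",
  "config": {
    "connector.class": "io.confluent.connect.hdfs3.Hdfs3SinkConnector",
    "tasks.max": "1",
    "topics": "gfd-event",
    "hdfs.url": "hdfs://pct",
    "topics.dir": "/user/gfd-user/topics",
    "flush.size": "1000",
    "hive.integration": "true",
    "hive.metastore.uris": "thrift://metastore-host:9083",
    "hive.database": "default",
    "partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
    "partition.duration.ms": "3600000",
    "path.format": "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH",
    "locale": "en-US",
    "timezone": "UTC"
  }
}
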

It's probably happening in this code: https://github.com/confluentinc/kafka-connect-hdfs/blob/1cbceff8f64774b7d4550520878e67419b63188c/src/main/java/io/confluent/connect/hdfs/TopicPartitionWriter.java#L936

Looking at the Caused by part, it makes sense that the folder is created at that time, but I don't understand why it tries to delete the folder at that point (see my sketch of the call path after the stack trace below).

Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException Unable to delete directory: hdfs://pct/user/gfd-user/topics/gfd-event/year=2022/month=09/day=20/hour=07)

Adding Hive partition threw unexpected error (io.confluent.connect.hdfs3.TopicPartitionWriter:836)
io.confluent.connect.storage.errors.HiveMetaStoreException: Hive MetaStore exception
	at io.confluent.connect.storage.hive.HiveMetaStore.doAction(HiveMetaStore.java:99)
	at io.confluent.connect.storage.hive.HiveMetaStore.addPartition(HiveMetaStore.java:132)
	at io.confluent.connect.hdfs3.TopicPartitionWriter$3.call(TopicPartitionWriter.java:834)
	at io.confluent.connect.hdfs3.TopicPartitionWriter$3.call(TopicPartitionWriter.java:830)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException Unable to delete directory: hdfs://pct/user/gfd-user/topics/gfd-event/year=2022/month=09/day=20/hour=07)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$append_partition_by_name_result$append_partition_by_name_resultStandardScheme.read(ThriftHiveMetastore.java)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$append_partition_by_name_result$append_partition_by_name_resultStandardScheme.read(ThriftHiveMetastore.java)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$append_partition_by_name_result.read(ThriftHiveMetastore.java)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_append_partition_by_name(ThriftHiveMetastore.java:2557)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.append_partition_by_name(ThriftHiveMetastore.java:2542)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.appendPartition(HiveMetaStoreClient.java:722)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.appendPartition(HiveMetaStoreClient.java:716)
	at sun.reflect.GeneratedMethodAccessor166.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
	at com.sun.proxy.$Proxy43.appendPartition(Unknown Source)
	at io.confluent.connect.storage.hive.HiveMetaStore$1.call(HiveMetaStore.java:114)
	at io.confluent.connect.storage.hive.HiveMetaStore$1.call(HiveMetaStore.java:107)
	at io.confluent.connect.storage.hive.HiveMetaStore.doAction(HiveMetaStore.java:97)
	... 7 more
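
For what it's worth, here is my reading of the call path from the stack trace above, as a rough sketch. It is a paraphrase for illustration, not the connector's actual source: the class HivePartitionSketch, the method addPartitionAsync and the parameter names are made up; only HiveMetaStore.addPartition(database, table, path), the retrying doAction wrapper and the log message are taken from the trace.

import io.confluent.connect.storage.hive.HiveMetaStore;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class HivePartitionSketch {
  private static final Logger log = LoggerFactory.getLogger(HivePartitionSketch.class);

  // Submit the Hive metastore update asynchronously, the way the stack trace
  // shows TopicPartitionWriter doing it (TopicPartitionWriter$3.call -> addPartition).
  public static Future<Void> addPartitionAsync(
      ExecutorService executor,
      final HiveMetaStore hiveMetaStore,  // already configured against the metastore
      final String database,
      final String table,
      final String path) {  // e.g. ".../gfd-event/year=2022/month=09/day=20/hour=07"
    return executor.submit(new Callable<Void>() {
      @Override
      public Void call() {
        try {
          // HiveMetaStore.addPartition wraps HiveMetaStoreClient.appendPartition in a
          // retrying doAction(...) call (HiveMetaStore.java:132 / :99 in the trace);
          // the MetaException "Unable to delete directory: ..." surfaces here wrapped
          // in a HiveMetaStoreException.
          hiveMetaStore.addPartition(database, table, path);
        } catch (Throwable e) {
          // This is the log line from the report:
          // "Adding Hive partition threw unexpected error"
          log.error("Adding Hive partition threw unexpected error", e);
        }
        return null;
      }
    });
  }
}

In other words, judging from the trace the "Unable to delete directory" message is produced on the metastore side while it services append_partition_by_name, and the connector only sees it wrapped in the HiveMetaStoreException.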

Despite the error, the partition folder was created normally and the data was loaded normally.

Kimakjun · Sep 26 '22 01:09