hudi icon indicating copy to clipboard operation
hudi copied to clipboard

[HUDI-4885] Adding org.apache.avro to hudi-hive-sync bundle

Open nsivabalan opened this issue 3 years ago • 1 comments

Change Logs

After we landed https://github.com/apache/hudi/pull/6472, hive-sync in docker demo is broken. It fails w/ below stacktrace.

2022-09-20 14:24:39,758 INFO  [main] table.TableSchemaResolver (TableSchemaResolver.java:readSchemaFromParquetBaseFile(439)) - Reading schema from /user/hive/warehouse/stock_ticks_cow/2018/08/31/b4a7076c-30e6-4320-bb04-be47246b6646-0_0-29-29_20220920142351042.parquet
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
2022-09-20 14:24:40,432 INFO  [main] hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to metastore, current connections: 0
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/avro/LogicalType
	at org.apache.hudi.common.table.TableSchemaResolver.convertParquetSchemaToAvro(TableSchemaResolver.java:288)
	at org.apache.hudi.common.table.TableSchemaResolver.getTableAvroSchemaFromDataFile(TableSchemaResolver.java:121)
	at org.apache.hudi.common.table.TableSchemaResolver.hasOperationField(TableSchemaResolver.java:566)
	at org.apache.hudi.util.Lazy.get(Lazy.java:53)
	at org.apache.hudi.common.table.TableSchemaResolver.getTableSchemaFromLatestCommitMetadata(TableSchemaResolver.java:225)
	at org.apache.hudi.common.table.TableSchemaResolver.getTableAvroSchemaInternal(TableSchemaResolver.java:193)
	at org.apache.hudi.common.table.TableSchemaResolver.getTableAvroSchema(TableSchemaResolver.java:142)
	at org.apache.hudi.common.table.TableSchemaResolver.getTableParquetSchema(TableSchemaResolver.java:173)
	at org.apache.hudi.sync.common.HoodieSyncClient.getStorageSchema(HoodieSyncClient.java:103)
	at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:206)
	at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:153)
	at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:141)
	at org.apache.hudi.hive.HiveSyncTool.main(HiveSyncTool.java:358)
Caused by: java.lang.ClassNotFoundException: org.apache.avro.LogicalType
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 13 more 

Adding back the apache-avro, but w/o shading to hudi-hive-sync bundle.

Impact

Docker demo fails w/o this patch.

**Risk level: low **

Choose one. If medium or high, explain what verification was done to mitigate the risks.

Contributor's checklist

  • [ ] Read through contributor's guide
  • [ ] Change Logs and Impact were stated clearly
  • [ ] Adequate tests were added if applicable
  • [ ] CI passed

nsivabalan avatar Sep 20 '22 22:09 nsivabalan

CI report:

  • 95d5bc16716f6e553d2e34e779ef76e476161a22 Azure: SUCCESS
Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

hudi-bot avatar Sep 29 '22 09:09 hudi-bot