hive icon indicating copy to clipboard operation
hive copied to clipboard

HIVE-29003: DataPartitionedLogger from Tez is always needed for hive logging hook

Open abstractdog opened this issue 8 months ago β€’ 13 comments

What changes were proposed in this pull request?

Include tez protobuf history logging jars into hive-exec fat jar.

Why are the changes needed?

Because it's needed in other components than HS2, where it's present.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manually observed jar

find . -name 'hive-exec*.jar' -exec sh -c 'jar tf {} | grep "DatePartitionedLogger" && echo "In file: {}"' \;

org/apache/tez/dag/history/logging/proto/DatePartitionedLogger.class
In file: ./ql/target/hive-exec-4.1.0-SNAPSHOT.jar

testing on the failure cluster is in progress

abstractdog avatar Jun 11 '25 12:06 abstractdog

@abstractdog, I am not able to repro this on hive master branch. in a fresh setup I ran:

schematool -dbType mysql -initSchema

and started HS2 and HMS services and then ran

schematool -dbType hive -initSchema -metaDbType mysql

but the "CREATE DATABASE IF NOT EXISTS SYS" didn't went to HS2 like it happened based on the logs attached in the JIRA. I suspect schematool is running with embedded mode in my case. What am I missing?

Aggarwal-Raghav avatar Jun 11 '25 16:06 Aggarwal-Raghav

@abstractdog, I am not able to repro this on hive master branch. in a fresh setup I ran:

schematool -dbType mysql -initSchema

and started HS2 and HMS services and then ran

schematool -dbType hive -initSchema -metaDbType mysql

but the "CREATE DATABASE IF NOT EXISTS SYS" didn't went to HS2 like it happened based on the logs attached in the JIRA. I suspect schematool is running with embedded mode in my case. What am I missing?

the logs are confusing, I was shocked to see things like "HiveServer2-Background-Pool" in a metastore log or metastore-related log, but I guess as long as the hive executable runs, it will utilize a Hive CLI call to do things, and for that, the Driver is the entrypoint (so it runs the hooks too), and that process doesn't have the same classpath as a fully fledged HS2 process on a Cloudera Manager based cluster, the tez-protobuf-history-plugin jar is already added for HS2 classpath, that's why we haven't hit that problem there

I'm wondering what's the classpath when you call schematool and if you actually hit the Driver classpath with the hooks

abstractdog avatar Jun 12 '25 09:06 abstractdog

@abstractdog , what you said above is correct but as per my understanding, schematool will launch 1 jvm which will have embedded HS2 and HMS and based on the HS2 logs attached in the JIRA, it looks like user tried to ran the content of the hive-schema-*.hive.sql from the beeline.

  1. I added set -x in schematool.sh and captured the classpath and captured schematool -dbType hive -initSchema -metaDbType mysql logs as well by adding org.apache.hadoop.hive.ql.hooks.HiveProtologgingHook in HiveConf.java (pre, post and failure hooks) itself to be picked up by embedded HS2 and HMS but still didn't receive NoClassDefFoundError. Attaching the same.

schema_logs.txt classpath.txt

  1. Can you please share the steps to repro this issue in hive4 if possible?

Aggarwal-Raghav avatar Jun 12 '25 16:06 Aggarwal-Raghav

@abstractdog , what you said above is correct but as per my understanding, schematool will launch 1 jvm which will have embedded HS2 and HMS and based on the HS2 logs attached in the JIRA, it looks like user tried to ran the content of the hive-schema-*.hive.sql from the beeline.

  1. I added set -x in schematool.sh and captured the classpath and captured schematool -dbType hive -initSchema -metaDbType mysql logs as well by adding org.apache.hadoop.hive.ql.hooks.HiveProtologgingHook in HiveConf.java (pre, post and failure hooks) itself to be picked up by embedded HS2 and HMS but still didn't receive NoClassDefFoundError. Attaching the same.

schema_logs.txt classpath.txt

  1. Can you please share the steps to repro this issue in hive4 if possible?

thanks for taking a look I can see this in the logs you attached:

2025-06-12 21:28:38,225 ERROR hooks.HiveProtoLoggingHook: hive.hook.proto.base-directory is not set, logging disabled.

abstractdog avatar Jun 16 '25 11:06 abstractdog

Yes, that is correct. After setting the hive.hook.proto.base-directory , i found another issue, which I filed under HIVE-29012 but even after the potential fix in the respective PR, I am not seeing NoClassDefFoundError πŸ˜….

updated_schemtool_logs.txt

Aggarwal-Raghav avatar Jun 16 '25 11:06 Aggarwal-Raghav

Yes, that is correct. After setting the hive.hook.proto.base-directory , i found another issue, which I filed under HIVE-29012 but even after the potential fix in the respective PR, I am not seeing NoClassDefFoundError πŸ˜….

updated_schemtool_logs.txt

hm, in this case, one of the jars on the classpath must contain DatePartitionedLogger, otherwise it should fail, can you please check that?

abstractdog avatar Jun 16 '25 12:06 abstractdog

Yes, that is correct. After setting the hive.hook.proto.base-directory , i found another issue, which I filed under HIVE-29012 but even after the potential fix in the respective PR, I am not seeing NoClassDefFoundError πŸ˜…. updated_schemtool_logs.txt

hm, in this case, one of the jars on the classpath must contain DatePartitionedLogger, otherwise it should fail, can you please check that?

hmm, the following line in logs in schematool as well confirms that post hooks is getting triggered but in the classpath (attached above), using the find command with jar tvf , I am not able to find tez-protobuf-history-plugin jar or DatePartitionedLogger.class:

2025-06-16 17:34:20,489 INFO ql.Driver: Executing command(queryId=raghav_20250616173420_39d18e25-c007-483c-b692-0f6dfc39832f): SELECT t.SCHEMA_VERSION from VERSION t
2025-06-16 17:34:20,489 INFO hooks.HiveProtoLoggingHook: Received pre-hook notification for: raghav_20250616173420_39d18e25-c007-483c-b692-0f6dfc39832f
2025-06-16 17:34:20,490 INFO hooks.HiveProtoLoggingHook: Received post-hook notification for: raghav_20250616173420_39d18e25-c007-483c-b692-0f6dfc39832f
2025-06-16 17:34:20,490 INFO metadata.Hive: Dumping metastore api call timing information for : execution phase

I will also debug and see how schematool is working or if something else is happening altogether, but if you can request someone as well to test on hive4 setup that will be great, it will eliminate setup issues if any.

Aggarwal-Raghav avatar Jun 16 '25 12:06 Aggarwal-Raghav

Yes, that is correct. After setting the hive.hook.proto.base-directory , i found another issue, which I filed under HIVE-29012 but even after the potential fix in the respective PR, I am not seeing NoClassDefFoundError πŸ˜…. updated_schemtool_logs.txt

hm, in this case, one of the jars on the classpath must contain DatePartitionedLogger, otherwise it should fail, can you please check that?

hmm, the following line in logs in schematool as well confirms that post hooks is getting triggered but in the classpath (attached above), using the find command with jar tvf , I am not able to find tez-protobuf-history-plugin jar or DatePartitionedLogger.class:

2025-06-16 17:34:20,489 INFO ql.Driver: Executing command(queryId=raghav_20250616173420_39d18e25-c007-483c-b692-0f6dfc39832f): SELECT t.SCHEMA_VERSION from VERSION t
2025-06-16 17:34:20,489 INFO hooks.HiveProtoLoggingHook: Received pre-hook notification for: raghav_20250616173420_39d18e25-c007-483c-b692-0f6dfc39832f
2025-06-16 17:34:20,490 INFO hooks.HiveProtoLoggingHook: Received post-hook notification for: raghav_20250616173420_39d18e25-c007-483c-b692-0f6dfc39832f
2025-06-16 17:34:20,490 INFO metadata.Hive: Dumping metastore api call timing information for : execution phase

I will also debug and see how schematool is working or if something else is happening altogether, but if you can request someone as well to test on hive4 setup that will be great, it will eliminate setup issues if any.

We could only test and encounter this issue on a downstream cluster, which has diverged from Hive 4.

In this case, I can only rely on the fact that this patch resolved the problem in our environment, but I agreeβ€”a reproducible case on Hive master would definitely make the patch more trustworthy.

abstractdog avatar Jun 16 '25 12:06 abstractdog

@abstractdog Are you looking for this :-)

ayushsaxena@ayushsaxena apache-hive-4.1.0-SNAPSHOT-bin % bin/schematool -dbType hive -metaDbType derby  -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/ayushsaxena/code/hive/packaging/target/apache-hive-4.1.0-SNAPSHOT-bin/apache-hive-4.1.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.24.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/ayushsaxena/code/incubator/hadoop-3.4.1/hadoop-3.4.1-RC3/hadoop-3.4.1/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2025-06-16T20:51:52.848390Z main WARN The use of package scanning to locate Log4j plugins is deprecated.
Please remove the `packages` attribute from your configuration file.
See https://logging.apache.org/log4j/2.x/faq.html#package-scanning for details.
2025-06-16T20:51:52.908646Z main INFO Starting configuration org.apache.logging.log4j.core.config.properties.PropertiesConfiguration@56673b2c...
2025-06-16T20:51:52.908753Z main INFO Start watching for changes to jar:file:/Users/ayushsaxena/code/hive/packaging/target/apache-hive-4.1.0-SNAPSHOT-bin/apache-hive-4.1.0-SNAPSHOT-bin/lib/hive-common-4.1.0-SNAPSHOT.jar!/hive-log4j2.properties every 0 seconds
2025-06-16T20:51:52.908952Z main INFO Configuration org.apache.logging.log4j.core.config.properties.PropertiesConfiguration@56673b2c started.
2025-06-16T20:51:52.910037Z main INFO Stopping configuration org.apache.logging.log4j.core.config.DefaultConfiguration@3d5c822d...
2025-06-16T20:51:52.910209Z main INFO Configuration org.apache.logging.log4j.core.config.DefaultConfiguration@3d5c822d stopped.
Initializing the schema to: 4.1.0
Metastore connection URL:	 jdbc:hive2://?hive.conf.restricted.list=;hive.security.authorization.sqlstd.confwhitelist=.*;hive.security.authorization.sqlstd.confwhitelist.append=.*;hive.security.authorization.enabled=false;hive.metastore.uris=;hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory;hive.support.concurrency=false;hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;hive.metastore.rawstore.impl=org.apache.hadoop.hive.metastore.ObjectStore
Metastore connection Driver :	 org.apache.hive.jdbc.HiveDriver
Metastore connection User:	 APP
Hive Session ID = bd67b1bc-3dcb-4645-99da-d29d516920d5
Starting metastore schema initialization to 4.1.0
Initialization script hive-schema-4.1.0.hive.sql
Hive Session ID = 9479a44c-4250-4269-90a1-368b1c70f8aa
2025-06-16T20:51:56.103786Z main ERROR Log4j2 ConfigurationScheduler attempted to increment scheduled items after start


FAILED: Hive Internal Error: java.lang.NoClassDefFoundError(org/apache/tez/dag/history/logging/proto/DatePartitionedLogger)
java.lang.NoClassDefFoundError: org/apache/tez/dag/history/logging/proto/DatePartitionedLogger
	at org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook$EventLogger.<init>(HiveProtoLoggingHook.java:207)
	at org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook$EventLogger.getInstance(HiveProtoLoggingHook.java:545)
	at org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook.run(HiveProtoLoggingHook.java:557)
	at org.apache.hadoop.hive.ql.HookRunner.invokeGeneralHook(HookRunner.java:269)
	at org.apache.hadoop.hive.ql.HookRunner.runPreHooks(HookRunner.java:248)
	at org.apache.hadoop.hive.ql.Executor.preExecutionActions(Executor.java:153)
	at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:101)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:345)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:189)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:142)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:137)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:190)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:234)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334)
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
	at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:354)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.ClassNotFoundException: org.apache.tez.dag.history.logging.proto.DatePartitionedLogger
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
	... 23 more

Error: Error while compiling statement: FAILED: Hive Internal Error: java.lang.NoClassDefFoundError(org/apache/tez/dag/history/logging/proto/DatePartitionedLogger); Query ID: ayushsaxena_20250617022156_76f0c477-9e01-47ae-a564-c3ab0a1f9472 (state=08S01,code=12)
Schema initialization FAILED! Metastore state would be inconsistent!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
Use --verbose for detailed stacktrace.
*** schemaTool failed ***

ayushtkn avatar Jun 17 '25 11:06 ayushtkn

bin/schematool -dbType hive -metaDbType derby -initSchema

nice repro! yes, this is what we hit downstream too

abstractdog avatar Jun 17 '25 11:06 abstractdog

Thanks @ayushtkn, for provinding the repro. @abstractdog, I debugged and found the reason why it is not happening in my setup.

Setting set -x in schematool.sh is not the right way to identify the classpath. I modified the HiveSchemaTool#main() to include

String classpath = System.getProperty("java.class.path");
System.out.println("Classpath: " + classpath);

there I can see the tez-protobuf-history-plugin jar. This is coming because in my setup I have set the following in $HADOOP_HOME/etc/hadoop/hadoop-env.sh

export HADOOP_CLASSPATH=${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*:$HADOOP_CLASSPATH

Attaching updated classpath for reference: classpath.txt

Aggarwal-Raghav avatar Jun 17 '25 14:06 Aggarwal-Raghav

LGTM +1

Aggarwal-Raghav avatar Jun 17 '25 14:06 Aggarwal-Raghav