OpenMetadata icon indicating copy to clipboard operation
OpenMetadata copied to clipboard

Hive Metadata Ingestion Fails on MySQL < 8.0 Due to Unsupported CTE Syntax in Metastore Dialect

Open Hql20160303 opened this issue 4 months ago • 5 comments

Affected module Ingestion Framework

Describe the bug When OpenMetadata connects to Hive using MySQL as the metastore, it fails to ingest Hive table and column metadata due to a SQL syntax error. The root cause lies in the file OpenMetadata/ingestion/src/metadata/ingestion/source/database/hive/metastore_dialects/mysql/dialect.py, which generates and uses MySQL Common Table Expressions (CTE) in its queries. However, MySQL versions prior to 8.0 do not support CTE syntax, leading to SQL execution exceptions during ingestion. While reading Hive databases works correctly, the process breaks when attempting to read tables and columns.

To Reproduce

Configure OpenMetadata (version 1.19.0) to ingest metadata from a Hive environment where the metastore is backed by MySQL (version < 8.0). Run the metadata ingestion workflow. Observe that databases are listed successfully, but table and column metadata ingestion fails. Check the logs — you will see a SQL syntax error related to the WITH clause (CTE), indicating unsupported syntax in the MySQL version being used.

Expected behavior OpenMetadata should successfully ingest Hive table and column metadata regardless of MySQL version, or at minimum, provide backward-compatible SQL queries for MySQL versions earlier than 8.0. The ingestion should not rely on CTE syntax when targeting MySQL < 8.0.

Version:

  • OS: Linux、Docker
  • Python version:
  • OpenMetadata version: 1.19.0
  • OpenMetadata Ingestion package version: openmetadata-ingestion==1.19.0

Additional context The issue is specifically triggered in the Hive metastore dialect implementation for MySQL, where modern SQL constructs like CTEs are used without version checks or fallbacks. Since many production Hive metastores still run on MySQL 5.7 or similar, this limits compatibility. A fix should either:

Detect MySQL version and use compatible SQL dialects accordingly, or Rewrite the CTE-based queries using subqueries or temporary constructs for broader compatibility.

Related file:

ingestion/src/metadata/ingestion/source/database/hive/metastore_dialects/mysql/dialect.py

Hql20160303 avatar Aug 21 '25 02:08 Hql20160303

I encountered the same problem.

xj90713 avatar Oct 29 '25 14:10 xj90713

这是来自QQ邮箱的假期自动回复邮件。   您好,我最近正在休假中,无法亲自回复您的邮件。我将在假期结束后,尽快给您回复。

Hql20160303 avatar Oct 29 '25 14:10 Hql20160303

I encountered the same problem.

and could not collect hive table column info

need to fix it ...

darrkz avatar Oct 30 '25 01:10 darrkz

I encountered the same problem, hope to fix

zhouhongyu888 avatar Nov 09 '25 03:11 zhouhongyu888

I encountered the same problem.

and could not collect hive table column info

need to fix it ...

Excuse me, I want to known has the problem been fixed?

zhouhongyu888 avatar Nov 14 '25 02:11 zhouhongyu888