OpenMetadata [WIP] Fixes <issue to open>: optimizing table profiler process for BQ

Open felipebmendes opened this issue 1 year ago • 1 comments

Describe your changes:

Fixes

Context: During the profile ingestion process, a separate query is executed for each table in the BigQuery project to retrieve metrics like rowCount, sizeInBytes, and columnCount. This approach leads to significant performance issues due to constraints like concurrent job limits and available slots in BigQuery.

Current Behavior: The profile ingestion process retrieves table-level metrics by executing one query per table. Although these queries already read from the __TABLES__ metadata table, issuing them one at a time is inefficient for projects with many tables.

Proposed Solution: Modify the profile ingestion process to execute a single query on __TABLES__ that retrieves the required metrics for all tables in a project or schema. The results of this query would be cached, and the subsequent table iteration would read each table's metrics from the cache rather than fetching them individually.
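A minimal sketch of the caching idea, not the actual change in this PR. It assumes the "TABLES table" is BigQuery's per-dataset `__TABLES__` metadata view, which would mean one query per dataset rather than one per table. `TableMetrics`, `DatasetMetricsCache`, and the `run_query` hook are hypothetical names introduced for illustration. Column count is left out because, as far as I know, `__TABLES__` does not expose it and it would need another source.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Mapping, Optional, Tuple


@dataclass(frozen=True)
class TableMetrics:
    row_count: int
    size_bytes: int


# Hypothetical hook: any callable that runs a SQL string and yields rows as mappings.
QueryRunner = Callable[[str], Iterable[Mapping[str, Any]]]


class DatasetMetricsCache:
    """Fetch metrics for every table in a dataset once, then serve lookups from memory."""

    def __init__(self, run_query: QueryRunner) -> None:
        self._run_query = run_query
        self._by_dataset: Dict[Tuple[str, str], Dict[str, TableMetrics]] = {}

    def _load(self, project: str, dataset: str) -> Dict[str, TableMetrics]:
        # Assumption: __TABLES__ exposes table_id, row_count and size_bytes.
        sql = (
            "SELECT table_id, row_count, size_bytes "
            f"FROM `{project}.{dataset}.__TABLES__`"
        )
        return {
            str(row["table_id"]): TableMetrics(
                row_count=int(row["row_count"]),
                size_bytes=int(row["size_bytes"]),
            )
            for row in self._run_query(sql)
        }

    def get(self, project: str, dataset: str, table: str) -> Optional[TableMetrics]:
        key = (project, dataset)
        # Only the first lookup for a dataset issues a query; later ones hit the cache.
        if key not in self._by_dataset:
            self._by_dataset[key] = self._load(project, dataset)
        return self._by_dataset[key].get(table)
```

With the `google-cloud-bigquery` client, `run_query` could plausibly be `lambda sql: client.query(sql).result()`. During table iteration the profiler would then call `cache.get(project, dataset, table)` instead of submitting a new job, so the number of BigQuery jobs scales with datasets instead of tables.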

Type of change:

[ ] Bug fix
[X] Improvement
[ ] New feature
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] Documentation

Checklist:

[x] I have read the CONTRIBUTING document.
[ ] My PR title is Fixes <issue-number>: <short explanation>
[ ] I have commented on my code, particularly in hard-to-understand areas.
[ ] For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Aug 21 '24 02:08 felipebmendes

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

Aug 21 '24 02:08 github-actions[bot]

Is this stale?

Jul 28 '25 04:07 AntoineGlacet

> Is this stale?

Looks like so. I'll be closing the PR. @AntoineGlacet feel free to pick it up

Jul 28 '25 08:07 pmbrull
