
OpenMetadata profiler throws error when managing profiling for data of type array or json

Open fikrifikar opened this issue 3 years ago • 2 comments

Affected module Profiling fails with an error, so the profiling process always stops partway through.

Describe the bug When we ingest profile data from Google BigQuery, the profiler reports an error status. The cause is that several tables contain ARRAY/JSON columns: whenever a table containing one of those data types is profiled, the profiler reports an error status.

To Reproduce (screenshots not preserved)

Expected behavior A clear and concise description of what you expected to happen.

Version:

  • OS: Windows
  • Python version: 3.8
  • OpenMetadata version: 0.11.3
  • OpenMetadata Ingestion package version: openmetadata-ingestion[bigquery]==0.11.3

fikrifikar avatar Aug 01 '22 06:08 fikrifikar

Check whether the community will work on this: https://openmetadata.slack.com/archives/C02B6955S4S/p1659434355092819

Otherwise, we'll need to start planning to pick this up and deliver the fix for 0.12.

pmbrull avatar Aug 04 '22 07:08 pmbrull

https://openmetadata.slack.com/archives/C02B6955S4S/p1659636703811829

The ORM profiler is failing for one table for me:

TypeError: __init__() missing 1 required positional argument: 'item_type'

File "site-packages/metadata/orm_profiler/api/workflow.py", line 221, in execute
profile_and_tests: ProfilerResponse = self.processor.process(
File "site-packages/metadata/orm_profiler/processor/orm_profiler.py", line 588, in process
orm_table = ometa_to_orm(table=record, metadata=self.metadata)
File "site-packages/metadata/orm_profiler/orm/converter.py", line 120, in ometa_to_orm
cols = {
File "site-packages/metadata/orm_profiler/orm/converter.py", line 121, in <dictcomp>
str(col.name.__root__): build_orm_col(idx, col, table.serviceType)

The table it is presumably failing for has the following structure:


create table mytable
(
    string_arraya            text[],
    string_arrayb            text[],
    id                       text,
    valid_from               date,
    valid_to                 date,
    int_array                bigint[],
    __load_ts                timestamp,
    __run_id                 text
);

Many tables are profiled correctly. This one fails. 

geoHeil avatar Aug 04 '22 18:08 geoHeil
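
The traceback above ends inside build_orm_col with a TypeError about a missing 'item_type' argument. One plausible reading is that an array column type is being instantiated without its element type; SQLAlchemy's ARRAY type, for instance, takes item_type as a required constructor argument. The following is a minimal sketch of that failure mode and one possible workaround. It does not use OpenMetadata or SQLAlchemy code: the classes Text and Array, the TYPE_MAP lookup, and both builder functions are hypothetical stand-ins, and the fallback to a text element type is an assumption.

```python
# Hypothetical stand-ins; these are not OpenMetadata or SQLAlchemy classes.
class Text:
    """Scalar column type that needs no constructor arguments."""


class Array:
    """Array column type that must be told its element type."""

    def __init__(self, item_type):
        self.item_type = item_type


# Hypothetical lookup from a source data type name to a column type class.
TYPE_MAP = {"TEXT": Text, "ARRAY": Array}


def build_type_naive(data_type):
    # Instantiates every type with no arguments, which breaks for Array.
    return TYPE_MAP[data_type]()


def build_type(data_type, item_type=None):
    # Passes an element type through when the column is an array.
    cls = TYPE_MAP[data_type]
    if cls is Array:
        # Assumption: fall back to Text when the element type is unknown.
        # Nested arrays are out of scope for this sketch.
        element_cls = TYPE_MAP.get(item_type, Text)
        return cls(element_cls())
    return cls()


try:
    build_type_naive("ARRAY")
except TypeError as exc:
    # The message names the missing 'item_type' argument.
    print(f"naive builder failed: {exc}")

fixed = build_type("ARRAY", item_type="TEXT")
print(type(fixed.item_type).__name__)
```

Whether a text fallback is acceptable depends on which metrics the profiler computes for array columns; skipping such columns entirely would be another option.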