OpenMetadata icon indicating copy to clipboard operation
OpenMetadata copied to clipboard

Oracle connector returns lowercased names of schemas and table columns

Open brown3007 opened this issue 1 year ago • 5 comments

Affected module OpenMetadata Oracle connector

Describe the bug Dear all, When using Oracle connector in OpenMetadata 1.2.3 found that schema and column names are all turned to lowercase. Could you kindly help us to resolve this problem? For further use we need all names as they are in the source. Thank you!

Expected behavior All names taken from source must be in the same case, no lower- or uppercasing.

Version:

  • OpenMetadata version: 1.2.3

brown3007 avatar Mar 06 '24 12:03 brown3007

Hey @brown3007 can you upgarde to latest stable version and let us know if you are still experiencing the same issue.

TeddyCr avatar Mar 21 '24 10:03 TeddyCr

Hi @TeddyCr I have upgraded to 1.3.1 and Oracle connector ingestion refuses to work at all Testing connection is OK, but when trying to run ingestion task it fails in Airflow, and in OpenMetadata UI it is shown as the ingestion never started. In 1.2.3 it works fine except for lowercasing column names, as described above.

Here are my logs from Airflow:

************
*** Found local files:
***   * /opt/airflow/logs/dag_id=ora_test_metadata/run_id=manual__2024-03-21T15:45:42+00:00/task_id=ingestion_task/attempt=1.log
[2024-03-21, 18:45:43 MSK] {taskinstance.py:1159} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: ora_test_metadata.ingestion_task manual__2024-03-21T15:45:42+00:00 [queued]>
[2024-03-21, 18:45:43 MSK] {taskinstance.py:1159} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: ora_test_metadata.ingestion_task manual__2024-03-21T15:45:42+00:00 [queued]>
[2024-03-21, 18:45:43 MSK] {taskinstance.py:1361} INFO - Starting attempt 1 of 1
[2024-03-21, 18:45:43 MSK] {taskinstance.py:1382} INFO - Executing <Task(CustomPythonOperator): ingestion_task> on 2024-03-21 15:45:42+00:00
[2024-03-21, 18:45:43 MSK] {standard_task_runner.py:57} INFO - Started process 901 to run task
[2024-03-21, 18:45:43 MSK] {standard_task_runner.py:84} INFO - Running: ['airflow', 'tasks', 'run', 'ora_test_metadata', 'ingestion_task', 'manual__2024-03-21T15:45:42+00:00', '--job-id', '9', '--raw', '--subdir', 'DAGS_FOLDER/ora_test_metadata.py', '--cfg-path', '/tmp/tmpk7_5idsr']
[2024-03-21, 18:45:43 MSK] {standard_task_runner.py:85} INFO - Job 9: Subtask ingestion_task
[2024-03-21, 18:45:44 MSK] {task_command.py:416} INFO - Running <TaskInstance: ora_test_metadata.ingestion_task manual__2024-03-21T15:45:42+00:00 [running]> on host ************
[2024-03-21, 18:45:44 MSK] {taskinstance.py:1662} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='admin' AIRFLOW_CTX_DAG_ID='ora_test_metadata' AIRFLOW_CTX_TASK_ID='ingestion_task' AIRFLOW_CTX_EXECUTION_DATE='2024-03-21T15:45:42+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='manual__2024-03-21T15:45:42+00:00'
[2024-03-21, 18:45:44 MSK] {server_mixin.py:66} INFO - OpenMetadata client running with Server version [1.3.1] and Client version [1.3.1.3]
[2024-03-21, 18:45:44 MSK] {connection.py:114} INFO - Initializing Oracle thick client at /instantclient
[2024-03-21, 18:45:45 MSK] {test_connections.py:198} WARNING - GetTables-(oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist
[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */
SELECT table_name FROM DBA_TABLES WHERE 
nvl(tablespace_name, 'no tablespace') NOT IN ('SYSTEM', 'SYSAUX') AND 
OWNER = :owner  
AND IOT_NAME IS NULL 
AND DURATION IS NULL
AND TABLE_NAME NOT IN 
(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)
]
[parameters: {'owner': 'READER'}]
(Background on this error at: https://sqlalche.me/e/14/4xp6)
[2024-03-21, 18:45:45 MSK] {test_connections.py:215} INFO - Test connection results:
[2024-03-21, 18:45:45 MSK] {test_connections.py:216} INFO - failed=['\'GetTables\': This is a mandatory step and we won\'t be able to extract necessary metadata. Failed due to: (oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist\n[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */\n\nSELECT table_name FROM DBA_TABLES WHERE \nnvl(tablespace_name, \'no tablespace\') NOT IN (\'SYSTEM\', \'SYSAUX\') AND \nOWNER = :owner  \nAND IOT_NAME IS NULL \nAND DURATION IS NULL\nAND TABLE_NAME NOT IN \n(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)\n]\n[parameters: {\'owner\': \'READER\'}]\n(Background on this error at: https://sqlalche.me/e/14/4xp6)'] success=["'CheckAccess': Pass", "'GetSchemas': Pass", "'GetViews': Pass"] warning=[]
[2024-03-21, 18:45:45 MSK] {taskinstance.py:1937} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 192, in execute
    return_value = self.execute_callable()
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 209, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/openmetadata_managed_apis/workflows/ingestion/common.py", line 206, in metadata_ingestion_workflow
    workflow = MetadataWorkflow.create(config)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 102, in create
    return cls(config)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 78, in __init__
    super().__init__(
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/base.py", line 106, in __init__
    self.post_init()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 97, in post_init
    self.set_steps()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/metadata.py", line 35, in set_steps
    self.source = self._get_source()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/metadata.py", line 60, in _get_source
    source: Source = source_class.create(self.config.source.dict(), self.metadata)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/oracle/metadata.py", line 114, in create
    return cls(config, metadata)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py", line 109, in __init__
    self.test_connection()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/database_service.py", line 583, in test_connection
    test_connection_fn(self.metadata, self.connection_obj, self.service_connection)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/oracle/connection.py", line 139, in test_connection
    test_connection_db_common(
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 317, in test_connection_db_common
    test_connection_steps(
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 265, in test_connection_steps
    return timeout(timeout_seconds)(_test_connection_steps)(
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/utils/timeout.py", line 54, in inner
    result = fn(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 104, in _test_connection_steps
    _test_connection_steps_during_ingestion(steps=steps)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 219, in _test_connection_steps_during_ingestion
    raise SourceConnectionException(
metadata.ingestion.connections.test_connections.SourceConnectionException: Some steps failed when testing the connection: [failed=['\'GetTables\': This is a mandatory step and we won\'t be able to extract necessary metadata. Failed due to: (oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist\n[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */\n\nSELECT table_name FROM DBA_TABLES WHERE \nnvl(tablespace_name, \'no tablespace\') NOT IN (\'SYSTEM\', \'SYSAUX\') AND \nOWNER = :owner  \nAND IOT_NAME IS NULL \nAND DURATION IS NULL\nAND TABLE_NAME NOT IN \n(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)\n]\n[parameters: {\'owner\': \'READER\'}]\n(Background on this error at: https://sqlalche.me/e/14/4xp6)'] success=["'CheckAccess': Pass", "'GetSchemas': Pass", "'GetViews': Pass"] warning=[]]
[2024-03-21, 18:45:45 MSK] {taskinstance.py:1400} INFO - Marking task as FAILED. dag_id=ora_test_metadata, task_id=ingestion_task, execution_date=20240321T154542, start_date=20240321T154543, end_date=20240321T154545
[2024-03-21, 18:45:45 MSK] {common.py:283} INFO - Sending failed status from callback...
[2024-03-21, 18:45:45 MSK] {server_mixin.py:66} INFO - OpenMetadata client running with Server version [1.3.1] and Client version [1.3.1.3]
[2024-03-21, 18:45:45 MSK] {common.py:289} INFO - Sending status to Ingestion Pipeline tis_test.ora_test_metadata
[2024-03-21, 18:45:45 MSK] {taskinstance.py:1705} ERROR - Error when executing partial callback
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 1518, in _run_raw_task
    self._execute_task_with_callbacks(context, test_mode, session=session)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 1681, in _execute_task_with_callbacks
    result = self._execute_task(context, task_orig)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 1744, in _execute_task
    result = execute_callable(context=context, **execute_callable_kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 192, in execute
    return_value = self.execute_callable()
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 209, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/openmetadata_managed_apis/workflows/ingestion/common.py", line 206, in metadata_ingestion_workflow
    workflow = MetadataWorkflow.create(config)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 102, in create
    return cls(config)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 78, in __init__
    super().__init__(
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/base.py", line 106, in __init__
    self.post_init()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 97, in post_init
    self.set_steps()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/metadata.py", line 35, in set_steps
    self.source = self._get_source()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/metadata.py", line 60, in _get_source
    source: Source = source_class.create(self.config.source.dict(), self.metadata)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/oracle/metadata.py", line 114, in create
    return cls(config, metadata)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py", line 109, in __init__
    self.test_connection()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/database_service.py", line 583, in test_connection
    test_connection_fn(self.metadata, self.connection_obj, self.service_connection)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/oracle/connection.py", line 139, in test_connection
    test_connection_db_common(
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 317, in test_connection_db_common
    test_connection_steps(
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 265, in test_connection_steps
    return timeout(timeout_seconds)(_test_connection_steps)(
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/utils/timeout.py", line 54, in inner
    result = fn(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 104, in _test_connection_steps
    _test_connection_steps_during_ingestion(steps=steps)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 219, in _test_connection_steps_during_ingestion
    raise SourceConnectionException(
metadata.ingestion.connections.test_connections.SourceConnectionException: Some steps failed when testing the connection: [failed=['\'GetTables\': This is a mandatory step and we won\'t be able to extract necessary metadata. Failed due to: (oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist\n[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */\n\nSELECT table_name FROM DBA_TABLES WHERE \nnvl(tablespace_name, \'no tablespace\') NOT IN (\'SYSTEM\', \'SYSAUX\') AND \nOWNER = :owner  \nAND IOT_NAME IS NULL \nAND DURATION IS NULL\nAND TABLE_NAME NOT IN \n(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)\n]\n[parameters: {\'owner\': \'READER\'}]\n(Background on this error at: https://sqlalche.me/e/14/4xp6)'] success=["'CheckAccess': Pass", "'GetSchemas': Pass", "'GetViews': Pass"] warning=[]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 1702, in _run_finished_callback
    callback(context)
  File "/home/airflow/.local/lib/python3.10/site-packages/openmetadata_managed_apis/workflows/ingestion/common.py", line 296, in send_failed_status_callback
    pipeline_status.endDate = datetime.now().timestamp() * 1000
AttributeError: 'NoneType' object has no attribute 'endDate'
[2024-03-21, 18:45:45 MSK] {standard_task_runner.py:104} ERROR - Failed to execute job 9 for task ingestion_task (Some steps failed when testing the connection: [failed=['\'GetTables\': This is a mandatory step and we won\'t be able to extract necessary metadata. Failed due to: (oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist\n[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */\n\nSELECT table_name FROM DBA_TABLES WHERE \nnvl(tablespace_name, \'no tablespace\') NOT IN (\'SYSTEM\', \'SYSAUX\') AND \nOWNER = :owner  \nAND IOT_NAME IS NULL \nAND DURATION IS NULL\nAND TABLE_NAME NOT IN \n(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)\n]\n[parameters: {\'owner\': \'READER\'}]\n(Background on this error at: https://sqlalche.me/e/14/4xp6)'] success=["'CheckAccess': Pass", "'GetSchemas': Pass", "'GetViews': Pass"] warning=[]]; 901)
[2024-03-21, 18:45:45 MSK] {local_task_job_runner.py:228} INFO - Task exited with return code 1
[2024-03-21, 18:45:45 MSK] {taskinstance.py:2778} INFO - 0 downstream tasks scheduled from follow-on schedule check

brown3007 avatar Mar 22 '24 07:03 brown3007

@brown3007 it seems like a permission issue maybe, if you can give permissions to the user for DBA_* views, maybe something like a grant select any dictionary to john; something like this and try it testing it out again?

ayush-shah avatar Mar 22 '24 07:03 ayush-shah

@ayush-shah connection options are the same in 1.3.1 and in 1.2.3, but in 1.2.3 it works and I can see tables in OpenMetadata, but in 1.3.1 it doesn't Were there any changes for Oracle connector between 1.2.3 and 1.3.1? If so, it can be permission issue, but if connector stays the same the problem isn't on the Oracle side

brown3007 avatar Mar 22 '24 07:03 brown3007

@brown3007 we have made query changes where previously it was all_, we migrated it to dba_ Here is the history of commits that were changed in couple of months: https://github.com/open-metadata/OpenMetadata/commits/main/ingestion/src/metadata/ingestion/source/database/oracle/queries.py

ayush-shah avatar Mar 22 '24 07:03 ayush-shah

Closing this, as it doesn't break anything..we will reopen if there seems be better suggestions to handle complex migrations.

ayush-shah avatar Apr 29 '24 13:04 ayush-shah

@ayush-shah I am currently using version 1.4.4 of OpenMetadata, but the issue with Oracle field name case sensitivity still exists. Will this fix be merged into the new version in the future?

zchunhai avatar Aug 22 '24 02:08 zchunhai

@ayush-shah Hello! After upgrading to 1.4.4 the problem did not resolve. Found out the following: When using SQLAlchemy Inspector in Oracle connector we face a problem with case sensitivity/insensitivity If we check Oracle case insensitive table/schema/column names, they all will be uppercase SQLAlchemy sees all them lowercased

But we cannot just change case if we need some integrations

The bug we found: In Oracle we have two tables with same name except for case (Oracle allows us to do so): orcl3 orcl4

But if we see them in OpenMetadata: orcl1 orcl2

They have completely similar structure and (probably) changed casing.

Could you please check this case and find some workarounds to solve the problem?

brown3007 avatar Aug 30 '24 12:08 brown3007