Oracle connector returns lowercased names of schemas and table columns
Affected module: OpenMetadata Oracle connector
Describe the bug: Dear all, when using the Oracle connector in OpenMetadata 1.2.3, we found that schema and column names are all converted to lowercase. Could you kindly help us resolve this problem? For further use we need all names exactly as they appear in the source. Thank you!
Expected behavior: All names taken from the source should keep their original case, with no lower- or uppercasing.
Version:
- OpenMetadata version: 1.2.3
Hey @brown3007, can you upgrade to the latest stable version and let us know if you are still experiencing the same issue?
Hi @TeddyCr, I have upgraded to 1.3.1 and now the Oracle connector ingestion refuses to work at all. Testing the connection succeeds, but when the ingestion task runs it fails in Airflow, and in the OpenMetadata UI the ingestion is shown as never started. In 1.2.3 it works fine, except for the lowercased column names described above.
Here are my logs from Airflow:
************
*** Found local files:
*** * /opt/airflow/logs/dag_id=ora_test_metadata/run_id=manual__2024-03-21T15:45:42+00:00/task_id=ingestion_task/attempt=1.log
[2024-03-21, 18:45:43 MSK] {taskinstance.py:1159} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: ora_test_metadata.ingestion_task manual__2024-03-21T15:45:42+00:00 [queued]>
[2024-03-21, 18:45:43 MSK] {taskinstance.py:1159} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: ora_test_metadata.ingestion_task manual__2024-03-21T15:45:42+00:00 [queued]>
[2024-03-21, 18:45:43 MSK] {taskinstance.py:1361} INFO - Starting attempt 1 of 1
[2024-03-21, 18:45:43 MSK] {taskinstance.py:1382} INFO - Executing <Task(CustomPythonOperator): ingestion_task> on 2024-03-21 15:45:42+00:00
[2024-03-21, 18:45:43 MSK] {standard_task_runner.py:57} INFO - Started process 901 to run task
[2024-03-21, 18:45:43 MSK] {standard_task_runner.py:84} INFO - Running: ['airflow', 'tasks', 'run', 'ora_test_metadata', 'ingestion_task', 'manual__2024-03-21T15:45:42+00:00', '--job-id', '9', '--raw', '--subdir', 'DAGS_FOLDER/ora_test_metadata.py', '--cfg-path', '/tmp/tmpk7_5idsr']
[2024-03-21, 18:45:43 MSK] {standard_task_runner.py:85} INFO - Job 9: Subtask ingestion_task
[2024-03-21, 18:45:44 MSK] {task_command.py:416} INFO - Running <TaskInstance: ora_test_metadata.ingestion_task manual__2024-03-21T15:45:42+00:00 [running]> on host ************
[2024-03-21, 18:45:44 MSK] {taskinstance.py:1662} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='admin' AIRFLOW_CTX_DAG_ID='ora_test_metadata' AIRFLOW_CTX_TASK_ID='ingestion_task' AIRFLOW_CTX_EXECUTION_DATE='2024-03-21T15:45:42+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='manual__2024-03-21T15:45:42+00:00'
[2024-03-21, 18:45:44 MSK] {server_mixin.py:66} INFO - OpenMetadata client running with Server version [1.3.1] and Client version [1.3.1.3]
[2024-03-21, 18:45:44 MSK] {connection.py:114} INFO - Initializing Oracle thick client at /instantclient
[2024-03-21, 18:45:45 MSK] {test_connections.py:198} WARNING - GetTables-(oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist
[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */
SELECT table_name FROM DBA_TABLES WHERE
nvl(tablespace_name, 'no tablespace') NOT IN ('SYSTEM', 'SYSAUX') AND
OWNER = :owner
AND IOT_NAME IS NULL
AND DURATION IS NULL
AND TABLE_NAME NOT IN
(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)
]
[parameters: {'owner': 'READER'}]
(Background on this error at: https://sqlalche.me/e/14/4xp6)
[2024-03-21, 18:45:45 MSK] {test_connections.py:215} INFO - Test connection results:
[2024-03-21, 18:45:45 MSK] {test_connections.py:216} INFO - failed=['\'GetTables\': This is a mandatory step and we won\'t be able to extract necessary metadata. Failed due to: (oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist\n[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */\n\nSELECT table_name FROM DBA_TABLES WHERE \nnvl(tablespace_name, \'no tablespace\') NOT IN (\'SYSTEM\', \'SYSAUX\') AND \nOWNER = :owner \nAND IOT_NAME IS NULL \nAND DURATION IS NULL\nAND TABLE_NAME NOT IN \n(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)\n]\n[parameters: {\'owner\': \'READER\'}]\n(Background on this error at: https://sqlalche.me/e/14/4xp6)'] success=["'CheckAccess': Pass", "'GetSchemas': Pass", "'GetViews': Pass"] warning=[]
[2024-03-21, 18:45:45 MSK] {taskinstance.py:1937} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 192, in execute
return_value = self.execute_callable()
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 209, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/openmetadata_managed_apis/workflows/ingestion/common.py", line 206, in metadata_ingestion_workflow
workflow = MetadataWorkflow.create(config)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 102, in create
return cls(config)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 78, in __init__
super().__init__(
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/base.py", line 106, in __init__
self.post_init()
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 97, in post_init
self.set_steps()
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/metadata.py", line 35, in set_steps
self.source = self._get_source()
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/metadata.py", line 60, in _get_source
source: Source = source_class.create(self.config.source.dict(), self.metadata)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/oracle/metadata.py", line 114, in create
return cls(config, metadata)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py", line 109, in __init__
self.test_connection()
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/database_service.py", line 583, in test_connection
test_connection_fn(self.metadata, self.connection_obj, self.service_connection)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/oracle/connection.py", line 139, in test_connection
test_connection_db_common(
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 317, in test_connection_db_common
test_connection_steps(
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 265, in test_connection_steps
return timeout(timeout_seconds)(_test_connection_steps)(
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/utils/timeout.py", line 54, in inner
result = fn(*args, **kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 104, in _test_connection_steps
_test_connection_steps_during_ingestion(steps=steps)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 219, in _test_connection_steps_during_ingestion
raise SourceConnectionException(
metadata.ingestion.connections.test_connections.SourceConnectionException: Some steps failed when testing the connection: [failed=['\'GetTables\': This is a mandatory step and we won\'t be able to extract necessary metadata. Failed due to: (oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist\n[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */\n\nSELECT table_name FROM DBA_TABLES WHERE \nnvl(tablespace_name, \'no tablespace\') NOT IN (\'SYSTEM\', \'SYSAUX\') AND \nOWNER = :owner \nAND IOT_NAME IS NULL \nAND DURATION IS NULL\nAND TABLE_NAME NOT IN \n(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)\n]\n[parameters: {\'owner\': \'READER\'}]\n(Background on this error at: https://sqlalche.me/e/14/4xp6)'] success=["'CheckAccess': Pass", "'GetSchemas': Pass", "'GetViews': Pass"] warning=[]]
[2024-03-21, 18:45:45 MSK] {taskinstance.py:1400} INFO - Marking task as FAILED. dag_id=ora_test_metadata, task_id=ingestion_task, execution_date=20240321T154542, start_date=20240321T154543, end_date=20240321T154545
[2024-03-21, 18:45:45 MSK] {common.py:283} INFO - Sending failed status from callback...
[2024-03-21, 18:45:45 MSK] {server_mixin.py:66} INFO - OpenMetadata client running with Server version [1.3.1] and Client version [1.3.1.3]
[2024-03-21, 18:45:45 MSK] {common.py:289} INFO - Sending status to Ingestion Pipeline tis_test.ora_test_metadata
[2024-03-21, 18:45:45 MSK] {taskinstance.py:1705} ERROR - Error when executing partial callback
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 1518, in _run_raw_task
self._execute_task_with_callbacks(context, test_mode, session=session)
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 1681, in _execute_task_with_callbacks
result = self._execute_task(context, task_orig)
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 1744, in _execute_task
result = execute_callable(context=context, **execute_callable_kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 192, in execute
return_value = self.execute_callable()
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 209, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/openmetadata_managed_apis/workflows/ingestion/common.py", line 206, in metadata_ingestion_workflow
workflow = MetadataWorkflow.create(config)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 102, in create
return cls(config)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 78, in __init__
super().__init__(
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/base.py", line 106, in __init__
self.post_init()
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/ingestion.py", line 97, in post_init
self.set_steps()
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/metadata.py", line 35, in set_steps
self.source = self._get_source()
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/workflow/metadata.py", line 60, in _get_source
source: Source = source_class.create(self.config.source.dict(), self.metadata)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/oracle/metadata.py", line 114, in create
return cls(config, metadata)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/common_db_source.py", line 109, in __init__
self.test_connection()
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/database_service.py", line 583, in test_connection
test_connection_fn(self.metadata, self.connection_obj, self.service_connection)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/database/oracle/connection.py", line 139, in test_connection
test_connection_db_common(
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 317, in test_connection_db_common
test_connection_steps(
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 265, in test_connection_steps
return timeout(timeout_seconds)(_test_connection_steps)(
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/utils/timeout.py", line 54, in inner
result = fn(*args, **kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 104, in _test_connection_steps
_test_connection_steps_during_ingestion(steps=steps)
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/connections/test_connections.py", line 219, in _test_connection_steps_during_ingestion
raise SourceConnectionException(
metadata.ingestion.connections.test_connections.SourceConnectionException: Some steps failed when testing the connection: [failed=['\'GetTables\': This is a mandatory step and we won\'t be able to extract necessary metadata. Failed due to: (oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist\n[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */\n\nSELECT table_name FROM DBA_TABLES WHERE \nnvl(tablespace_name, \'no tablespace\') NOT IN (\'SYSTEM\', \'SYSAUX\') AND \nOWNER = :owner \nAND IOT_NAME IS NULL \nAND DURATION IS NULL\nAND TABLE_NAME NOT IN \n(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)\n]\n[parameters: {\'owner\': \'READER\'}]\n(Background on this error at: https://sqlalche.me/e/14/4xp6)'] success=["'CheckAccess': Pass", "'GetSchemas': Pass", "'GetViews': Pass"] warning=[]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 1702, in _run_finished_callback
callback(context)
File "/home/airflow/.local/lib/python3.10/site-packages/openmetadata_managed_apis/workflows/ingestion/common.py", line 296, in send_failed_status_callback
pipeline_status.endDate = datetime.now().timestamp() * 1000
AttributeError: 'NoneType' object has no attribute 'endDate'
[2024-03-21, 18:45:45 MSK] {standard_task_runner.py:104} ERROR - Failed to execute job 9 for task ingestion_task (Some steps failed when testing the connection: [failed=['\'GetTables\': This is a mandatory step and we won\'t be able to extract necessary metadata. Failed due to: (oracledb.exceptions.DatabaseError) ORA-00942: table or view does not exist\n[SQL: /* {"app": "OpenMetadata", "version": "1.3.1.3"} */\n\nSELECT table_name FROM DBA_TABLES WHERE \nnvl(tablespace_name, \'no tablespace\') NOT IN (\'SYSTEM\', \'SYSAUX\') AND \nOWNER = :owner \nAND IOT_NAME IS NULL \nAND DURATION IS NULL\nAND TABLE_NAME NOT IN \n(SELECT mview_name FROM DBA_MVIEWS WHERE owner = :owner)\n]\n[parameters: {\'owner\': \'READER\'}]\n(Background on this error at: https://sqlalche.me/e/14/4xp6)'] success=["'CheckAccess': Pass", "'GetSchemas': Pass", "'GetViews': Pass"] warning=[]]; 901)
[2024-03-21, 18:45:45 MSK] {local_task_job_runner.py:228} INFO - Task exited with return code 1
[2024-03-21, 18:45:45 MSK] {taskinstance.py:2778} INFO - 0 downstream tasks scheduled from follow-on schedule check
@brown3007 this looks like a permission issue. Can you grant the ingestion user access to the DBA_* views,
e.g. something like `grant select any dictionary to john;`, and then try testing it again?
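For reference, here is a minimal sketch of a standalone probe that runs the same kind of dictionary-view check as the connector's failing `GetTables` step. It is an illustration, not OpenMetadata code: the `python-oracledb` driver and the `ORA_USER`/`ORA_PASSWORD`/`ORA_DSN` environment variables are assumptions about your setup.

```python
# Sketch: verify the ingestion user can read the DBA_* dictionary views.
# Assumptions: python-oracledb is installed; ORA_USER / ORA_PASSWORD / ORA_DSN
# are set in the environment. Adapt names to your own setup.
import os

# Mirrors the connector's GetTables probe against DBA_TABLES: if the user
# lacks SELECT ANY DICTIONARY (or SELECT_CATALOG_ROLE), this raises ORA-00942.
PROBE_SQL = "SELECT COUNT(*) FROM DBA_TABLES WHERE owner = :owner"

def check_dictionary_access(owner: str) -> int:
    """Return the number of tables visible for `owner` via DBA_TABLES."""
    import oracledb  # deferred import so the sketch parses without the driver
    with oracledb.connect(
        user=os.environ["ORA_USER"],
        password=os.environ["ORA_PASSWORD"],
        dsn=os.environ["ORA_DSN"],
    ) as conn:
        with conn.cursor() as cur:
            cur.execute(PROBE_SQL, owner=owner)
            return cur.fetchone()[0]

if __name__ == "__main__" and "ORA_DSN" in os.environ:
    print(check_dictionary_access("READER"))
```

If this raises ORA-00942 for the ingestion user while the same query against ALL_TABLES succeeds, the missing grant is the cause.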
@ayush-shah the connection options are the same in 1.3.1 and 1.2.3, but in 1.2.3 it works and I can see tables in OpenMetadata, while in 1.3.1 it doesn't. Were there any changes to the Oracle connector between 1.2.3 and 1.3.1? If so, it could be a permission issue; but if the connector stayed the same, the problem isn't on the Oracle side.
@brown3007 we made query changes: the connector previously queried the all_* dictionary views and was migrated to the dba_* views.
Here is the history of commits touching that file over the last couple of months: https://github.com/open-metadata/OpenMetadata/commits/main/ingestion/src/metadata/ingestion/source/database/oracle/queries.py
Closing this, as it doesn't break anything. We will reopen if there are better suggestions for handling complex migrations.
@ayush-shah I am currently using OpenMetadata 1.4.4, but the issue with Oracle field-name case sensitivity still exists. Will a fix for this be merged in a future version?
@ayush-shah Hello! After upgrading to 1.4.4 the problem is not resolved. We found out the following: when the Oracle connector uses the SQLAlchemy Inspector, we hit a case-sensitivity problem. Case-insensitive table/schema/column names in Oracle are all stored in uppercase, but SQLAlchemy reports them all lowercased.
We cannot simply change the case ourselves, because some of our integrations need the names exactly as they are in the source.
The bug we found:
In Oracle we have two tables whose names differ only in case (Oracle allows this):
But when we look at them in OpenMetadata:
they appear with completely identical structure and (probably) changed casing.
Could you please look into this case and suggest a workaround?
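The collision above can be explained by SQLAlchemy's identifier-normalization convention for Oracle: unquoted (case-insensitive) names are stored uppercase in the Oracle catalog, and the Oracle dialect maps all-uppercase names to lowercase as its canonical form, while names containing lowercase letters (which must have been quoted at creation) pass through unchanged. Below is a stdlib-only sketch of that rule, modeled on the dialect's `normalize_name` but deliberately simplified, not the actual SQLAlchemy code:

```python
# Simplified model of SQLAlchemy's Oracle name normalization (illustrative,
# not the real dialect implementation). Oracle stores unquoted identifiers
# uppercase; the dialect lowercases all-uppercase names as its canonical
# form, while mixed/lower-case (quoted) identifiers are kept as-is.
def normalize_name(name: str) -> str:
    if name.upper() == name:   # case-insensitive identifier, e.g. MYTABLE
        return name.lower()    # canonical lowercase form
    return name                # case-sensitive (quoted) identifier unchanged

# Two distinct Oracle tables: unquoted MYTABLE and quoted "mytable".
catalog_names = ["MYTABLE", "mytable"]
normalized = [normalize_name(n) for n in catalog_names]
print(normalized)  # -> ['mytable', 'mytable']: both collapse to one name
```

This is why the two tables look identical in OpenMetadata: after normalization both carry the same lowercase name, and the original casing from the catalog is lost unless the connector preserves the raw identifier alongside the normalized one.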