connector-x icon indicating copy to clipboard operation
connector-x copied to clipboard

Not able to load text column from sqlite as blob type

Open houqp opened this issue 2 years ago • 3 comments

What language are you using?

Python

What version are you using?

0.3.1

What database are you using?

sqlite

What dataframe are you using?

N/A

Can you describe your bug?

Querying a sqlite db results in the following error:

>>> r = cx.read_sql("sqlite://test_data/sqlite_blob/sample.db", "SELECT * FROM oc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/houqp/Envs/roapi/lib/python3.8/site-packages/connectorx/__init__.py", line 224, in read_sql
    result = _read_sql(
RuntimeError: Invalid column type Blob at index: 9, name: raw_metadata

What are the steps to reproduce the behavior?

  1. Download sample db file: https://github.com/roapi/roapi/files/10502653/sample.db.zip
  2. Try to load the db using cx.read_sql("sqlite://test_data/sqlite_blob/sample.db", "SELECT * FROM oc")
Example query / code
cx.read_sql("sqlite://test_data/sqlite_blob/sample.db", "SELECT * FROM oc")

What is the error?

RuntimeError: Invalid column type Blob at index: 9, name: raw_metadata

houqp avatar Jan 28 '23 10:01 houqp

Hi @houqp , thanks for the sample data. Since sqlite adopts dynamic type system, there are two kind of types for its data: a declarative type (the one specified for a column when creating the table) and a data type (the true data value type), and these two types might not be able to matched.

The reason for the error is because we take the declarative type TEXT prior to data type BLOB. That is, when we fetch the metadata of each column's type, we first match its declarative type. If it is not defined (maybe because the data is computed from other columns) then we check on the data type itself. The reason we don't use data type directly is because we cannot infer data type if the data is NULL. Related logic can be found here.

wangxiaoying avatar Feb 03 '23 18:02 wangxiaoying

Could we change the metadata at runtime instead when we detected the data type doesn't match with the declarative type?

houqp avatar Feb 04 '23 22:02 houqp

Could we change the metadata at runtime instead when we detected the data type doesn't match with the declarative type?

It might require some work. Since in our current workflow we pre-allocate the final dataframe before really fetching any data. It may also introduce runtime error if there are type conflict.

wangxiaoying avatar Feb 10 '23 02:02 wangxiaoying