Add Support for Tableau NUMERIC types
Hi pantab team,
When trying to import a parquet file, I got an annoying error on Numeric columns:
NUMERIC(8, 2) Nullability.NULLABLE
_ColumnType(type_=SqlType.numeric(8, 2), nullability=<Nullability.NULLABLE: 1>)
ERROR:root:Something failed
ERROR:root:An exception of type AttributeError occurred. Arguments:
("'Column' object has no attribute 'nullability'",)
Traceback (most recent call last):
File "/home/ubuntu/dev/cuackcuack/venv/lib/python3.10/site-packages/pantab/_reader.py", line 32, in _read_query_result
dtypes[column.name.unescaped] = pantab_types._pandas_types[column_type]
KeyError: _ColumnType(type_=SqlType.numeric(8, 2), nullability=<Nullability.NULLABLE: 1>)
The code is here and the data I tried is a Hyper API sample. As a workaround, I was able to get the desired output by casting the o_totalprice column to double. With that approach, though, I would have to write a specific casting query for every table that has numeric values. It could be that a definition for numeric is missing somewhere, but I'm not sure where.
The same thing happens for BYTE_ARRAY type. In that case the workaround is to cast to TEXT.
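To avoid hand-writing one casting query per table, the workaround could be generated from a column list. The following is only a minimal sketch: `quote_ident` and `build_cast_query` are hypothetical helpers (not part of pantab or the Hyper API), the table name `orders` and the column `o_orderkey` are assumed for illustration, and `DOUBLE PRECISION` is assumed to be an accepted spelling of the double type in Hyper's SQL dialect.

```python
def quote_ident(name: str) -> str:
    """Double-quote a SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_cast_query(table: str, columns: list[str], casts: dict[str, str]) -> str:
    """Build a SELECT that casts the columns listed in `casts`.

    `casts` maps a column name to its target SQL type, for example
    {"o_totalprice": "DOUBLE PRECISION"}. Columns that are not listed
    are selected unchanged. Hypothetical helper, not a pantab API.
    """
    select_items = []
    for col in columns:
        ident = quote_ident(col)
        target = casts.get(col)
        if target is None:
            select_items.append(ident)
        else:
            # Keep the original column name on the cast result.
            select_items.append(f"CAST({ident} AS {target}) AS {ident}")
    return f"SELECT {', '.join(select_items)} FROM {quote_ident(table)}"


# Assumed table and column names, for illustration only.
query = build_cast_query(
    "orders",
    ["o_orderkey", "o_totalprice"],
    {"o_totalprice": "DOUBLE PRECISION"},
)
print(query)
```

The resulting string could then be used as the query text when reading the Hyper file through a query, and a BYTE_ARRAY column would be handled the same way with `TEXT` as the target type.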
Currently pantab doesn't support Tableau's numeric columns because the standard pandas type system, which was all that was available when pantab was written, has no equivalent data type. This could work if we made pyarrow a required dependency and mapped Tableau's numeric type to pyarrow's decimal type.
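As a rough illustration of what such a mapping might look like (a sketch only, not how pantab does or would implement it): the precision and scale can be read from the type's printed form, `NUMERIC(8, 2)` as in the log above, and handed to `pyarrow.decimal128`. The helper names `parse_numeric` and `to_arrow_decimal` are hypothetical, and pyarrow is imported lazily so that the parsing step works without it.

```python
import re

# Matches the printed form seen in the log above, e.g. "NUMERIC(8, 2)".
_NUMERIC_RE = re.compile(r"\s*NUMERIC\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*", re.IGNORECASE)


def parse_numeric(type_name: str) -> tuple[int, int]:
    """Return (precision, scale) for a string like "NUMERIC(8, 2)"."""
    match = _NUMERIC_RE.fullmatch(type_name)
    if match is None:
        raise ValueError(f"not a NUMERIC type: {type_name!r}")
    return int(match.group(1)), int(match.group(2))


def to_arrow_decimal(type_name: str):
    """Map a NUMERIC(p, s) type name to a pyarrow decimal128 type.

    Hypothetical helper; requires pyarrow to be installed.
    """
    import pyarrow as pa  # imported lazily; only needed for this step

    precision, scale = parse_numeric(type_name)
    return pa.decimal128(precision, scale)


print(parse_numeric("NUMERIC(8, 2)"))
```

A real implementation would presumably work from the `SqlType` object rather than its string form; the string is used here only because it is what the log shows.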
Would need a community PR to make that happen - is that something you are interested in?
I would say this would make pantab usable for my project. Many thanks.
With pantab 4.0 this is getting closer to the realm of possibility. The nanoarrow project would first need to add Decimal functions as a precursor, which looks to be on the horizon.
Actually, I was chatting with Tableau about this on their Slack channel and, as it turns out, Numeric data types have a very limited implementation. The Tableau tool itself does not currently work with Numeric types, and Hyper as a database will not have the ability to store Numeric data until version 3 gets released (we are at 2 currently; I do not know when 3 will happen).
For now you can really only ever read NUMERIC types in Hyper as a result of a cast expression or algorithm.