turbodbc icon indicating copy to clipboard operation
turbodbc copied to clipboard

[Oracle][ODBC]Invalid SQL data type <-25>

Open dhirschfeld opened this issue 7 years ago • 16 comments

I'm seeing the below error when trying to query one of our Oracle databases

Error: ODBC error
state: HY004
native error code: 0
message: [Oracle][ODBC]Invalid SQL data type <-25>.

Will try to narrow down the culprit when I can find the time to dig into it...

dhirschfeld avatar Jan 10 '18 02:01 dhirschfeld

Hi Dave! The data type -25 corresponds to SQL_C_SBIGINT, a signed 64-bit integer that turbodbc uses to query 64 bit fields. The MS reference has this to say about the datatypes:

All drivers must support all C data types. This is required because all drivers must support all C types to which SQL types that they support can be converted, and all drivers support at least one character SQL type. Because the character SQL type can be converted to and from all C types, all drivers must support all C types.

This seems to be a problem with the Oracle ODBC driver that apparently does not support this particular data type, see the issues here and there.

Maybe you can get around this by using an alternative or more modern ODBC driver?

MathMagique avatar Jan 10 '18 08:01 MathMagique

Thanks @MathMagique - I'll try out other drivers and report back my success... or otherwise :|

dhirschfeld avatar Jan 10 '18 11:01 dhirschfeld

Unfortunately no joy with any of:

 'Oracle in OraClient11g_x64_home1'
 'Oracle in XE'
 'Oracle in instantclient_11_2'

Querying the same database/table with cx_oracle does work so maybe they have some workaround for this issue in their codebase?

dhirschfeld avatar Jan 11 '18 02:01 dhirschfeld

The offending column is of type NUMBER(6). It seems NUMBER(6) should map to an int32 rather than an int64

dhirschfeld avatar Jan 11 '18 02:01 dhirschfeld

...since pyodbc can query the data with the same driver/SQL it would seem the issue lies in turbodbc - even if the issue is that Oracle drivers don't support the ODBC standard so require a work-around.

image

dhirschfeld avatar Jan 11 '18 03:01 dhirschfeld

https://www.ibm.com/support/knowledgecenter/en/SSCRJU_3.2.1/com.ibm.swg.im.infosphere.streams.pd.doc/doc/dbltkt-int64.html

seems suggestive?

dhirschfeld avatar Jan 11 '18 03:01 dhirschfeld

Just a few thoughts:

  • It is difficult to blame turbodbc for an ODBC driver not implementing data types required by the ODBC standard. That won't change that Oracle does not work with Turbodbc right now, though.
  • Turbodbc currently uses int64 for all kinds of integers. This choice was made because int64 is a binary format also understood by NumPy and Apache Arrow, and because a single code path is able to handle a wide range of integers from int8 to int16 over DECIMAL(9, 0) to whatever.
  • I have plans to make the integer types more specific, so that int8 would be retrieved as int8 and NUMBER(6) would be retrieved as int32. This would be more efficient, and accidentally solve your problem.
  • I could include an option to explicitly avoid 64bit integers altogether to maintain compatibility with Oracle.
  • In order to support numbers larger than int32 properly then, I would have to retrieve certain NUMERIC values with the ODBC DECIMAL type. This is what pyodbc does by default. Unfortunately, the DECIMAL type is not a universal binary standard and not very efficient. It also does not play nicely with NumPy and Arrow, so that is currently not one of my priorities. The fallback would be to retrieve "large" numbers as strings, then.

I'll flag your issue as a compatibility issue and see what I can do in the future.

MathMagique avatar Jan 11 '18 08:01 MathMagique

The decimal data type is both an IEEE standard as well as part of ANSI SQL-92.

Whilst it isn't well supported by numpy it is supported by Arrow.

The decimal type isn't useful IMHO for analytic work as it's several times slower than using floating point numbers as it doesn't have any support in hardware. Also, none of the standard numerical routines BLAS/LAPACK will work with decimal types. Nevertheless it's very commonly used for storing financial data so it would be good to have it well supported by turbodbc.

If adding support for more integer types also fixes this issue I'd be very happy with that resolution!

dhirschfeld avatar Jan 11 '18 11:01 dhirschfeld

Agreed, for financial data decimal data is essential, but was not the prime application area when creating turbodbc. I did not know that Arrow supports decimal, the fact it does is encouraging.

Side note: Here is what ODBC uses to transfer decimals:

#define SQL_MAX_NUMERIC_LEN    16
typedef struct tagSQL_NUMERIC_STRUCT
{
   SQLCHAR    precision;
   SQLSCHAR   scale;
   SQLCHAR    sign;   /* 1 if positive, 0 if negative */
   SQLCHAR    val[SQL_MAX_NUMERIC_LEN];
} SQL_NUMERIC_STRUCT;

That's 20 bytes in a weird layout that is not immediately compatibly with decimal64 or decimal128. "Nice" conversion logic will be required :-).

MathMagique avatar Jan 11 '18 11:01 MathMagique

Unfortunately the new integer support in 2.5.0 hasn't fixed this particular problem :/

dhirschfeld avatar Jan 25 '18 06:01 dhirschfeld

Yes, but that is expected, as the types used to transfer data from the database have not changed.

MathMagique avatar Jan 25 '18 08:01 MathMagique

I've experienced this problem as well, and had to use cx_Oracle for certain tables, instead of my preferred (much faster) turbodbc.

mccarthysean avatar Jan 31 '18 21:01 mccarthysean

Same issue as the others with Oracle. Is there anything I can do to help find a workaround? Unfortunately, I am not an ODBC expert (yet)

tomanizer avatar Sep 04 '18 23:09 tomanizer

We are using turbodbc heavily in our project and adding support for Decimal will help us a lot. Is there something in pipeline to add support for Decimal datatype.

I am trying to insert data to SQL server table with decimal columns from arrow table with decimal column using turbodbc executemanycolumn method and inserts fails as turbodbc does not support decimal.

joetl avatar May 09 '19 16:05 joetl

Is there a way this issue can be bumped in priority ... seems like many are affected!

+1 to the list of affected users ....

Thanks!

parachutepy avatar May 09 '19 17:05 parachutepy

Hi guys! Currently, my time for working on turbodbc is very limited, unfortunately. Even more unfortunately, this will not change in the near future. The best way to bump this issue in priority is to get active yourself, dig into the code base and provide a pull request.

If adding support for arbitrary decimal types sounds like too much, you could add optional support for using 32-bit integer types instead of 64-bit integer types. That should be easier and should help if your numbers stay below 4 billions / DECIMAL(8, 0).

Also, it would help if someone hosted a very small Oracle database that one could use for continuous integration.

MathMagique avatar May 10 '19 08:05 MathMagique