connector-x icon indicating copy to clipboard operation
connector-x copied to clipboard

SQL Server Datetime read_sql leads to PanicException

Open Jeffrey-Amst opened this issue 1 year ago • 1 comments

What language are you using?

Python

What version are you using?

Python 3.10.14

What database are you using?

Microsoft SQL Server (ver. 15.00.4360)

What dataframe are you using?

Arrow

Can you describe your bug?

I get an "panicexception: out of range Datetime. This happens since connectorx version 0.3.3. Reverting back to version 0.3.2 resolves the issue.

What are the steps to reproduce the behavior?

Getting a table from SQL Server with the following settings:

cx.read_sql( conn='mssql://etcetera', query='select * FROM [dbo].[transactions]', return_type="arrow2", )

Database setup if the error only happens on specific data or data type

Happens on datetime and datetime2 in SQL Server Table schema and example data

Example query / code

See above

What is the error?

pyo3_runtime.PanicException: out of range DateTime

Jeffrey-Amst avatar May 15 '24 12:05 Jeffrey-Amst

I had the same issue. Downgrading to version 0.3.2 fixed it

My error: thread '<unnamed>' panicked at \connector-x\connector-x\connectorx\src\destinations\arrow2\arrow_assoc.rs:312:36: out of range DateTime

AK2001 avatar May 22 '24 08:05 AK2001

We're seeing the same error on Oracle for both pandas and arrow2 return types. The error doesn't occur for every table with a datetime field (for example "SELECT SYSDATE FROM dual" works fine). Can confirm that the errors don't occur after downgrading to 0.3.2

In [4]: cx.read_sql(dwhcosu_conn, "select * from sysadm.psxlatitem")
thread '<unnamed>' panicked at src\pandas\pandas_columns\datetime.rs:84:44:
out of range DateTime
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
Cell In[4], line 1
----> 1 cx.read_sql(dwhcosu_conn, "select * from sysadm.psxlatitem")

File ~\scoop\apps\miniconda3\current\envs\dbt_af39_env\lib\site-packages\connectorx\__init__.py:364, in read_sql(conn, query, return_type, protocol, partition_on, partition_range, partition_num, index_col)
    361 if return_type in {"modin", "dask", "pandas"}:
    362     try_import_module("pandas")
--> 364     result = _read_sql(
    365         conn,
    366         "pandas",
    367         queries=queries,
    368         protocol=protocol,
    369         partition_query=partition_query,
    370     )
    371     df = reconstruct_pandas(result)
    373     if index_col is not None:

PanicException: out of range DateTime

In [5]: cx.read_sql(dwhcosu_conn, "select sysdate from dual")
Out[5]:
              SYSDATE
0 2024-05-24 13:22:33

blthree avatar May 24 '24 17:05 blthree

Seems like it is related to the update for the new datetime api. Can you provide a minimum reproducible example for this? (Mainly the data insertion for the table creation)

wangxiaoying avatar May 25 '24 19:05 wangxiaoying

Can you try out the new alpha version 0.3.4a1 to see whether it fixes the issue?

wangxiaoying avatar Jun 15 '24 13:06 wangxiaoying

I will close it for now but feel free to open it if you find any issue with 0.3.4a1.

wangxiaoying avatar Jun 16 '24 15:06 wangxiaoying

@wangxiaoying Hello,

this is easily reproducuble on MS SQL, e.g.

CREATE TABLE [tempdb].[testing].[test_date_x] (d DATE);

INSERT INTO [tempdb].[testing].[test_date_x] (d) VALUES ('1601-01-01')

And then:

import connectorx as cx

urlschema = "mssql"
username = "SA"
password = "secret"
host = "127.0.0.1"
port = 1433
database = "tempdb"
dsn = f"{urlschema}://{username}:{password}@{host}:{port}/{database}"
query = "SELECT * FROM [tempdb].[testing].[test_date_x]"
table = cx.read_sql(dsn, query)
print(table)

With dates like 1980-01-01, 2024-01-01 works just fine.

rudyryk avatar Sep 16 '24 10:09 rudyryk

@wangxiaoying Using alpha releases 0.3.4a1 .. 0.3.4a2 also does not help.

rudyryk avatar Sep 16 '24 10:09 rudyryk

FYI: it looks like ConnectorX was forcing nanosecond precision for DateTimes so it could easily goes out of range for anything that isn't (relatively) close to the present time (eg: within a couple of hundred years of now).

Just checked the latest version (0.4.2), and it now loads with microsecond precision, fixing the out of range errors, which is great! 🥳

alexander-beedie avatar Mar 03 '25 06:03 alexander-beedie

I still have this issue with 0.4.3 (my workaround for now is to use return_type='polars')

matthias-Q avatar Apr 17 '25 06:04 matthias-Q