connector-x icon indicating copy to clipboard operation
connector-x copied to clipboard

MSSQL bb8 timeout after 30 seconds on large table

Open AlexRichards77 opened this issue 9 months ago • 18 comments

What language are you using?

Python

What version are you using?

0.4.1

What database are you using?

Microsoft SQL Server

What dataframe are you using?

Polars

Can you describe your bug?

Getting RuntimeError: Timed out in bb8 when querying large table. Same code executes when limiting number of rows in query or using partition_on.

What are the steps to reproduce the behavior?

Query a large MSSQL table that takes more than 30 seconds to complete.

Example query / code

This succeeds:

query = 'select top 1000 * from my_large_table'
df = pl.read_database_uri(query=query, uri=connection)

This fails:

query = 'select * from my_large_table'
df = pl.read_database_uri(query=query, uri=connection)

This succeeds after more than a minute but I had to create a copy of the table in a different database and add the row_num column. The source I am trying to query does not have a column I can partition on.

query = 'select * from my_large_table'
df = pl.read_database_uri(query=query, uri=connection, partition_on='row_num', partition_num=10)

What is the error?

RuntimeError: Timed out in bb8.

AlexRichards77 avatar Feb 20 '25 17:02 AlexRichards77

If your connection is using user authentication and the number of available sessions in mssql was equal to 1, it may be generating the timeout.

anderson-bernard0 avatar Feb 22 '25 03:02 anderson-bernard0

Seeing the same issue, when MSSQL is queried via Polars, which uses Connector-x.

Rolling back to connectorx==0.4.0 seems to help, but a configurable command timeout would be a better solution.

kmatt avatar Feb 25 '25 16:02 kmatt

I also have the exact same issue. If the query takes too long the bb8 timeout exception appears. Downgrading to 0.4.0 fix the issue as suggested @kmatt It seems that problem comes from version bump of bb8 / bb8-tiberius and tiberius in 0.4.1 : chore: bumps to fix some issues in tiberius

sco-atolcd avatar Mar 05 '25 15:03 sco-atolcd

It seems related to this issue of tiberius. What operating system are you using? Is it macos?

wangxiaoying avatar Mar 05 '25 22:03 wangxiaoying

I don't have a test environment for mssql for now. Can anyone try the newest alpha version from test.pypi by pip install -i https://test.pypi.org/simple/ connectorx==0.4.3a1 and see whether the issue remains?

wangxiaoying avatar Mar 05 '25 23:03 wangxiaoying

Yeah sure, let me try that ! And personally I'm running Ubuntu 20.04 (i need to upgrade i know)

sco-atolcd avatar Mar 06 '25 08:03 sco-atolcd

On Windows 11 I get the following error on 0.4.3a1 and 0.4.2

Traceback (most recent call last):
  File "C:\Test\mem_test\.venv\Lib\site-packages\polars\io\database\_utils.py", line 65, in _read_sql_connectorx
    tbl = cx.read_sql(
          ^^^^^^^^^^^^
  File "C:\Test\mem_test\.venv\Lib\site-packages\connectorx\__init__.py", line 426, in read_sql
    raise ValueError(return_type)
ValueError: arrow2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Test\mem_test\cat_contact.py", line 66, in <module>
    df = pl.read_database_uri(query=query, uri=connection)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Test\mem_test\.venv\Lib\site-packages\polars\io\database\functions.py", line 434, in read_database_uri
    return _read_sql_connectorx(
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Test\mem_test\.venv\Lib\site-packages\polars\io\database\_utils.py", line 77, in _read_sql_connectorx
    raise type(err)(errmsg) from err
ValueError: arrow2

Downgrading to 0.4.0 also fixes it for me.

AlexRichards77 avatar Mar 06 '25 08:03 AlexRichards77

Hi @AlexRichards77 , we have removed arrow2 from the destination since 0.4.2. Can you try to set the destination as arrow?

On Windows 11 I get the following error on 0.4.3a1 and 0.4.2

Traceback (most recent call last):
  File "C:\Test\mem_test\.venv\Lib\site-packages\polars\io\database\_utils.py", line 65, in _read_sql_connectorx
    tbl = cx.read_sql(
          ^^^^^^^^^^^^
  File "C:\Test\mem_test\.venv\Lib\site-packages\connectorx\__init__.py", line 426, in read_sql
    raise ValueError(return_type)
ValueError: arrow2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Test\mem_test\cat_contact.py", line 66, in <module>
    df = pl.read_database_uri(query=query, uri=connection)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Test\mem_test\.venv\Lib\site-packages\polars\io\database\functions.py", line 434, in read_database_uri
    return _read_sql_connectorx(
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Test\mem_test\.venv\Lib\site-packages\polars\io\database\_utils.py", line 77, in _read_sql_connectorx
    raise type(err)(errmsg) from err
ValueError: arrow2

Downgrading to 0.4.0 also fixes it for me.

wangxiaoying avatar Mar 06 '25 08:03 wangxiaoying

I tried with 0.4.3a1, I still have the exception exactly 30 seconds after the query starts : RuntimeError: Timed out in bb8

I also get a new exception on a small table containing 8 lines : File "/home/sco/Dev/MEAE/rece/pipelines/traitements-reprise-rece/utils/dbs/scec_doc_sqlserver_utils.py", line 31, in query_with_polars_scec_docs df = cx.read_sql(conn, sql_query, return_type='polars', partition_num=partition_num, partition_on=partition_column).lazy() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/sco/Dev/MEAE/rece/pipelines/pipeline_rece_venv/lib/python3.11/site-packages/connectorx/__init__.py", line 409, in read_sql result = _read_sql( ^^^^^^^^^^ RuntimeError: MsSQL get None at position: (0, 14)

Here is the sql dump of the table : bug_0.4.3a1.txt

sco-atolcd avatar Mar 06 '25 09:03 sco-atolcd

Upgrading polars to the latest version 1.24 fixes the arrow2 error but still has the timeout error.

AlexRichards77 avatar Mar 06 '25 09:03 AlexRichards77

hmm, seems like it's not caused by the tls then.

wangxiaoying avatar Mar 06 '25 09:03 wangxiaoying

Just came here because I am running into this issue as well, unfortunately. This one is really hurting me... I need the pre-execution queries for postgres, but this renders me unable to use mssql :/.

mcrumiller avatar Mar 13 '25 22:03 mcrumiller

also having timeout when version > 0.4.0

mgsnuno avatar Mar 18 '25 11:03 mgsnuno

I downgrad the tiberius version and overwrite the latest alpha version (pip install connectorx==0.4.3a1). Can anyone try whether the bb8 timeout error is still there?

wangxiaoying avatar Apr 03 '25 20:04 wangxiaoying

It works ! Thanks @wangxiaoying 👌 Still, i wonder what's the root cause in Tiberius

sco-atolcd avatar Apr 04 '25 12:04 sco-atolcd

Still, i wonder what's the root cause in Tiberius

Yeah, me too. I'll leave this issue open until we could move to the latest version to Tiberius without the timeout issue.

wangxiaoying avatar Apr 04 '25 16:04 wangxiaoying

I'm still getting this error when using 0.4.0 and 0.4.3a1 and polars 1.30.0

lickmycrit avatar Jun 11 '25 19:06 lickmycrit

Was 0.4.3a1 yanked? I see 0.4.4a1 but this gives me the timeout error, and I appear to be no longer able to install 0.4.3a1.

mcrumiller avatar Jul 07 '25 13:07 mcrumiller