
Incorrect buffer size calculation leads to failed query execution

Open JAGalvis opened this issue 3 years ago • 6 comments

Driver version

2.0.908

Redshift version

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.40677

Client Operating System

ubuntu:20.04

Python version

3.9.12

Table schema

NA

Problem description

An error is raised every time a query takes more than 5 minutes to execute; the same code works fine with simpler/faster queries.

  1. Expected behaviour: No errors when executing long-running queries.
  2. Actual behaviour: Errors out because of a buffer size miscalculation.
  3. Error message/stack trace: unpack_from requires a buffer of at least 5 bytes for unpacking 5 bytes at offset 0 (actual buffer size is 0)
  4. Any other details that can be helpful: The same code runs fine on Windows. So far, it only fails when running on Ubuntu within a Docker container. I have also tried the workaround posted here, without any luck.

Python Driver trace logs

Traceback (most recent call last):
  File "/app/./test_long_query.py", line 43, in <module>
    cursor.execute(query)
  File "/usr/local/lib/python3.9/dist-packages/redshift_connector/cursor.py", line 240, in execute
    self._c.execute(self, operation, args)
  File "/usr/local/lib/python3.9/dist-packages/redshift_connector/core.py", line 1783, in execute
    self.handle_messages(cursor)
  File "/usr/local/lib/python3.9/dist-packages/redshift_connector/core.py", line 1965, in handle_messages
    code, data_len = ci_unpack(self._read(5))
struct.error: unpack_from requires a buffer of at least 5 bytes for unpacking 5 bytes at offset 0 (actual buffer size is 0)

Reproduction code

(Query is redacted for confidentiality)

import pandas as pd
import redshift_connector

credentials = CREDENTIALS_DICTIONARY
query = LARGE_QUERY  # Query that takes over 5 minutes to execute

with redshift_connector.connect(
    host=credentials['host'],
    database=credentials['dbname'],
    user=credentials['user'],
    password=credentials['password'],
    port=credentials['port']
) as conn:
    with conn.cursor() as cursor:
        cursor.execute(query)
        result: pd.DataFrame = cursor.fetch_dataframe()
        print(result.head())

JAGalvis avatar Aug 16 '22 12:08 JAGalvis

Thanks for reporting the issue. We will look into it and get back to you.

iggarish avatar Aug 16 '22 18:08 iggarish

I'm unable to reproduce this issue using redshift_connector==2.0.908 and Redshift server version 1.0.40325. I recommend you look into the Docker container configuration to ensure TCP keepalive is configured so that idle connections are not timed out (see the keepalive sketch after the repro output below).

with redshift_connector.connect(
    region='xxx',
    user='xxx',
    host='xxx',
    password='xxx',
    database='xxx',
) as conn:
    with conn.cursor() as cursor:
        print('creating udf...')
        cursor.execute("""CREATE OR REPLACE FUNCTION janky_sleep (x float) RETURNS bool IMMUTABLE as $$
    from time import sleep
    sleep(x)
    return True
$$ LANGUAGE plpythonu;""")
        print('created udf')
        cursor.execute("select '6 min sleep', janky_sleep(360.0);") # sleep for 6 minutes
        print(cursor.fetchall())
Output:

creating udf...
created udf
(['6 min sleep', True],)
redshift_connector version: 2.0.908

Process finished with exit code 0
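
For reference, if an idle-connection timeout in the container's network stack is the culprit, TCP keepalive can also be tightened from Python on the connection's underlying socket. Below is a minimal sketch, assuming a Linux host (TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT are Linux-specific) and assuming the raw socket is exposed as the private _usock attribute (inherited from the pg8000 lineage; it may change between releases):

import socket

import redshift_connector

def tighten_keepalive(sock: socket.socket, idle: int = 60, interval: int = 30, count: int = 5) -> None:
    # Send keepalive probes on an otherwise idle TCP connection so that
    # NAT/conntrack entries along the path are not silently expired.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)       # idle seconds before the first probe
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)  # seconds between probes
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)       # failed probes before the OS drops the socket

conn = redshift_connector.connect(host='xxx', database='xxx', user='xxx', password='xxx')
tighten_keepalive(conn._usock)  # _usock: assumed raw socket (private attribute)

If tuning from Python is not an option, the same timer can be lowered for the whole container, e.g. docker run --sysctl net.ipv4.tcp_keepalive_time=60 ... (net.ipv4.* sysctls are namespaced per container).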

Brooke-white avatar Aug 22 '22 19:08 Brooke-white

We are looking into the Docker OS configuration, with no luck yet. We will report back in the coming days as soon as we have news.

JAGalvis avatar Aug 29 '22 13:08 JAGalvis

Thanks a lot Brooke. As you pointed out, it seems to be a problem with Docker. We have not been able to fix it, but we don't think it is related to the library, as we are having similar issues with long queries while using other libraries as well.

If we find a way to fix it I'll post it.

JAGalvis avatar Sep 06 '22 08:09 JAGalvis

Thanks for the update and persistence in investigating this, Javier. I'll keep this issue open for now in case anyone else is hitting something similar.

Brooke-white avatar Sep 06 '22 16:09 Brooke-white

I'm hitting a similar issue, though I'm not using Docker and my queries are not long-running. Initially the connection works fine and queries go through; then, after about 10 minutes, any query results in:

  File "/Users/leo/Library/Caches/pypoetry/virtualenvs/di-api-LPP91Z5g-py3.10/lib/python3.10/site-packages/redshift_connector/core.py", line 1965, in handle_messages
    code, data_len = ci_unpack(self._read(5))
    │                │         │    └ <built-in method read of _io.BufferedRWPair object at 0x1376cbac0>
    │                │         └ <redshift_connector.core.Connection object at 0x13579f2e0>
    │                └ <built-in method unpack_from of _struct.Struct object at 0x107e27070>
    └ None
struct.error: unpack_from requires a buffer of at least 5 bytes for unpacking 5 bytes at offset 0 (actual buffer size is 0)

All subsequent queries fail with a different error:

  File "/Users/leo/Library/Caches/pypoetry/virtualenvs/di-api-LPP91Z5g-py3.10/lib/python3.10/site-packages/redshift_connector/core.py", line 1777, in execute
    self._flush()
    │    └ <built-in method flush of _io.BufferedRWPair object at 0x1376cbac0>
    └ <redshift_connector.core.Connection object at 0x13579f2e0>
  File "/Users/leo/.pyenv/versions/3.10.2/lib/python3.10/socket.py", line 723, in write
    return self._sock.send(b)
           │    │     │    └ <memory at 0x2c531d480>
           │    │     └ <function SSLSocket.send at 0x105bdcee0>
           │    └ <ssl.SSLSocket fd=19, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 53746)>
           └ <socket.SocketIO object at 0x13579f280>
  File "/Users/leo/.pyenv/versions/3.10.2/lib/python3.10/ssl.py", line 1205, in send
    return self._sslobj.write(data)
           │    │       │     └ <memory at 0x2c531d480>
           │    │       └ <method 'write' of '_ssl._SSLSocket' objects>
           │    └ <_ssl._SSLSocket object at 0x1376b4b30>
           └ <ssl.SSLSocket fd=19, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 53746)>
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2384)

Driver 2.0.909, Python 3.10, macOS, Redshift 1.0.41881

The same is also observed from inside the Docker environment.
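
For anyone else hitting this: a possible stopgap, shown as an untested sketch (the retry policy and the set of errors to catch are assumptions, not a confirmed fix), is to run each query on a fresh connection and retry once when the socket turns out to be dead:

import ssl
import struct
from contextlib import suppress

import redshift_connector

def run_query(query, credentials, retries=1):
    # credentials is assumed to be a dict of keyword arguments
    # accepted by redshift_connector.connect().
    last_err = None
    for _ in range(retries + 1):
        conn = redshift_connector.connect(**credentials)
        try:
            with conn.cursor() as cursor:
                cursor.execute(query)
                return cursor.fetchall()
        except (struct.error, ssl.SSLEOFError) as err:
            last_err = err  # dead socket; retry on a brand-new connection
        finally:
            with suppress(Exception):
                conn.close()  # closing a dead connection may itself raise
    raise last_err

This does not address the root cause (the server or a middlebox closing the idle connection), but it avoids the broken state where every subsequent query fails with SSLEOFError.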

LMalikov avatar Oct 04 '22 11:10 LMalikov

I am having the same issue using:

"redshift_connector==2.0.908", "python 3.8.1", "PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.41881"

Here is my stack trace for this error, in case it helps find the cause:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/redshift_connector/core.py", line 1631, in execute
    ps = cache["ps"][key]
KeyError: (" QUERY_STRING_HERE ", ())

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/core/core/importers/tenjin/tenjin_datavault.py", line 59, in run
    cursor.execute(query)
  File "/usr/local/lib/python3.8/site-packages/redshift_connector/cursor.py", line 240, in execute
    self._c.execute(self, operation, args)
  File "/usr/local/lib/python3.8/site-packages/redshift_connector/core.py", line 1701, in execute
    self.handle_messages(cursor)
  File "/usr/local/lib/python3.8/site-packages/redshift_connector/core.py", line 1965, in handle_messages
    code, data_len = ci_unpack(self._read(5))
struct.error: unpack_from requires a buffer of at least 5 bytes for unpacking 5 bytes at offset 0 (actual buffer size is 0)

chelo-kjml avatar Oct 27 '22 13:10 chelo-kjml