
Redshift Driver Truncates Important Information in Error Messages

Open WillAyd opened this issue 3 years ago • 14 comments

Driver version

2.0.907

Redshift version

Redshift 1.0.38698

Client Operating System

Docker python:3.10.2 image

Python version

3.10.2

Table schema

test_column: INTEGER

Problem description

When attempting to copy a file from S3 into Redshift via awswrangler, a data type mismatch correctly raises an error. However, the error message is truncated, which makes the issue hard to debug in non-trivial applications.

Python Driver trace logs

redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': 'XX000', 'M': 'Spectrum Scan Error', 'D': "
error: Spectrum Scan Error 
code: 15007 
context: File 'https://s3.region.amazonaws.com/bucket/bucket_directory/subdirectory/afile.snappy.parquet' has an incompatible Parquet schema for column 's3://bucket/bucket_director 
query: 1234567 
location: dory_util.cpp:1226 
process: worker_thread [pid=12345]

Reproduction code

import pandas as pd
import awswrangler as wr

df = pd.DataFrame([[1.23]], columns=["test_column"])  # target schema is an integer, this is a float
wr.redshift.copy(
  df=df,
  path="s3://bucket/bucket_directory/subdirectory/",
  con=con,  # redshift_connector connection (not shown in the original snippet)
  table="test",
  schema="public",  # assumed; not given in the original report
)

WillAyd avatar Jun 07 '22 17:06 WillAyd

Hey @WillAyd , thank you for reporting this issue. I agree, this behavior would make debugging harder (not to mention more frustrating).

redshift_connector's error handling should just output what is received from the Redshift server via the Postgres wire protocol; there are some docs here on what the incoming data looks like.
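A minimal illustration (not part of the original thread) of what that incoming data looks like once it reaches Python: the error dict uses the Postgres ErrorResponse field codes ('M' for message, 'D' for detail), and the multi-line 'D' block can be split into key/value pairs for easier logging. The sample dict here is abbreviated from the traceback above.

```python
# Sample error dict, abbreviated from the traceback in the original report.
error_dict = {
    "S": "ERROR",
    "C": "XX000",
    "M": "Spectrum Scan Error",
    "D": "\n  error:  Spectrum Scan Error\n  code:      15007\n  query:     1234567\n",
}

def detail_fields(err):
    """Split the multi-line 'D' detail block into key/value pairs."""
    fields = {}
    for line in err.get("D", "").splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a "key: value" shape
            fields[key.strip()] = value.strip()
    return fields

print(detail_fields(error_dict))
# → {'error': 'Spectrum Scan Error', 'code': '15007', 'query': '1234567'}
```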

As for next steps, I will try to reproduce this locally to determine if there is an issue with redshift_connector. If not, my guess would be that Redshift server is truncating the error message. In that case, I will chat with the Redshift server team to see what our options are.

Brooke-white avatar Jun 08 '22 03:06 Brooke-white

Hey @WillAyd ,

I just noticed the error message you provided isn't complete -- there's an open quote for the error dict's "D" key but no closing quote. Could you provide the whole error message please? Not sure if just the end quote is missing but want to double check.

Brooke-white avatar Jun 09 '22 22:06 Brooke-white

Hi @Brooke-white - that is how the error gets printed to the console

WillAyd avatar Jun 14 '22 17:06 WillAyd

Hey @WillAyd , thanks for the additional information. If it's accessible to you, could you please run the statement directly with redshift_connector? The lack of a closing bracket for the error dictionary makes me wonder if the error message is being truncated after it's raised in redshift_connector.

Brooke-white avatar Jun 15 '22 14:06 Brooke-white

I can't share the full traceback as this is on a work machine not connected to my GitHub account, but some debugging shows this at the top of the stack trace when the error is thrown:

https://github.com/aws/amazon-redshift-python-driver/blob/6a2ef2581770c3cd8b2a6941e164c8d7af2d4c6d/redshift_connector/core.py#L1986

If I set a breakpoint there and inspect the error before it gets raised, it appears already truncated when wrapped in the ProgrammingError object

WillAyd avatar Jun 15 '22 16:06 WillAyd

Hey @WillAyd ,

Thanks for your patience. I have been unable to reproduce this on my end. I've included my code snippet below, where I tried an S3 copy with a very long file name. The error output was not truncated. This makes me think something outside of redshift_connector could be truncating the error messages, since it's especially strange that the error dict is missing the closing brace.

FWIW, my test environment was PyCharm on macOS. I see you're running in Docker. Could Docker be truncating the error messages? If possible, I'd recommend trying to run locally to see if the same issue shows up.

    with conn.cursor() as cursor:
        cursor.execute("create table catdemo(catid smallint, catgroup varchar(10), catname varchar(10), catdesc varchar(50));")
        cursor.execute("""copy catdemo
from 's3://my-test-bucket-githubissue107/Pneumonoultramicroscopicsilicovolcanoconiosis_Pneumonoultramicroscopicsilicovolcanoconiosis_Pneumonoultramicroscopicsilicovolcanoconiosis_category_pipe.txt'
iam_role 'arn:aws:iam::123456789012:role/myRedshiftRole'
FORMAT AS PARQUET;""")
>>> redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': 'XX000', 'M': 'Spectrum Scan Error', 'D': "\n  -----------------------------------------------\n  error:  Spectrum Scan Error\n  code:      15007\n  context:   File 'https://s3.us-east-2.amazonaws.com/my-test-bucket-githubissue107/Pneumonoultramicroscopicsilicovolcanoconiosis_Pneumonoultramicroscopicsilicovolcanoconiosis_Pneumonoultramicroscopicsilicovolcanoconiosis_category_pipe.txt' has an invalid version number.\n  query:     5344549\n  location:  dory_util.cpp:1377\n  process:   worker_thread [pid=13737]\n  -----------------------------------------------\n", 'F': '../src/sys/xen_execute.cpp', 'L': '10512', 'R': 'pg_throw'}

Brooke-white avatar Jun 21 '22 15:06 Brooke-white

Docker versus running directly on the host didn't make a difference. I ran directly as you did as well and was still getting the issue.

I noticed that your error is slightly different, though, as you get ...has an invalid version number. I'm wondering if the back half of the error message is what gets truncated, as mine would say something like has an incompatible Parquet schema for column 's3://my-test-bucket-githubissue107/Pneumonoultram\n . If you change any of the columns in the file you are loading to have the wrong data type for the target table are you getting the same issue?
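A quick, hypothetical probe of this hypothesis (not from the thread): measure how many characters of the context line survive in the truncated error. If the server caps the detail block at a fixed size, a longer S3 path would leave fewer characters for the schema details at the end. The string below is the truncated context from the original report.

```python
# Truncated context line from the original report; the schema details
# after "bucket_director" were cut off.
truncated_context = (
    "File 'https://s3.region.amazonaws.com/bucket/bucket_directory/"
    "subdirectory/afile.snappy.parquet' has an incompatible Parquet "
    "schema for column 's3://bucket/bucket_director"
)

# If the cap is fixed, this length should be roughly constant across
# failures, regardless of how long the S3 path is.
print(len(truncated_context))
```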

WillAyd avatar Jun 21 '22 19:06 WillAyd

I'm wondering if the back half of the error message is what gets truncated, as mine would say something like has an incompatible Parquet schema for column 's3://my-test-bucket-githubissue107/Pneumonoultram\n . If you change any of the columns in the file you are loading to have the wrong data type for the target table are you getting the same issue?

good thinking -- let me give this a try and I'll get back to you

Brooke-white avatar Jun 21 '22 21:06 Brooke-white

I believe I'm hitting the same type of error as you're seeing, though I'm not seeing the truncation of the error message. I've included repro steps below:

  1. download & upload some test parquet file (I used this one)
  2. view the parquet file to figure out column names https://www.parquet-viewer.com/
  3. incorrectly map column datatypes in copy statement
cursor.execute("create table test_tbl (registration_dttm datetime, id int, first_name varchar, last_name varchar, email varchar, gender varchar, ip_address varchar, cc varchar, country varchar, birthdate varchar, salary numeric(20,10), title varchar, comments numeric(20, 10));")

cursor.execute("""COPY test_tbl
FROM 's3://my-test-bucket-githubissue107/userdata1.parquet'
IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole'
FORMAT AS PARQUET;""")
>>> redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': 'XX000', 'M': 'Spectrum Scan Error', 'D': "\n  -----------------------------------------------\n  error:  Spectrum Scan Error\n  code:      15007\n  context:   File 'https://s3.us-east-2.amazonaws.com/my-test-bucket-githubissue107/userdata1.parquet' has an incompatible Parquet schema for column 's3://my-test-bucket-githubissue107/userdata1.parquet.salary'. Column type: DECIMAL(20,10), Parquet schema:\noptional double salary [i:\n  query:     5349558\n  location:  dory_util.cpp:1377\n  process:   worker_thread [pid=11807]\n  -----------------------------------------------\n", 'F': '../src/sys/xen_execute.cpp', 'L': '10512', 'R': 'pg_throw'}

Brooke-white avatar Jun 21 '22 22:06 Brooke-white

I was still getting the same issue with that file. The difference I've noticed, however, is that my S3 paths are much longer, and that seems to affect the error message truncation in non-obvious ways. Can you try placing the file in these locations and loading from there?

s3://my-test-bucket-githubissue107/this_is_a_subfolder/_another_one/this_is_a_very_long_folder/userdata1.parquet
s3://my-test-bucket-githubissue107/this_is_a_subfolder/_another_one/this_folder_might_be_even_longer/userdata1.parquet
s3://my-test-bucket-githubissue107/this_is_a_subfolder/_another_one/thisfolderisreallylongwithnoboundaries/userdata1.parquet

WillAyd avatar Jun 22 '22 19:06 WillAyd

Hey @WillAyd , I am finally able to somewhat reproduce! I am seeing the truncation of the server error message, but the Python error message is intact (i.e., the whole error dict is there).

        cursor.execute("create table test_tbl (registration_dttm datetime, id int, first_name varchar, last_name varchar, email varchar, gender varchar, ip_address varchar, cc varchar, country varchar, birthdate varchar, salary numeric(20,10), title varchar, comments numeric(20, 10));")
        cursor.execute("""COPY test_tbl
FROM 's3://my-test-bucket-githubissue107/this_is_a_subfolder/_another_one/thisfolderisreallylongwithnoboundaries/now_its_even_longer_wow_impressive/userdata1.parquet'
IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole'
FORMAT AS PARQUET;""")
redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': 'XX000', 'M': 'Spectrum Scan Error', 'D': "\n  -----------------------------------------------\n  error:  Spectrum Scan Error\n  code:      15007\n  context:   File 'https://s3.us-east-2.amazonaws.com/my-test-bucket-githubissue107/this_is_a_subfolder/_another_one/thisfolderisreallylongwithnoboundaries/now_its_even_longer_wow_impressive/userdata1.parquet' has an incompatible Parquet schema for column 's3://my-test-bucket\n  query:     5367908\n  location:  dory_util.cpp:1377\n  process:   worker_thread [pid=27516]\n  -----------------------------------------------\n", 'F': '../src/sys/xen_execute.cpp', 'L': '10512', 'R': 'pg_throw'}

As the truncated error message lies in the middle of the data received from the Redshift server, I will reach out to the Redshift server team to raise this issue.

Brooke-white avatar Jun 22 '22 19:06 Brooke-white

Awesome! That's the main thing I'm after - I think everything else in Python is fine.

WillAyd avatar Jun 22 '22 19:06 WillAyd

The issue has been opened. I'll keep this one open in the meantime and provide updates here once available :)

Brooke-white avatar Jun 22 '22 20:06 Brooke-white

For any readers who may come across the same issue: I've noticed that while the Python traceback truncates this error message, you can still get the full error message if you query SVL_S3LOG. Might help as a workaround.
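As a sketch of that workaround (the connection details and query id below are placeholders, not values from this thread), the query id printed in the truncated error (query: 1234567) can be used to pull the full message out of the SVL_S3LOG system view:

```python
def svl_s3log_sql(query_id):
    """Build a statement that returns the untruncated Spectrum error text
    for a given query id from the SVL_S3LOG system view."""
    return (
        "SELECT message FROM svl_s3log "
        f"WHERE query = {int(query_id)} "
        "ORDER BY eventtime"
    )

# With a live connection it would be used roughly like this:
# import redshift_connector
# conn = redshift_connector.connect(host="...", database="...", user="...", password="...")
# with conn.cursor() as cursor:
#     cursor.execute(svl_s3log_sql(1234567))
#     for (message,) in cursor.fetchall():
#         print(message)

print(svl_s3log_sql(1234567))
```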

WillAyd avatar Sep 30 '22 16:09 WillAyd

I am hitting the same error here, where the Python traceback truncates the actual error message and doesn't include all the relevant details. SVL_S3LOG is a fine workaround, but for my use case it's not the most intuitive way of delivering information.

Is there any update on this ticket?

justbldwn avatar Nov 15 '22 15:11 justbldwn

Hi @justbldwn -- unfortunately I do not have any update from my end. As many engineers are on holiday this week, I will ping the team again next week for an update on this.

Brooke-white avatar Nov 21 '22 23:11 Brooke-white

Haven't forgotten about this; I'm working to escalate the ticket I opened with the team for this issue. Ref: 6124

Brooke-white avatar Mar 17 '23 21:03 Brooke-white

Hi folks, if you are impacted by this issue, I suggest opening an issue with AWS customer support and referencing "Redshift-6124" to encourage prioritization of a fix for this.

Brooke-white avatar Feb 06 '24 19:02 Brooke-white