s3.read_parquet_table and exception "Unknown parameter in input: "ExcludeColumnSchema", must be one of: CatalogId, DatabaseName, TableName, Expression, NextToken, Segment, MaxResults"

Open crozierm opened this issue 9 months ago • 2 comments

Describe the bug

Exception

When I use awswrangler.s3.read_parquet_table with a partition filter I get this exception:

ParamValidationError: Parameter validation failed:
Unknown parameter in input: "ExcludeColumnSchema", must be one of: CatalogId, DatabaseName, TableName, Expression, NextToken, Segment, MaxResults

Related Source

When I comment out the offending line in awswrangler/catalog/_get.py, the error goes away, but I'm not sure that is an appropriate fix.

    args: dict[str, Any] = _catalog_id(
        catalog_id=catalog_id,
        DatabaseName=database,
        TableName=table,
        MaxResults=1_000,
        Segment={"SegmentNumber": 0, "TotalSegments": 1},
        # ExcludeColumnSchema=True,  # commented out locally; the error goes away
    )

Versions

awswrangler 3.7.3
boto3 1.34.99
botocore 1.34.99

How to Reproduce

import awswrangler as wr
import boto3

# The comparison already yields a bool, so no "True if ... else False" is needed.
partition_filter = lambda x: x["partition_1"] == "p1" and x["partition_2"] == "p2"

df = wr.s3.read_parquet_table(
    table="table_name",
    database="database_name",
    boto3_session=boto3.Session(profile_name="profile_name"),
    partition_filter=partition_filter
)
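For anyone reproducing this, the filter itself can be checked without touching AWS. The snippet below is only a sketch: it assumes, as the awswrangler documentation describes, that partition_filter is called with a dict mapping partition names to string values and must return a bool. The sample partition values are made up for illustration, and .get() is used instead of x[...] so that a missing key returns False rather than raising KeyError.

```python
def partition_filter(partition: dict[str, str]) -> bool:
    """Keep only partitions where partition_1 == "p1" and partition_2 == "p2"."""
    return partition.get("partition_1") == "p1" and partition.get("partition_2") == "p2"


# Hypothetical partition values, shaped like the dicts the callback receives.
sample_partitions = [
    {"partition_1": "p1", "partition_2": "p2"},
    {"partition_1": "p1", "partition_2": "other"},
    {"partition_1": "other", "partition_2": "p2"},
]

kept = [p for p in sample_partitions if partition_filter(p)]
print(kept)
```

If the filter behaves as expected here, the exception is coming from the Glue catalog call rather than from the callback.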

Expected behavior

No response

Your project

No response

Screenshots

No response

OS

Mac

Python version

3.10.12

AWS SDK for pandas version

3.7.3

Additional context

No response

crozierm, May 07 '24 19:05

Are you sure you have the latest versions of boto3 and botocore installed? As you can see in https://github.com/aws/aws-sdk-pandas/issues/1404#issuecomment-1321171171, the parameter was introduced in 1.17.4.

jaidisido, May 09 '24 07:05

I saw that issue and double-checked my versions before I posted. boto3.__version__ and botocore.__version__ both report 1.34.101, and awswrangler is 3.7.3. pip list shows the same. I tested in two venvs because I thought I was making a mistake (and might still be).

I downgraded to 1.17.4 and got the same error.

I will dig deeper in the source when I have the time.
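One possible check along those lines, as a sketch rather than a confirmed diagnosis: the error message lists the parameters that the loaded botocore Glue model accepts for GetPartitions, so it may help to ask botocore directly which parameters its bundled model knows about, alongside the installed package versions. The helper names installed_versions and get_partitions_parameters are hypothetical and exist only for this example. No AWS credentials are needed because nothing is sent to Glue.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for the active environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def get_partitions_parameters():
    """List the input parameters botocore's Glue model accepts for GetPartitions."""
    import botocore.session  # imported lazily so the version report works without it

    model = botocore.session.get_session().get_service_model("glue")
    return sorted(model.operation_model("GetPartitions").input_shape.members)


versions = installed_versions(["awswrangler", "boto3", "botocore"])
print(versions)

if versions["botocore"] is not None:
    params = get_partitions_parameters()
    print("ExcludeColumnSchema supported:", "ExcludeColumnSchema" in params)
    print(params)
```

If ExcludeColumnSchema is missing from the printed list even though the reported botocore version is recent, that would suggest the interpreter is loading a different or older Glue model than pip list implies.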

crozierm, May 09 '24 14:05

Closing due to inactivity

jaidisido, May 30 '24 12:05