amazon-redshift-python-driver

pandas None/NaN mappings

Open LuchiLucs opened this issue 9 months ago • 1 comment

I'm using the write_dataframe function to write a pandas DataFrame. My context is that this DataFrame contains columns of three different types:

  1. object (string) columns in pandas use None for missing data
  2. int64 columns in pandas use np.nan for missing data
  3. float64 columns in pandas use np.nan for missing data
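A minimal sketch reproducing the three column types described above (note that a plain int64 column cannot actually hold np.nan, so pandas silently upcasts it to float64; the nullable "Int64" extension dtype would use pd.NA instead):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name":  pd.Series(["a", None], dtype=object),       # object/string column, missing = None
    "count": pd.Series([1, np.nan]),                     # upcast to float64: int64 cannot hold NaN
    "score": pd.Series([1.5, np.nan], dtype="float64"),  # float64 column, missing = NaN
})

print(df.dtypes)
print(df.isna().sum())  # each column has exactly one missing value
```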

When writing to Redshift, these values are converted as follows:

  • None as NULL using varchar(20) with bytedict encoding
  • NaN as -9223372036854775808 using BIGINT with az64 encoding
  • NaN as "NaN" using DOUBLE PRECISION with RAW encoding

When I try to query using SQL, based on the column, I have to filter with:

  1. IS NULL
  2. = -9223372036854775808
  3. ::text = 'NaN'

Is this intended? I'd like to map all pandas None/NaN values to SQL NULL. Is this possible?
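One common workaround (an assumption on my part, not a documented driver feature) is to cast the DataFrame to object dtype and replace every missing value with Python None before calling write_dataframe, so all three column types are sent as SQL NULL:

```python
import numpy as np
import pandas as pd

def nullify_missing(df: pd.DataFrame) -> pd.DataFrame:
    # astype(object) lets integer/float columns hold Python None;
    # where() keeps non-missing values and substitutes None elsewhere.
    return df.astype(object).where(df.notna(), None)

df = pd.DataFrame({
    "s": ["a", None],
    "i": [1.0, np.nan],
    "f": [2.5, np.nan],
})
clean = nullify_missing(df)
print(clean)
```

The trade-off is that every column becomes object dtype on the pandas side, so the driver has to infer the target Redshift types rather than reading them off the dtypes.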

LuchiLucs avatar Feb 04 '25 12:02 LuchiLucs

When reading back into a pandas DataFrame, for instance, the int64 column contains -9223372036854775808 where the missing data was, instead of np.nan, so the round trip is NOT reproducible.
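Until this is addressed, a read-side cleanup can map the sentinel back to NaN. This is a sketch assuming the sentinel really is -9223372036854775808 (INT64 min), as observed above:

```python
import numpy as np
import pandas as pd

INT64_MIN_SENTINEL = -9223372036854775808  # value observed for missing BIGINTs

def restore_missing(df: pd.DataFrame) -> pd.DataFrame:
    # Replace the sentinel with NaN; pandas upcasts affected
    # integer columns to float64 so they can hold the NaN.
    return df.replace(INT64_MIN_SENTINEL, np.nan)

raw = pd.DataFrame({"n": [1, INT64_MIN_SENTINEL, 3]})
fixed = restore_missing(raw)
print(fixed["n"].tolist())
```

Beware that this also clobbers any legitimate -9223372036854775808 values, which is exactly why a sentinel-based NULL encoding is lossy.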

LuchiLucs avatar Feb 04 '25 13:02 LuchiLucs