python-bigquery-pandas

to_gbq() fails when sending a column full of <NA> type Int64 to NULLABLE INTEGER

Open Xemnas0 opened this issue 3 years ago • 0 comments

Environment details

  • OS type and version: Windows 10
  • Python version: 3.10.4
  • pip version: 22.0.3
  • pandas-gbq version: 0.17.6

Steps to reproduce

  1. Create a BigQuery table with a column "my_integers" of type INTEGER and mode NULLABLE
  2. Use to_gbq() to push a DataFrame whose "my_integers" column contains only <NA> values with dtype="Int64" to that table

Code example

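import pandas as pd

# One column containing only missing values, stored as nullable Int64.
df = pd.DataFrame([None, None, None], columns=["my_integers"]).astype("Int64")
# Upload to the table created in step 1 (arguments elided).
df.to_gbq(...)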
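
Before uploading, it can help to confirm locally that the column really is an all-<NA> Int64 column. The following is a minimal sketch that uses only pandas and makes no BigQuery call; it rebuilds the DataFrame from the report and inspects it.

```python
import pandas as pd

# Same construction as in the report: three missing values cast to nullable Int64.
df = pd.DataFrame([None, None, None], columns=["my_integers"]).astype("Int64")

# The column uses the nullable integer extension dtype...
print(df.dtypes["my_integers"])        # Int64
# ...and every value in it is missing.
print(df["my_integers"].isna().all())  # True
```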

Stack trace

File ...\.venv\lib\site-packages\pandas\core\frame.py:2054, in DataFrame.to_gbq(self, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials)
   1975 """
   1976 Write a DataFrame to a Google BigQuery table.
   1977 
   (...)
   2050 read_gbq : Read a DataFrame from Google BigQuery.
   2051 """
   2052 from pandas.io import gbq
-> 2054 gbq.to_gbq(
   2055     self,
   2056     destination_table,
   2057     project_id=project_id,
   2058     chunksize=chunksize,
   2059     reauth=reauth,
   2060     if_exists=if_exists,
   2061     auth_local_webserver=auth_local_webserver,
   2062     table_schema=table_schema,
   2063     location=location,
   2064     progress_bar=progress_bar,
   2065     credentials=credentials,
   2066 )

File ...\.venv\lib\site-packages\pandas\io\gbq.py:212, in to_gbq(dataframe, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials)
    198 def to_gbq(
    199     dataframe: DataFrame,
    200     destination_table: str,
   (...)
    209     credentials=None,
    210 ) -> None:
    211     pandas_gbq = _try_import()
--> 212     pandas_gbq.to_gbq(
    213         dataframe,
    214         destination_table,
    215         project_id=project_id,
    216         chunksize=chunksize,
    217         reauth=reauth,
    218         if_exists=if_exists,
    219         auth_local_webserver=auth_local_webserver,
    220         table_schema=table_schema,
    221         location=location,
    222         progress_bar=progress_bar,
    223         credentials=credentials,
    224     )

File ...\.venv\lib\site-packages\pandas_gbq\gbq.py:1179, in to_gbq(dataframe, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials, api_method, verbose, private_key)
   1177 else:
   1178     if not pandas_gbq.schema.schema_is_subset(original_schema, table_schema):
-> 1179         raise InvalidSchema(
   1180             "Please verify that the structure and "
   1181             "data types in the DataFrame match the "
   1182             "schema of the destination table.",
   1183             table_schema,
   1184             original_schema,
   1185         )
   1187     # Update the local `table_schema` so mode (NULLABLE/REQUIRED)
   1188     # matches. See: https://github.com/pydata/pandas-gbq/issues/315
   1189     table_schema = pandas_gbq.schema.update_schema(
   1190         table_schema, original_schema
   1191     )

InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table.
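
The traceback shows InvalidSchema being raised when pandas_gbq.schema.schema_is_subset(original_schema, table_schema) returns a false value. The sketch below is a simplified, hypothetical stand-in for that kind of check, not the library's implementation. The helper names clean_fields and is_subset are made up for illustration, and so is the FLOAT type in the mismatching case: the report does not say which type was inferred for the all-<NA> column.

```python
def clean_fields(fields):
    """Reduce each field to its name and upper-cased type (illustrative only)."""
    return [{"name": f["name"], "type": f["type"].upper()} for f in fields]


def is_subset(remote_fields, local_fields):
    """Return True if every local field also appears in the remote schema."""
    remote = clean_fields(remote_fields)
    return all(field in remote for field in clean_fields(local_fields))


# Destination table as described in the steps to reproduce.
remote = [{"name": "my_integers", "type": "INTEGER", "mode": "NULLABLE"}]

# Local schema that agrees with the table.
local_match = [{"name": "my_integers", "type": "INTEGER"}]
# Hypothetical local schema with a different type (FLOAT is an assumption).
local_mismatch = [{"name": "my_integers", "type": "FLOAT"}]

print(is_subset(remote, local_match))     # True
print(is_subset(remote, local_mismatch))  # False
```

In this simplified model, any difference between the type derived from the DataFrame and the type stored in the table makes the check fail, and a failed check of this kind is the condition under which the error above is raised.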

Xemnas0, Jun 10 '22 14:06