scratchdata icon indicating copy to clipboard operation
scratchdata copied to clipboard

Coerce BigQuery types when inserting

Open poundifdef opened this issue 11 months ago • 5 comments

The following inserts into bigquery will fail:

{"x": "y"}
{"x": true}
googleapi: Error 400: Invalid schema update. Field x has changed type from STRING to BOOLEAN, invalid

This is because when the table is first created, column x is a string. When the second JSON is inserted, x is a bool. We want to try and coerce data to match the column type if possible, otherwise use a null value. This will prevent inserts from failing outright.

BigQuery lists conversion rules here which may be helpful: https://cloud.google.com/bigquery/docs/reference/standard-sql/conversion_rules

  • [ ] Create new configuration option big BigQuery: strict_types bool. If it is set to true, then do not do any type conversion. If it is false, then do some sort of type conversion (either in SQL or on the go side) when inserting data.

poundifdef avatar Mar 20 '24 13:03 poundifdef

Hey @poundifdef is it okay if I look this over the weekend?

mohanish2504 avatar Mar 20 '24 16:03 mohanish2504

Hey @poundifdef is it okay if I look this over the weekend?

Feel free to make a PR. Just a heads-up, there is not a bounty on this one. (If you're looking for a bounty, #126 is available.)

poundifdef avatar Mar 20 '24 16:03 poundifdef

I am aware of it, I will look into this over weekend

mohanish2504 avatar Mar 20 '24 16:03 mohanish2504

Not sure how this will be handled during LOAD function.

Need to check each column being entered, if doesnot match the value then do conversion on go itself?

Does this logic looks good?

mohanish2504 avatar Mar 23 '24 12:03 mohanish2504

Could you CAST data in sql?

poundifdef avatar Mar 23 '24 13:03 poundifdef