
WIP: resolve bigquery cache datetime format issue

Open aaronsteers opened this issue 6 months ago • 12 comments

Reported in slack:

  • https://airbytehq-team.slack.com/archives/C06FZ238P8W/p1723145063648579?thread_ts=1723118180.084029&cid=C06FZ238P8W

Log from repro in new "example" script:

```
google.api_core.exceptions.BadRequest: 400 Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.; reason: invalid, message: Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.; reason: invalid, message: Error while reading data, error message: JSON processing encountered too many errors, giving up. Rows: 1; errors: 1; max bad: 0; error percent: 0; reason: invalid, message: Error while reading data, error message: JSON parsing error in row starting at position 0: Couldn't convert value to timestamp: Could not parse '2020-10-22T21:03:23.000+0000' as a timestamp. Required format is YYYY-MM-DD HH:MM[:SS[.SSSSSS]] or YYYY/MM/DD HH:MM[:SS[.SSSSSS]] Field: createddate; Value: 2020-10-22T21:03:23.000+0000
```

BigQuery expects a colon in the timezone offset of timestamp values: `+00:00` instead of `+0000`.
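One possible workaround is to normalize such timestamps before loading. This is only a sketch, not PyAirbyte's actual fix; the helper name `to_bigquery_timestamp` is hypothetical. It relies on Python's `datetime.strptime` accepting a colon-less `%z` offset (Python 3.7+) and on `isoformat()` emitting the `+00:00` form BigQuery accepts:

```python
from datetime import datetime

def to_bigquery_timestamp(value: str) -> str:
    """Hypothetical helper: reparse a '+0000'-style timestamp and
    re-emit it in ISO 8601 with a colon in the offset ('+00:00')."""
    # %z parses offsets both with and without a colon on Python 3.7+.
    dt = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")
    # isoformat() renders the offset with a colon, which BigQuery accepts.
    return dt.isoformat()

# The value from the error log above:
print(to_bigquery_timestamp("2020-10-22T21:03:23.000+0000"))
# → 2020-10-22T21:03:23+00:00
```

Note that `isoformat()` drops the fractional seconds when they are zero; BigQuery's accepted format (`YYYY-MM-DD HH:MM[:SS[.SSSSSS]]`) treats them as optional.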

Summary by CodeRabbit

  • New Features
    • Introduced a script for transferring data securely from Salesforce to Google BigQuery using the Airbyte framework.
    • Added functionality for caching data during the transfer process, ensuring smooth data migration.
    • Enhanced reporting with stream names and record counts for better data visibility.
    • Added a new script specifically for facilitating data migration from Salesforce to a BigQuery destination.

aaronsteers — Aug 09 '24 02:08