airbyte icon indicating copy to clipboard operation
airbyte copied to clipboard

[source-google-sheets] silently fails when reaching requests limit

Open henriquemeloo opened this issue 1 year ago • 10 comments

Connector Name

source-google-sheets

Connector Version

0.4.0

What step the error happened?

During the sync

Relevant information

When the connector reaches rate limits and cannot continue the sync, it exits as success, so you have a successful job and incorrect data in the destination.

Relevant log output

2024-03-28 21:07:54 source > Fetching range digital_media!38857:39177
2024-03-28 21:07:54 source > Backing off get_values(...) for 0.8s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:54 source > Increasing number of records fetching due to rate limits. Current value: 330
2024-03-28 21:07:55 source > Fetching range digital_media!38857:39187
2024-03-28 21:07:55 source > Backing off get_values(...) for 1.7s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:55 source > Increasing number of records fetching due to rate limits. Current value: 340
2024-03-28 21:07:56 source > Fetching range digital_media!38857:39197
2024-03-28 21:07:56 source > Backing off get_values(...) for 0.8s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:56 source > Increasing number of records fetching due to rate limits. Current value: 350
2024-03-28 21:07:57 source > Fetching range digital_media!38857:39207
2024-03-28 21:07:57 source > Backing off get_values(...) for 2.1s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:57 source > Increasing number of records fetching due to rate limits. Current value: 360
2024-03-28 21:07:59 source > Fetching range digital_media!38857:39217
2024-03-28 21:07:59 source > Backing off get_values(...) for 5.9s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:59 source > Increasing number of records fetching due to rate limits. Current value: 370
2024-03-28 21:08:05 source > Fetching range digital_media!38857:39227
2024-03-28 21:08:05 source > Backing off get_values(...) for 5.4s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:08:05 source > Increasing number of records fetching due to rate limits. Current value: 380
2024-03-28 21:08:11 source > Fetching range digital_media!38857:39237
2024-03-28 21:08:11 source > Backing off get_values(...) for 2.0s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:08:11 source > Increasing number of records fetching due to rate limits. Current value: 390
2024-03-28 21:08:13 source > Fetching range digital_media!38857:39247
2024-03-28 21:08:13 source > Backing off get_values(...) for 100.8s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:08:13 source > Increasing number of records fetching due to rate limits. Current value: 400
2024-03-28 21:09:54 source > Fetching range digital_media!38857:39257
2024-03-28 21:09:54 source > Giving up get_values(...) after 9 tries (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:09:54 source > Stopped syncing process due to rate limits. Rate limit has been reached. Please try later or request a higher quota for your account.
2024-03-28 21:09:54 source > Finished syncing spreadsheet {spreadsheet_id}

Contribute

  • [ ] Yes, I want to contribute

henriquemeloo avatar Apr 08 '24 19:04 henriquemeloo

Here: https://github.com/airbytehq/airbyte/blob/5ea2b93c38ee8aaf5e4a161a82b4d6ee5513ac1c/airbyte-integrations/connectors/source-google-sheets/source_google_sheets/source.py#L227 we can see that this exception is logged and not re-raised, as it maybe should be, so that the sync would exit as failed.

henriquemeloo avatar Apr 08 '24 19:04 henriquemeloo

We're rather actively looking into cases like this, and have a specific new error type for these — it definitely should not be just swallowed. Thank you for raising this!

natikgadzhi avatar Apr 09 '24 05:04 natikgadzhi

/cc @maxi297 and @ChristoGrab, and @brianjlai — perhaps google sheets could be a good candidate for a cleanup once we release error handlers in low-code CDK and Python CDK sane defaults.

natikgadzhi avatar Apr 09 '24 05:04 natikgadzhi

@natikgadzhi maybe https://github.com/airbytehq/airbyte/pull/35404 could help?

marcosmarxm avatar Apr 10 '24 11:04 marcosmarxm

@natikgadzhi maybe #35404 could help?

It could help, but does not guarantee we won't still be swallowing these errors when they happen, which is, in my opinion, a critical problem, since this is leading to (very) incomplete syncs showing up as successful.

henriquemeloo avatar Apr 10 '24 17:04 henriquemeloo

Yeah, if we're silently giving up but not raising the transient error to the platform, it's bad. We just introduced transient (rate limit) errors in the protocol, and @maxi297 and @katmarkham's team are looking in implementing them and cleaning up the highest usage connectors, which this is one of.

So — we will get to this rather quickly. @henriquemeloo, if we gave you an example of a fix in another connector, would you be interested in trying to put together a PR?

natikgadzhi avatar Apr 10 '24 23:04 natikgadzhi

@henriquemeloo can you try latest version of Google Sheets connector. It has the batch size, maybe you can increase the value to not hit the 429 error

marcosmarxm avatar Apr 11 '24 15:04 marcosmarxm

@natikgadzhi yes. I could try to implement it. I'd appreciate if you could send the example.

henriquemeloo avatar Apr 11 '24 21:04 henriquemeloo

@strosek, @ChristoGrab, do you happen to have an example of error improvements work that you're doing now to highlight how to exit with non-zero exit code and send an error message for transient errors?

natikgadzhi avatar Apr 19 '24 00:04 natikgadzhi

@natikgadzhi @strosek @ChristoGrab do you have an example of how this is being handled so I could try to contribute with this fix?

henriquemeloo avatar May 16 '24 17:05 henriquemeloo