airbyte
exposes row batch size config to the connector
What
Currently, ROW_BATCH_SIZE is a constant defined in the code. In our case, this led to many "rate limit exceeded" errors while reading spreadsheets with 100,000+ rows.
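To make the scale of the problem concrete, here is a rough sketch of how the number of read requests grows with sheet size. It assumes the connector issues one request per batch of rows; `requests_needed` is a hypothetical helper for illustration, not a function from the connector.

```python
import math


def requests_needed(total_rows: int, row_batch_size: int) -> int:
    """Return how many batched reads are needed to cover total_rows.

    Assumption: exactly one API request is made per batch of rows.
    """
    if row_batch_size <= 0:
        raise ValueError("row_batch_size must be a positive integer")
    return math.ceil(total_rows / row_batch_size)


# Compare the default batch size (200) with two larger, illustrative values.
for batch_size in (200, 1000, 5000):
    print(batch_size, requests_needed(100_000, batch_size))
```

Under that assumption, a 100,000-row sheet takes 500 requests at the default batch size of 200, versus 100 requests at 1,000 rows per batch.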
How
This change exposes ROW_BATCH_SIZE as a connector configuration option, while keeping the default value of 200 to maintain backwards compatibility.
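A minimal sketch of the backwards-compatible lookup described above. The config key `row_batch_size` and the helper `get_row_batch_size` are assumptions made for illustration; the actual names used in the connector may differ.

```python
from typing import Any, Mapping

# Default stated in this PR; used whenever the user does not set the option.
DEFAULT_ROW_BATCH_SIZE = 200


def get_row_batch_size(config: Mapping[str, Any]) -> int:
    """Read the batch size from the connector config, falling back to the default.

    The key name "row_batch_size" is hypothetical.
    """
    value = config.get("row_batch_size", DEFAULT_ROW_BATCH_SIZE)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError("row_batch_size must be a positive integer")
    return value


# Existing configs without the key keep the old behaviour.
print(get_row_batch_size({}))
# New configs can raise the batch size for very large sheets.
print(get_row_batch_size({"row_batch_size": 1000}))
```

Because the fallback is the same value as the old constant, configs created before this change behave exactly as they did.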
🚨 User Impact 🚨
As mentioned above, the change is backwards compatible because the default value is unchanged, so there should be no user impact.
Pre-merge Checklist
Expand the relevant checklist and delete the others.
Updating a connector
Community member or Airbyter
- [x] Grant edit access to maintainers (instructions)
- [x] Secrets in the connector's spec are annotated with `airbyte_secret`
- [ ] Unit & integration tests added and passing. Community members, please provide proof of success locally, e.g. screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For Java connectors run `./gradlew :airbyte-integrations:connectors:<name>:integrationTest`.
- [ ] Code reviews completed
- [ ] Documentation updated
  - [x] Connector's `README.md`
  - [ ] Connector's `bootstrap.md`. See description and examples
  - [ ] Changelog updated in `docs/integrations/<source or destination>/<name>.md` including changelog. See changelog example
- [ ] PR name follows PR naming conventions
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
- [ ] Create a non-forked branch based on this PR and test the below items on it
- [ ] Build is successful
- [ ] If new credentials are required for use in CI, add them to GSM. Instructions.
- [ ] `/test connector=connectors/<name>` command is passing
- [ ] New Connector version released on Dockerhub and connector version bumped by running the `/publish` command described here
Tests
Unit
Put your unit tests output here.
Integration
Put your integration tests output here.
Acceptance
Put your acceptance tests output here.
/test connector=connectors/source-google-sheets
:clock2: connectors/source-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2776523571
:x: connectors/source-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2776523571 :bug:
@misteryeo can you check this change to the specification of Source Google Sheets?
Isn't there any Google documentation that recommends a better row batch size?
/test connector=connectors/source-google-sheets
:clock2: connectors/source-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2791172104
:white_check_mark: connectors/source-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2791172104
Python tests coverage:
Name Stmts Miss Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py 6 0 100%
source_acceptance_test/tests/__init__.py 4 0 100%
source_acceptance_test/__init__.py 2 0 100%
source_acceptance_test/tests/test_full_refresh.py 52 2 96%
source_acceptance_test/utils/asserts.py 37 2 95%
source_acceptance_test/config.py 81 6 93%
source_acceptance_test/utils/json_schema_helper.py 105 13 88%
source_acceptance_test/tests/test_incremental.py 121 25 79%
source_acceptance_test/utils/common.py 77 17 78%
source_acceptance_test/tests/test_core.py 328 121 63%
source_acceptance_test/utils/compare.py 62 23 63%
source_acceptance_test/base.py 10 4 60%
source_acceptance_test/utils/connector_runner.py 110 48 56%
------------------------------------------------------------------------
TOTAL 995 261 74%
Name Stmts Miss Cover
-----------------------------------------------------------------------
google_sheets_source/models/spreadsheet_values.py 12 0 100%
google_sheets_source/models/spreadsheet.py 34 0 100%
google_sheets_source/models/__init__.py 2 0 100%
google_sheets_source/__init__.py 2 0 100%
google_sheets_source/helpers.py 139 26 81%
google_sheets_source/client.py 23 6 74%
google_sheets_source/google_sheets_source.py 107 85 21%
-----------------------------------------------------------------------
TOTAL 319 117 63%
Build Passed
Test summary info:
=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
============= 27 passed, 1 skipped, 1 warning in 123.34s (0:02:03) =============
/publish connector=connectors/source-google-sheets
:clock2: Publishing the following connectors:
connectors/source-google-sheets
https://github.com/airbytehq/airbyte/actions/runs/2791253667
| Connector | Did it publish? | Were definitions generated? |
|---|---|---|
| connectors/source-google-sheets | :x: | :x: |
if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️
/publish connector=connectors/source-google-sheets
:clock2: Publishing the following connectors:
connectors/source-google-sheets
https://github.com/airbytehq/airbyte/actions/runs/2791363235
| Connector | Did it publish? | Were definitions generated? |
|---|---|---|
| connectors/source-google-sheets | :white_check_mark: | :white_check_mark: |
if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️