exposes row batch size config to the connector

Open akulgoel96 opened this issue 3 years ago • 1 comments

What

Currently, the ROW_BATCH_SIZE value is a constant defined in the code. In our case, it led to many "rate limit exceeded" errors while reading spreadsheets with 100,000+ rows.
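To make the scale of the problem concrete, here is a rough back-of-the-envelope sketch. It assumes one read request per batch of rows, which is a simplification and not taken from the connector code; `requests_needed` is a hypothetical helper used only for illustration.

```python
import math


def requests_needed(total_rows: int, row_batch_size: int) -> int:
    """Estimate how many batched read requests cover total_rows.

    Assumption: exactly one request is issued per batch of rows.
    """
    if row_batch_size <= 0:
        raise ValueError("row_batch_size must be a positive integer")
    return math.ceil(total_rows / row_batch_size)


# With the current constant of 200, a 100,000-row sheet needs 500 requests.
print(requests_needed(100_000, 200))   # 500
# A larger batch size (1000 here is just an example value) needs fewer.
print(requests_needed(100_000, 1000))  # 100
```

Under that assumption, a larger batch size directly reduces the number of requests made for a large sheet.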

How

This solution exposes the above variable as a connector configuration option, while keeping the default value at 200 to maintain backwards compatibility.
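A minimal sketch of the idea, not the actual diff in this PR: the batch size is read from the connector config and falls back to 200 when the option is absent. The config key `row_batch_size` and the helper `get_row_batch_size` are assumed names for illustration.

```python
from typing import Any, Mapping

# Default stated in this PR; existing configs without the option keep it.
DEFAULT_ROW_BATCH_SIZE = 200


def get_row_batch_size(config: Mapping[str, Any]) -> int:
    """Return the configured batch size, or the default if it is not set.

    "row_batch_size" is a hypothetical config key used for this sketch.
    """
    value = config.get("row_batch_size", DEFAULT_ROW_BATCH_SIZE)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError("row_batch_size must be a positive integer")
    return value


# An existing config without the option behaves exactly as before.
print(get_row_batch_size({"spreadsheet_id": "example"}))  # 200
# A user reading a very large sheet can override it.
print(get_row_batch_size({"row_batch_size": 1000}))       # 1000
```

Falling back to the default when the key is missing is what keeps existing connections unaffected.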

🚨 User Impact 🚨

As mentioned above, the change is backwards compatible because the default value is unchanged, so there should be no user impact.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • [x] Grant edit access to maintainers (instructions)
  • [x] Secrets in the connector's spec are annotated with airbyte_secret
  • [ ] Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • [ ] Code reviews completed
  • [ ] Documentation updated
    • [x] Connector's README.md
    • [ ] Connector's bootstrap.md. See description and examples
    • [ ] Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • [ ] PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • [ ] Create a non-forked branch based on this PR and test the below items on it
  • [ ] Build is successful
  • [ ] If new credentials are required for use in CI, add them to GSM. Instructions.
  • [ ] /test connector=connectors/<name> command is passing
  • [ ] New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

akulgoel96 avatar Jul 28 '22 14:07 akulgoel96

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Jul 28 '22 14:07 CLAassistant

/test connector=connectors/source-google-sheets

:clock2: connectors/source-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2776523571
:x: connectors/source-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2776523571
:bug:

sajarin avatar Aug 01 '22 17:08 sajarin

@misteryeo can you check this change in the specification of Source Google Sheets?

marcosmarxm avatar Aug 01 '22 19:08 marcosmarxm

Aren't there any Google docs that recommend a better row batch size?

marcosmarxm avatar Aug 01 '22 19:08 marcosmarxm

/test connector=connectors/source-google-sheets

:clock2: connectors/source-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2791172104
:white_check_mark: connectors/source-google-sheets https://github.com/airbytehq/airbyte/actions/runs/2791172104

Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        81      6    93%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/tests/test_incremental.py       121     25    79%
source_acceptance_test/utils/common.py                  77     17    78%
source_acceptance_test/tests/test_core.py              328    121    63%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
------------------------------------------------------------------------
TOTAL                                                  995    261    74%

Name                                                Stmts   Miss  Cover
-----------------------------------------------------------------------
google_sheets_source/models/spreadsheet_values.py      12      0   100%
google_sheets_source/models/spreadsheet.py             34      0   100%
google_sheets_source/models/__init__.py                 2      0   100%
google_sheets_source/__init__.py                        2      0   100%
google_sheets_source/helpers.py                       139     26    81%
google_sheets_source/client.py                         23      6    74%
google_sheets_source/google_sheets_source.py          107     85    21%
-----------------------------------------------------------------------
TOTAL                                                 319    117    63%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
============= 27 passed, 1 skipped, 1 warning in 123.34s (0:02:03) =============

marcosmarxm avatar Aug 03 '22 17:08 marcosmarxm

/publish connector=connectors/source-google-sheets

:clock2: Publishing the following connectors:
connectors/source-google-sheets
https://github.com/airbytehq/airbyte/actions/runs/2791253667

Connector Did it publish? Were definitions generated?
connectors/source-google-sheets :x: :x:

If you have connectors that published successfully but failed definition generation, follow step 4 here ▶️

marcosmarxm avatar Aug 03 '22 17:08 marcosmarxm

/publish connector=connectors/source-google-sheets

:clock2: Publishing the following connectors:
connectors/source-google-sheets
https://github.com/airbytehq/airbyte/actions/runs/2791363235

Connector Did it publish? Were definitions generated?
connectors/source-google-sheets :white_check_mark: :white_check_mark:

If you have connectors that published successfully but failed definition generation, follow step 4 here ▶️

marcosmarxm avatar Aug 03 '22 17:08 marcosmarxm