
🐛 Source Amazon Ads: Improve report streams date-range generation

Open grubberr opened this issue 2 years ago • 6 comments

Signed-off-by: Sergey Chvalyuk [email protected]

What

  1. For reporting streams, stream_slices now generates a separate slice for each profile, because every profile calculates its statistics in its own timezone and we cannot mix them together. For example, at the same moment pendulum.now() can give "reportDate": "20210731" for one profile and "reportDate": "20210801" for another.

  2. SponsoredProductsReportStream has a new field, updatedAt, which is now part of primary_key:

    - primary_key = ["profileId", "recordType", "reportDate"]
    + primary_key = ["profileId", "recordType", "reportDate", "updatedAt"]
    
  3. We now use a fairly complex algorithm to find report_start_date. It depends on many parameters: start_date from the config, last_sync_date from the state, today, LOOK_BACK_WINDOW=3 days, and the 60 days within which old reports can be requested.
    We need to take all of them into account to find the right report_start_date.

  4. Remove any reference to "SANDBOX" environment because of this: https://advertising.amazon.com/API/docs/en-us/info/release-notes#sandbox-deprecation-on-june-28-2022
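
To illustrate item 1, here is a minimal sketch of why each profile needs its own slice. It is not the connector's code: the profile dictionaries, the `report_date_for_profile` helper, and the chosen timezones are assumptions made for the example, and the standard-library `zoneinfo` module is used in place of pendulum to keep the snippet self-contained.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

REPORT_DATE_FORMAT = "%Y%m%d"


def report_date_for_profile(instant: datetime, profile_timezone: str) -> str:
    """Format an instant as a reportDate in the given profile's timezone."""
    return instant.astimezone(ZoneInfo(profile_timezone)).strftime(REPORT_DATE_FORMAT)


# Hypothetical profiles; real profiles come from the Amazon Ads API.
profiles = [
    {"profileId": 1, "timezone": "America/Los_Angeles"},
    {"profileId": 2, "timezone": "Asia/Tokyo"},
]

# One single moment in time, expressed in UTC.
instant = datetime(2021, 8, 1, 2, 0, tzinfo=timezone.utc)

# One slice per profile, each with "today" computed in that profile's timezone.
slices = [
    {
        "profileId": p["profileId"],
        "reportDate": report_date_for_profile(instant, p["timezone"]),
    }
    for p in profiles
]

for s in slices:
    print(s)
```

At this instant the first profile is still on 20210731 while the second is already on 20210801, so a single shared date range would be wrong for one of them.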

How

Describe the solution

Recommended reading order

  1. x.java
  2. y.python

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • [x] Grant edit access to maintainers (instructions)
  • [x] Secrets in the connector's spec are annotated with airbyte_secret
  • [x] Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • [ ] Code reviews completed
  • [ ] Documentation updated
    • [ ] Connector's README.md
    • [ ] Connector's bootstrap.md. See description and examples
    • [x] Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • [x] PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • [x] Create a non-forked branch based on this PR and test the below items on it
  • [x] Build is successful
  • [x] If new credentials are required for use in CI, add them to GSM. Instructions.
  • [x] /test connector=connectors/<name> command is passing
  • [ ] New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

grubberr avatar Jul 26 '22 05:07 grubberr

/test connector=connectors/source-amazon-ads

:clock2: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2752904185 :x: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2752904185 :bug: https://gradle.com/s/bxc35nkyv3gry

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestBasicRead::test_read[inputs0] - docker.errors.Contai...
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
============= 2 failed, 22 passed, 1 skipped in 3442.66s (0:57:22) =============

grubberr avatar Jul 28 '22 09:07 grubberr

/test connector=connectors/source-amazon-ads

grubberr avatar Jul 28 '22 17:07 grubberr

/test connector=connectors/source-amazon-ads

grubberr avatar Jul 28 '22 17:07 grubberr

/test connector=connectors/source-amazon-ads

:clock2: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2755740785 :x: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2755740785 :bug: https://gradle.com/s/aqiflyutcfrdo

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs1]
ERROR test_incremental.py::TestIncremental::test_state_with_abnormally_large_values[inputs0]
============== 1 failed, 28 passed, 1 error in 3226.60s (0:53:46) ==============

grubberr avatar Jul 28 '22 17:07 grubberr

/test connector=connectors/source-amazon-ads

:clock2: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2756122726 :white_check_mark: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2756122726 Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        77      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/tests/test_incremental.py       121     25    79%
source_acceptance_test/utils/common.py                  77     17    78%
source_acceptance_test/tests/test_core.py              307    106    65%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
------------------------------------------------------------------------
TOTAL                                                  970    246    75%
Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
source_amazon_ads/streams/sponsored_products.py                      32      0   100%
source_amazon_ads/streams/sponsored_display.py                       22      0   100%
source_amazon_ads/streams/sponsored_brands.py                        17      0   100%
source_amazon_ads/streams/report_streams/products_report.py          19      0   100%
source_amazon_ads/streams/report_streams/display_report.py           16      0   100%
source_amazon_ads/streams/report_streams/brands_video_report.py      10      0   100%
source_amazon_ads/streams/report_streams/brands_report.py            10      0   100%
source_amazon_ads/streams/report_streams/__init__.py                  5      0   100%
source_amazon_ads/streams/profiles.py                                21      0   100%
source_amazon_ads/streams/__init__.py                                 6      0   100%
source_amazon_ads/schemas/sponsored_products.py                      37      0   100%
source_amazon_ads/schemas/sponsored_display.py                       31      0   100%
source_amazon_ads/schemas/sponsored_brands.py                        22      0   100%
source_amazon_ads/schemas/profile.py                                 16      0   100%
source_amazon_ads/schemas/__init__.py                                 6      0   100%
source_amazon_ads/constants.py                                        6      0   100%
source_amazon_ads/__init__.py                                         2      0   100%
source_amazon_ads/streams/common.py                                  76      1    99%
source_amazon_ads/schemas/common.py                                  51      1    98%
source_amazon_ads/source.py                                          34      1    97%
source_amazon_ads/streams/report_streams/report_streams.py          220     14    94%
-------------------------------------------------------------------------------------
TOTAL                                                               659     17    97%

Build Passed

Test summary info:

All Passed

grubberr avatar Jul 28 '22 19:07 grubberr

/test connector=connectors/source-amazon-ads

:clock2: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2760246830 :white_check_mark: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2760246830 Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        77      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/tests/test_incremental.py       121     25    79%
source_acceptance_test/utils/common.py                  77     17    78%
source_acceptance_test/tests/test_core.py              307    106    65%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
------------------------------------------------------------------------
TOTAL                                                  970    246    75%
Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
source_amazon_ads/streams/sponsored_products.py                      32      0   100%
source_amazon_ads/streams/sponsored_display.py                       22      0   100%
source_amazon_ads/streams/sponsored_brands.py                        17      0   100%
source_amazon_ads/streams/report_streams/products_report.py          19      0   100%
source_amazon_ads/streams/report_streams/display_report.py           16      0   100%
source_amazon_ads/streams/report_streams/brands_video_report.py      10      0   100%
source_amazon_ads/streams/report_streams/brands_report.py            10      0   100%
source_amazon_ads/streams/report_streams/__init__.py                  5      0   100%
source_amazon_ads/streams/profiles.py                                21      0   100%
source_amazon_ads/streams/__init__.py                                 6      0   100%
source_amazon_ads/schemas/sponsored_products.py                      37      0   100%
source_amazon_ads/schemas/sponsored_display.py                       31      0   100%
source_amazon_ads/schemas/sponsored_brands.py                        22      0   100%
source_amazon_ads/schemas/profile.py                                 16      0   100%
source_amazon_ads/schemas/__init__.py                                 6      0   100%
source_amazon_ads/constants.py                                        6      0   100%
source_amazon_ads/__init__.py                                         2      0   100%
source_amazon_ads/streams/common.py                                  76      1    99%
source_amazon_ads/schemas/common.py                                  51      1    98%
source_amazon_ads/source.py                                          34      1    97%
source_amazon_ads/streams/report_streams/report_streams.py          220     14    94%
-------------------------------------------------------------------------------------
TOTAL                                                               659     17    97%

Build Passed

Test summary info:

All Passed

grubberr avatar Jul 29 '22 11:07 grubberr

@misteryeo @YowanR Can you please review the documentation changes in amazon-ads.md? I am preparing amazon-ads for GA and have made changes to the documentation.

grubberr avatar Aug 01 '22 18:08 grubberr

/test connector=connectors/source-amazon-ads

:clock2: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2776932640 :x: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2776932640 :bug: https://gradle.com/s/tnls6sszvdeem

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED unit_tests/test_report_streams.py::test_display_report_stream_init_http_exception
FAILED unit_tests/test_report_streams.py::test_display_report_stream_init_too_many_requests
================== 2 failed, 39 passed, 699 warnings in 2.62s ==================

grubberr avatar Aug 01 '22 18:08 grubberr

/test connector=connectors/source-amazon-ads

:clock2: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2781879394 :x: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2781879394 :bug: https://gradle.com/s/xgj4b7znpnbmo

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs1]
============= 1 failed, 30 passed, 1 warning in 3765.00s (1:02:45) =============

grubberr avatar Aug 02 '22 11:08 grubberr

/test connector=connectors/source-amazon-ads

:clock2: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2782307726 :x: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2782307726 :bug: https://gradle.com/s/au5ihiwwd7dmc

Build Failed

Test summary info:

Could not find result summary

grubberr avatar Aug 02 '22 12:08 grubberr

/test connector=connectors/source-amazon-ads

:clock2: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2783025309 :white_check_mark: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2783025309 Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        81      6    93%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/tests/test_incremental.py       121     25    79%
source_acceptance_test/utils/common.py                  77     17    78%
source_acceptance_test/tests/test_core.py              328    121    63%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
------------------------------------------------------------------------
TOTAL                                                  995    261    74%
Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
source_amazon_ads/streams/sponsored_products.py                      32      0   100%
source_amazon_ads/streams/sponsored_display.py                       22      0   100%
source_amazon_ads/streams/sponsored_brands.py                        17      0   100%
source_amazon_ads/streams/report_streams/products_report.py          19      0   100%
source_amazon_ads/streams/report_streams/display_report.py           16      0   100%
source_amazon_ads/streams/report_streams/brands_video_report.py      10      0   100%
source_amazon_ads/streams/report_streams/brands_report.py            10      0   100%
source_amazon_ads/streams/report_streams/__init__.py                  5      0   100%
source_amazon_ads/streams/profiles.py                                21      0   100%
source_amazon_ads/streams/__init__.py                                 6      0   100%
source_amazon_ads/schemas/sponsored_products.py                      37      0   100%
source_amazon_ads/schemas/sponsored_display.py                       31      0   100%
source_amazon_ads/schemas/sponsored_brands.py                        22      0   100%
source_amazon_ads/schemas/profile.py                                 16      0   100%
source_amazon_ads/schemas/__init__.py                                 6      0   100%
source_amazon_ads/constants.py                                        6      0   100%
source_amazon_ads/__init__.py                                         2      0   100%
source_amazon_ads/streams/common.py                                  76      1    99%
source_amazon_ads/schemas/common.py                                  51      1    98%
source_amazon_ads/source.py                                          34      1    97%
source_amazon_ads/streams/report_streams/report_streams.py          222     16    93%
-------------------------------------------------------------------------------------
TOTAL                                                               661     19    97%

Build Passed

Test summary info:

All Passed

grubberr avatar Aug 02 '22 14:08 grubberr

/test connector=connectors/source-amazon-ads

:clock2: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2799537546 :white_check_mark: connectors/source-amazon-ads https://github.com/airbytehq/airbyte/actions/runs/2799537546 Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        81      6    93%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/tests/test_incremental.py       121     25    79%
source_acceptance_test/utils/common.py                  77     17    78%
source_acceptance_test/tests/test_core.py              328    121    63%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
------------------------------------------------------------------------
TOTAL                                                  995    261    74%
Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
source_amazon_ads/streams/sponsored_products.py                      32      0   100%
source_amazon_ads/streams/sponsored_display.py                       22      0   100%
source_amazon_ads/streams/sponsored_brands.py                        17      0   100%
source_amazon_ads/streams/report_streams/products_report.py          19      0   100%
source_amazon_ads/streams/report_streams/display_report.py           16      0   100%
source_amazon_ads/streams/report_streams/brands_video_report.py      10      0   100%
source_amazon_ads/streams/report_streams/brands_report.py            10      0   100%
source_amazon_ads/streams/report_streams/__init__.py                  5      0   100%
source_amazon_ads/streams/profiles.py                                21      0   100%
source_amazon_ads/streams/__init__.py                                 6      0   100%
source_amazon_ads/schemas/sponsored_products.py                      37      0   100%
source_amazon_ads/schemas/sponsored_display.py                       31      0   100%
source_amazon_ads/schemas/sponsored_brands.py                        22      0   100%
source_amazon_ads/schemas/profile.py                                 16      0   100%
source_amazon_ads/schemas/__init__.py                                 6      0   100%
source_amazon_ads/constants.py                                        6      0   100%
source_amazon_ads/__init__.py                                         2      0   100%
source_amazon_ads/streams/common.py                                  76      1    99%
source_amazon_ads/schemas/common.py                                  51      1    98%
source_amazon_ads/source.py                                          34      1    97%
source_amazon_ads/streams/report_streams/report_streams.py          209     20    90%
-------------------------------------------------------------------------------------
TOTAL                                                               648     23    96%

Build Passed

Test summary info:

All Passed

grubberr avatar Aug 04 '22 20:08 grubberr

@girarda can you please review again? I have re-implemented the general idea.

Now get_start_date has pretty simple logic, but there is more complex logic inside get_updated_state.

grubberr avatar Aug 04 '22 20:08 grubberr

/publish connector=connectors/source-amazon-ads

:clock2: Publishing the following connectors:
connectors/source-amazon-ads
https://github.com/airbytehq/airbyte/actions/runs/2801702502

Connector Did it publish? Were definitions generated?
connectors/source-amazon-ads :white_check_mark: :white_check_mark:

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

grubberr avatar Aug 05 '22 06:08 grubberr

Hey @grubberr ,

Could you help me to understand the difference between Start Date and Lookback window?

From reading the comments here, it sounds like we use the Lookback window first (data from today - Lookback window will be generated) and if the Lookback window is blank, then we use the Start Date (if defined). Is that the correct interpretation?

nataliekwong avatar May 30 '23 23:05 nataliekwong

@nataliekwong

  1. About the Lookback window: if the customer has not specified a lookback in the config, lookback = 3 is used as the default value.
  2. The lookback is only applied on the second and subsequent syncs. On the first sync the lookback is not applied; the first sync always starts from the Start Date config option.
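
The two rules above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the connector's actual get_start_date: the function name and signature are hypothetical, falling back to today when Start Date is blank is an assumption, subtracting the lookback from the last synced date (rather than from today) is an assumption, and so is clamping the result to the 60-day limit mentioned in the PR description.

```python
from datetime import date, timedelta
from typing import Optional

DEFAULT_LOOK_BACK_WINDOW = 3  # days, the default stated in the reply above
REPORT_RETENTION_DAYS = 60    # "60 days to request old reports" from the PR description


def report_start_date(
    today: date,
    config_start_date: Optional[date],
    last_sync_date: Optional[date],
    look_back_window: Optional[int] = None,
) -> date:
    """Hypothetical helper: pick the first report date to request."""
    look_back = DEFAULT_LOOK_BACK_WINDOW if look_back_window is None else look_back_window
    oldest_allowed = today - timedelta(days=REPORT_RETENTION_DAYS)

    if last_sync_date is None:
        # First sync: the lookback is not applied, start from Start Date.
        # Assumption: a blank Start Date means "start from today".
        start = config_start_date or today
    else:
        # Second and later syncs: re-read the last `look_back` days.
        # Assumption: the window is counted back from the last synced date.
        start = last_sync_date - timedelta(days=look_back)

    # Assumption: never request reports older than the retention limit,
    # and never start in the future.
    return min(max(start, oldest_allowed), today)


today = date(2023, 5, 31)
print(report_start_date(today, date(2023, 4, 15), None))              # first sync
print(report_start_date(today, date(2023, 4, 15), date(2023, 5, 30)))  # later sync
```

Under these assumptions a very large lookback (for example 100 days) would be cut off at the 60-day limit instead of reaching further back.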

grubberr avatar May 31 '23 11:05 grubberr

@grubberr Does the Lookback window use the same logic to filter the results as the Start date in that it is used to determine how far back the data will be synced from? Could you verify the examples here would act in the way you've described?

Example 1:
Settings: Start Date is blank (default value) & Lookback window is 3 (default value)
First sync: Today's data is synced
Subsequent syncs: Data from the sync date - 3 days is synced

Example 2:
Settings: Start Date is April 15, 2023 & Lookback window is 3 (default value)
First sync: Data from April 15, 2023 is synced
Subsequent syncs: Data from the sync date - 3 days is synced

Example 3:
Settings: Start Date is blank (default value) & Lookback window is 100
First sync: Today's data is synced
Subsequent syncs: Data from the sync date - 100 days is synced

Example 4:
Settings: Start Date is April 15, 2023 & Lookback window is 100
First sync: Data from April 15, 2023 is synced
Subsequent syncs: Data from the sync date - 100 days is synced

nataliekwong avatar May 31 '23 18:05 nataliekwong

@grubberr Could you confirm my understanding is correct above?

nataliekwong avatar Jun 05 '23 22:06 nataliekwong

@grubberr I see some detail in the Stripe docs about the Start Date and Lookback window. Is the behavior similar for this connector?

nataliekwong avatar Jun 12 '23 14:06 nataliekwong