GCS data store raise exception TooManyRequests

Open viplazylmht opened this issue 2 years ago • 2 comments

Describe the bug When storing data docs in a GCS bucket, I sometimes get an HTTP request error: 429 TooManyRequests: rateLimitExceeded.

The problem is that GCS allows each object to be updated or modified only once per second, and the file index.html needs to be updated multiple times because I use Airflow to schedule tasks and each run rebuilds the data docs.

Expected behavior Following the Google API docs, I would like Great Expectations to implement a retry strategy when building docs, supporting at least GCS first.
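
To make the request concrete, here is a minimal sketch of the kind of retry I have in mind: truncated exponential backoff with random jitter, which is the general approach Google recommends for 429 responses. This is not Great Expectations code. The names `retry_with_backoff` and `RateLimitError` are hypothetical; `RateLimitError` is only a stand-in for `google.api_core.exceptions.TooManyRequests` so the snippet runs without the GCS client installed, and the default delays are assumptions.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for google.api_core.exceptions.TooManyRequests."""


def retry_with_backoff(func, retryable=(RateLimitError,), max_attempts=5,
                       initial_delay=1.0, multiplier=2.0, max_delay=32.0,
                       sleep=time.sleep):
    """Call func(), retrying on `retryable` errors with truncated exponential backoff."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Base delay plus up to one second of jitter, so that tasks
            # scheduled at the same moment do not retry in lockstep.
            sleep(delay + random.uniform(0, 1))
            # Grow the base delay, but never beyond max_delay (the truncation).
            delay = min(delay * multiplier, max_delay)
```

As a workaround on the caller side, the same helper could presumably wrap the failing call from the traceback, for example `retry_with_backoff(lambda: data_context.build_data_docs(site_names=build_sites), retryable=(TooManyRequests,))`, with `TooManyRequests` imported from `google.api_core.exceptions`. Starting at one second matches the once-per-second write limit on a single object.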

Environment:

Traceback

[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "app.py", line 163, in validate
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     data_context.build_data_docs(site_names=build_sites, resource_identifiers=result.list_validation_result_identifiers())
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/great_expectations/core/usage_statistics/usage_statistics.py", line 287, in usage_statistics_wrapped_method
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     result = func(*args, **kwargs)
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/great_expectations/data_context/data_context/base_data_context.py", line 2645, in build_data_docs
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     build_index=(build_index and not self.ge_cloud_mode),
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/great_expectations/render/renderer/site_builder.py", line 313, in build
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     _, index_links_dict = self.site_index_builder.build(build_index=build_index)
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/great_expectations/render/renderer/site_builder.py", line 769, in build
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     return self.target_store.write_index_page(viewable_content), index_links_dict
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/great_expectations/data_context/store/html_site_store.py", line 366, in write_index_page
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     content_type="text/html; " "charset=utf-8",
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/great_expectations/data_context/store/store_backend.py", line 129, in set
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     return self._set(key, value, **kwargs)
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/great_expectations/data_context/store/tuple_store_backend.py", line 854, in _set
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     value.encode(content_encoding), content_type=content_type
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2861, in upload_from_string
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     retry=retry,
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 2592, in upload_from_file
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     _raise_from_invalid_response(exc)
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 4464, in _raise_from_invalid_response
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     raise exceptions.from_http_status(response.status_code, message, response=response)
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO - google.api_core.exceptions.TooManyRequests: 429 POST https://storage.googleapis.com/upload/storage/v1/b/great_expectations_dev/o?uploadType=multipart: {
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   "error": {
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     "code": 429,
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     "message": "The rate of change requests to the object great_expectations_dev/great_expectations/data_docs/index.html exceeds the rate limit. Please reduce the rate of create, update, and delete requests.",
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     "errors": [
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -       {
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -         "message": "The rate of change requests to the object great_expectations_dev/great_expectations/data_docs/index.html exceeds the rate limit. Please reduce the rate of create, update, and delete requests.",
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -         "domain": "usageLimits",
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -         "reason": "rateLimitExceeded"
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -       }
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -     ]
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO -   }
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO - }
[2022-07-06, 23:11:40 UTC] {pod_manager.py:226} INFO - : ('Request failed with status code', 429, 'Expected one of', <HTTPStatus.OK: 200>)

viplazylmht avatar Jul 07 '22 02:07 viplazylmht

Hey @viplazylmht ! Thanks for raising this; we'll review internally and be in touch.

austiezr avatar Jul 07 '22 17:07 austiezr

Is this issue still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity.

It will be closed if no further activity occurs. Thank you for your contributions 🙇

github-actions[bot] avatar Aug 07 '22 02:08 github-actions[bot]

@viplazylmht is this still an issue? If not, please close. Thanks!

rdodev avatar Feb 13 '23 20:02 rdodev

Closed

viplazylmht avatar Feb 22 '23 03:02 viplazylmht