cdp-backend
403 error on resource_copy
### Describe the Bug
When running the `process-events` job in the Event Gather workflow, there's a 403 error when trying to copy the video file.
### Expected Behavior
The video file is publicly accessible (link in the logs below), so I would expect the script not to hit a 403.
### Reproduction
This is for a cookiecutter CDP instance for the Legistar `chapelhill` client. Link to the failed run logs: https://github.com/triangleblogblog/cdp-ch/actions/runs/8516893019/job/23326788209
Error logs:
```
[INFO: file_utils: 203 2024-04-02 03:30:10,750] Beginning resource copy from: https://archive-video.granicus.com/chapelhill/chapelhill_bf9e87e2-e776-11ee-98bb-0050569183fa.mp4
/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/urllib3/connectionpool.py:1061: InsecureRequestWarning: Unverified HTTPS request is being made to host 'archive-video.granicus.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(
[ERROR: file_utils: 263 2024-04-02 03:30:10,836] Something went wrong during resource copy. Attempted copy from: 'https://archive-video.granicus.com/chapelhill/chapelhill_bf9e87e2-e776-11ee-98bb-0050569183fa.mp4', resulted in error.
[ERROR: task_runner: 910 2024-04-02 03:30:10,836] Task 'resource_copy_task': Exception encountered during task execution!
Traceback (most recent call last):
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/prefect/engine/task_runner.py", line 880, in get_task_run_state
    value = prefect.utilities.executors.run_task_with_timeout(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/prefect/utilities/executors.py", line 468, in run_task_with_timeout
    return task.run(*args, **kwargs) # type: ignore
[2024-04-02 03:30:10+0000] ERROR - prefect.TaskRunner | Task 'resource_copy_task': Exception encountered during task execution!
Traceback (most recent call last):
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/prefect/engine/task_runner.py", line 880, in get_task_run_state
    value = prefect.utilities.executors.run_task_with_timeout(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/prefect/utilities/executors.py", line 468, in run_task_with_timeout
    return task.run(*args, **kwargs) # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/pipeline/event_gather_pipeline.py", line 272, in resource_copy_task
    return file_utils.resource_copy(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/utils/file_utils.py", line 267, in resource_copy
    raise e
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/utils/file_utils.py", line 246, in resource_copy
    response.raise_for_status()
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://archive-video.granicus.com/chapelhill/chapelhill_bf9e87e2-e776-11ee-98bb-0050569183fa.mp4
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/pipeline/event_gather_pipeline.py", line 272, in resource_copy_task
    return file_utils.resource_copy(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/utils/file_utils.py", line 267, in resource_copy
    raise e
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/utils/file_utils.py", line 246, in resource_copy
    response.raise_for_status()
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://archive-video.granicus.com/chapelhill/chapelhill_bf9e87e2-e776-11ee-98bb-0050569183fa.mp4
[INFO: task_runner: 335 2024-04-02 03:30:10,844] Task 'resource_copy_task': Finished task run for task with final state: 'Retrying'
```
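To help narrow down the 403 in the logs above, here is a minimal sketch (not part of cdp-backend) that asks for only the first byte of the file, once with the client's default User-Agent and once with a browser-like one. That the host filters on User-Agent is only a guess on my part, not something the logs confirm. `build_probe`, `probe_status`, and `compare_user_agents` are hypothetical helper names, and unlike the pipeline run (which logged an unverified-HTTPS warning) this uses the standard library with default certificate verification.

```python
import urllib.error
import urllib.request
from typing import Dict, Optional

# Assumption: an example browser-like value, chosen only for illustration.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def build_probe(url: str, user_agent: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request that asks for only the first byte of the resource."""
    headers = {"Range": "bytes=0-0"}
    if user_agent is not None:
        headers["User-Agent"] = user_agent
    return urllib.request.Request(url, headers=headers, method="GET")


def probe_status(request: urllib.request.Request, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code, including error codes."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as e:
        # 4xx/5xx responses are raised as HTTPError; report the code instead.
        return e.code


def compare_user_agents(url: str) -> Dict[str, int]:
    """Return the status seen with the default and the browser-like User-Agent."""
    return {
        "default": probe_status(build_probe(url)),
        "browser-like": probe_status(build_probe(url, BROWSER_UA)),
    }
```

Calling `compare_user_agents(...)` with the video URL from the logs, both locally and from inside the GitHub Actions runner, would show whether the status depends on the User-Agent or on where the request originates. If both calls return the same status everywhere, the User-Agent guess can be ruled out.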
### Environment
- cdp-backend Version: latest