403 error on resource_copy

Open dvdokkum opened this issue 10 months ago • 5 comments

Describe the Bug

When running the process-events job in the Event Gather workflow, the resource copy task fails with a 403 error while trying to copy the video file.

Expected Behavior

The video file is publicly accessible (link in the logs below), so I would expect the script not to hit a 403.
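One way to narrow this down is to check whether the server answers differently depending on the request headers. Some hosts reject requests that lack a browser-like User-Agent, though nothing in the logs confirms that this is the cause here. The sketch below is only a diagnostic aid: it uses the Python standard library rather than the `requests` call inside `file_utils.resource_copy`, and the names `probe_status`, `compare_header_sets`, and `HEADER_SETS`, as well as the example User-Agent string, are hypothetical.

```python
import urllib.error
import urllib.request

# URL taken from the error logs in this issue.
VIDEO_URL = (
    "https://archive-video.granicus.com/chapelhill/"
    "chapelhill_bf9e87e2-e776-11ee-98bb-0050569183fa.mp4"
)

# Hypothetical header sets to compare. The browser-like value is only an
# example string, not something the server is known to require.
HEADER_SETS = {
    "default": {},
    "browser-like": {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"},
}


def probe_status(url, headers, opener=urllib.request.urlopen):
    """Send a GET request and return the HTTP status code without reading the body."""
    request = urllib.request.Request(url, headers=headers, method="GET")
    try:
        with opener(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx responses; the status code is what we compare.
        return error.code


def compare_header_sets(url, header_sets=HEADER_SETS, opener=urllib.request.urlopen):
    """Return a mapping of header-set name to the status code the server returned."""
    return {
        name: probe_status(url, headers, opener)
        for name, headers in header_sets.items()
    }
```

Calling `compare_header_sets(VIDEO_URL)` from the GitHub Actions runner and from a local machine would show whether the 403 depends on the headers, on where the request comes from, or on neither.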

Reproduction

This is a cookiecutter CDP instance using the Legistar chapelhill client. Link to the failed run logs: https://github.com/triangleblogblog/cdp-ch/actions/runs/8516893019/job/23326788209

Error logs:

[INFO: file_utils: 203 2024-04-02 03:30:10,750] Beginning resource copy from: https://archive-video.granicus.com/chapelhill/chapelhill_bf9e87e2-e776-11ee-98bb-0050569183fa.mp4
/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/urllib3/connectionpool.py:1061: InsecureRequestWarning: Unverified HTTPS request is being made to host 'archive-video.granicus.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(
[ERROR: file_utils: 263 2024-04-02 03:30:10,836] Something went wrong during resource copy. Attempted copy from: 'https://archive-video.granicus.com/chapelhill/chapelhill_bf9e87e2-e776-11ee-98bb-0050569183fa.mp4', resulted in error.
[ERROR: task_runner: 910 2024-04-02 03:30:10,836] Task 'resource_copy_task': Exception encountered during task execution!
Traceback (most recent call last):
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/prefect/engine/task_runner.py", line 880, in get_task_run_state
    value = prefect.utilities.executors.run_task_with_timeout(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/prefect/utilities/executors.py", line 468, in run_task_with_timeout
    return task.run(*args, **kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/pipeline/event_gather_pipeline.py", line 272, in resource_copy_task
    return file_utils.resource_copy(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/utils/file_utils.py", line 267, in resource_copy
    raise e
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/utils/file_utils.py", line 246, in resource_copy
    response.raise_for_status()
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://archive-video.granicus.com/chapelhill/chapelhill_bf9e87e2-e776-11ee-98bb-0050569183fa.mp4
[2024-04-02 03:30:10+0000] ERROR - prefect.TaskRunner | Task 'resource_copy_task': Exception encountered during task execution!
Traceback (most recent call last):
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/prefect/engine/task_runner.py", line 880, in get_task_run_state
    value = prefect.utilities.executors.run_task_with_timeout(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/prefect/utilities/executors.py", line 468, in run_task_with_timeout
    return task.run(*args, **kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/pipeline/event_gather_pipeline.py", line 272, in resource_copy_task
    return file_utils.resource_copy(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/utils/file_utils.py", line 267, in resource_copy
    raise e
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/cdp_backend/utils/file_utils.py", line 246, in resource_copy
    response.raise_for_status()
  File "/__w/_tool/Python/3.11.8/x64/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://archive-video.granicus.com/chapelhill/chapelhill_bf9e87e2-e776-11ee-98bb-0050569183fa.mp4
[INFO: task_runner: 335 2024-04-02 03:30:10,844] Task 'resource_copy_task': Finished task run for task with final state: 'Retrying'

Environment

cdp-backend Version: latest

dvdokkum, Apr 02 '24 04:04