
[FEAT] Added support for s3 presigned urls in api deployment

Open pk-zipstack opened this issue 7 months ago • 2 comments

What

  • Added support for presigned S3 URLs in API deployment

Why

  • This was requested by customers.

How

  • Added a function that fetches files from S3 presigned URLs and adds them to file_objs. The files are then processed using the existing base functionality.
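
The approach described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `FetchedFile` is a hypothetical stand-in for Django's `InMemoryUploadedFile`, and the `fetch` callable is injected (here as `fake_fetch`) so the example runs without network access.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FetchedFile:
    """Hypothetical stand-in for Django's InMemoryUploadedFile."""

    name: str
    content: bytes


def load_presigned_files(
    presigned_urls: List[str],
    file_objs: List[FetchedFile],
    fetch: Callable[[str], FetchedFile],
) -> None:
    """Fetch each presigned URL and append the result to file_objs in place."""
    for url in presigned_urls:
        file_objs.append(fetch(url))


def fake_fetch(url: str) -> FetchedFile:
    # Assumption: a real fetcher would issue an HTTPS GET here.
    # The file name is taken from the last path segment, ignoring the query.
    name = url.split("?", 1)[0].rsplit("/", 1)[-1]
    return FetchedFile(name=name, content=b"dummy")


# Directly uploaded files and presigned URLs end up in the same list,
# so the downstream processing does not need to distinguish them.
file_objs = [FetchedFile(name="direct.pdf", content=b"uploaded")]
load_presigned_files(
    ["https://bucket.s3.amazonaws.com/docs/a.pdf?X-Amz-Signature=abc"],
    file_objs,
    fake_fetch,
)
```

Because the fetched files are appended to the same list as direct uploads, the rest of the pipeline can stay unchanged.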

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

  • Yes. File validation could break, since this PR modifies the serializers. However, the change was tested locally and did not raise any issues.

Database Migrations

  • NO

Env Config

  • NO

Relevant Docs

Related Issues or PRs

  • https://zipstack.atlassian.net/browse/UN-2354

Dependencies Versions

Notes on Testing

  • Tested with a single URL and with multiple URLs.

Screenshots

Checklist

I have read and understood the Contribution Guidelines.

pk-zipstack, May 19 '25 04:05

Please retry analysis of this Pull-Request directly on SonarQube Cloud

sonarqubecloud[bot], May 19 '25 04:05

Summary by CodeRabbit

  • New Features

    • Support uploading files via presigned URLs in API executions, alongside or instead of direct uploads.
    • Cross-field validation ensures at least one file/URL is provided and respects upload limits.
  • Improvements

    • Clearer errors when fetching files from presigned URLs.
    • Stricter validation for API deployment names, uniqueness, and single active deployment per workflow.
    • Enhanced API key output with richer context.
  • Configuration

    • New setting to limit presigned URL file size (default 20 MB).
  • Chores

    • Removed deprecated connector fields from migrations and tests.
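
The cross-field validation listed in the summary (at least one file or URL, within the upload limit) might look something like the sketch below. It is written in plain Python so it runs on its own; `ValidationError` is a local stand-in for the Django REST Framework exception, and `MAX_ATTACHMENTS` and `validate_attachments` are hypothetical names with an illustrative limit.

```python
from typing import Sequence

MAX_ATTACHMENTS = 2  # assumed limit, for illustration only


class ValidationError(ValueError):
    """Local stand-in for rest_framework.serializers.ValidationError."""


def validate_attachments(
    files: Sequence[object],
    presigned_urls: Sequence[str],
    max_total: int = MAX_ATTACHMENTS,
) -> int:
    """Check the combined number of uploads and URLs; return the total."""
    total = len(files) + len(presigned_urls)
    if total == 0:
        raise ValidationError("Provide at least one file or presigned URL.")
    if total > max_total:
        raise ValidationError(
            f"Too many attachments: {total} given, at most {max_total} allowed."
        )
    return total
```

Checking the combined count at the serializer level, rather than per field, is what allows `files` to be empty when `presigned_urls` is supplied.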

Walkthrough

Adds presigned-URL support to deployment execution. The request serializer accepts presigned_urls and allows files to be empty. Before executing the workflow, the API view loads each presigned URL into an InMemoryUploadedFile via the new DeploymentHelper fetch/load functions, which enforce HTTPS, host, size, and timeout checks and raise PresignedURLFetchError on failure.
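
The HTTPS/host check and the size cap mentioned in the walkthrough can be illustrated with the standard library alone. This is a sketch under assumptions: the allowed host suffix, the function names, and the use of `ValueError` are all hypothetical (the PR raises `PresignedURLFetchError`), and 20 MB is interpreted here as 20 * 1024 * 1024 bytes.

```python
import io
from urllib.parse import urlparse

MAX_BYTES = 20 * 1024 * 1024  # assumed reading of the "20 MB" default
ALLOWED_HOST_SUFFIX = ".amazonaws.com"  # assumption, not taken from the PR


def check_presigned_url(url: str) -> None:
    """Reject URLs that are not HTTPS or not on the allowed host suffix."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError("Presigned URL must use HTTPS.")
    host = parsed.hostname or ""
    if not host.endswith(ALLOWED_HOST_SUFFIX):
        raise ValueError(f"Host not allowed: {host!r}")


def read_capped(stream, max_bytes: int = MAX_BYTES, chunk_size: int = 8192) -> bytes:
    """Read a stream in chunks and abort once it exceeds max_bytes."""
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf.extend(chunk)
        if len(buf) > max_bytes:
            raise ValueError("File exceeds the configured size limit.")
    return bytes(buf)


# Usage with an in-memory stream standing in for a streamed HTTP response.
check_presigned_url("https://bucket.s3.amazonaws.com/key?X-Amz-Signature=abc")
data = read_capped(io.BytesIO(b"x" * 10), max_bytes=10)
```

Reading in chunks means an oversized file is rejected shortly after the limit is crossed, instead of after the whole body has been buffered.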

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| API view update: backend/api_v2/api_deployment_views.py | POST handler reads presigned_urls (default []), treats files as optional, and calls DeploymentHelper.load_presigned_files(presigned_urls, file_objs) before executing the workflow. |
| Constants: backend/api_v2/constants.py | Added PRESIGNED_URLS = "presigned_urls" to ApiExecution. |
| Presigned-file helpers: backend/api_v2/deployment_helper.py | Added methods fetch_presigned_file(url) -> InMemoryUploadedFile and load_presigned_files(presigned_urls, file_objs). Streams remote files with timeouts, validates HTTPS/host, enforces per-file size cap (from settings, default 20 MB), derives filename/content-type, appends InMemoryUploadedFile to file_objs, and raises PresignedURLFetchError on errors. |
| Serializer changes: backend/api_v2/serializers.py | Added optional presigned_urls = ListField(child=URLField(), required=False), made files optional (required=False, allow_empty=True), removed per-field validate_files, and added serializer-level validate enforcing combined count (≥1 attachment and max total). Includes URL validation helpers for presigned URLs. |
| Exceptions: backend/api_v2/exceptions.py | Added PresignedURLFetchError(APIException) with constructors and classmethods (from_response_error, from_request_exception) to represent/translate fetch failures and status codes. |
| DB migration query: backend/migrating/v2/query.py | Removed four connector-related columns from tool_instance migration (input/output file/db connector ids); updated SELECT/INSERT column lists and placeholders. |
| Test fixtures: unstract/workflow-execution/tests/sample_instances.json | Removed per-instance connector definitions (input_file_connector, output_file_connector, input_db_connector, output_db_connector) from tool instances. |
| Tests / DTO usage: unstract/workflow-execution/tests/workflow_test.py | Tests updated to stop importing/using ConnectorInstance; ToolInstance constructions no longer include connector-related fields. |
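
The "derives filename/content-type" step listed for the presigned-file helpers could be approached as in the sketch below. The function name, the fallback file name, and the precedence of the response header over the guessed type are assumptions made for illustration; the PR's actual logic may differ.

```python
import mimetypes
import posixpath
from typing import Optional, Tuple
from urllib.parse import unquote, urlparse


def derive_name_and_type(
    url: str, header_content_type: Optional[str] = None
) -> Tuple[str, str]:
    """Derive a file name and content type from a presigned URL.

    The query string (which carries the signature) is ignored. The content
    type prefers the response header, then a guess from the file extension.
    """
    path = urlparse(url).path
    name = posixpath.basename(unquote(path)) or "downloaded_file"
    guessed, _ = mimetypes.guess_type(name)
    content_type = header_content_type or guessed or "application/octet-stream"
    return name, content_type


name, content_type = derive_name_and_type(
    "https://bucket.s3.amazonaws.com/docs/report.pdf?X-Amz-Signature=abc"
)
```

Falling back to `application/octet-stream` keeps the file usable even when neither the header nor the extension identifies the type.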

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant APIView as DeploymentExecution API View
    participant Serializer
    participant Helper as DeploymentHelper
    participant Remote as Presigned Host

    Client->>APIView: POST /deployment-execution (files?, presigned_urls?)
    APIView->>Serializer: validate(request data)
    alt validation fails
        Serializer-->>APIView: ValidationError
        APIView-->>Client: 4xx response
    else
        APIView->>Helper: load_presigned_files(presigned_urls, file_objs)
        alt fetch error (raises PresignedURLFetchError)
            Helper-->>APIView: PresignedURLFetchError
            APIView-->>Client: 4xx/5xx response
        else files loaded
            loop for each presigned URL
                Helper->>Remote: GET presigned_url (streamed, timeout, no redirects)
                Remote-->>Helper: bytes + headers
                Helper-->>Helper: build InMemoryUploadedFile, append to file_objs
            end
            APIView->>APIView: execute_workflow(file_objs, ...)
            APIView-->>Client: Execution response (2xx)
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


coderabbitai[bot], Jul 21 '25 11:07

| filepath | function | passed | SUBTOTAL |
| --- | --- | --- | --- |
| runner/src/unstract/runner/clients/test_docker.py | test_logs | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_cleanup | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_cleanup_skip | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_client_init | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_image_exists | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_image | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_container_run_config | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_container_run_config_without_mount | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_run_container | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_image_for_sidecar | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_sidecar_container | 1 | 1 |
| TOTAL | | 11 | 11 |

github-actions[bot], Aug 19 '25 15:08

Quality Gate failed

Failed conditions
15 Security Hotspots

See analysis details on SonarQube Cloud

sonarqubecloud[bot], Aug 20 '25 02:08