Cloud Runner Improvements - S3 Locking, Aws Local Stack (Pipelines), Testing Improvements, Rclone storage support, Provider plugin system

Open frostebite opened this issue 4 months ago • 3 comments

Changes

  • New Features
    • Support for LocalStack (local AWS emulator) and service-specific AWS endpoints.
    • Experimental rclone storage provider with built-in cache/build hooks.
    • Windows execution path for local runs; Kubernetes endpoint normalization.
    • Experimental plugin system for loading Cloud Runner providers from an executable file or a git repository.
  • Improvements
    • More robust AWS interactions (retry/backoff), logging, caching, and repo checkout.
    • Container-aware workflow behavior and consistent environment handling.
  • Documentation
    • Added guides for LocalStack usage and rclone storage.
  • Chores
    • New integrity workflows; removed legacy Cloud Runner pipeline.
    • CI uses dedicated Jest config and script.
    • Lint configuration update.

Related Issues

  • ...

Related PRs

  • ...

Successful Workflow Run Link

PRs don't have access to secrets so you will need to provide a link to a successful run of the workflows from your own repo.

  • ...

Checklist

  • [x] Read the contribution guide and accept the code of conduct
  • [ ] Docs (If new inputs or outputs have been added or changes to behavior that should be documented. Please make a PR in the documentation repo)
  • [ ] Readme (updated or not needed)
  • [ ] Tests (added, updated or not needed)

Summary by CodeRabbit

  • New Features

    • Plugin-style provider loading with dynamic providers, local caching, and improved storage/cache hooks (rclone support); Docker/Windows host handling enhancements.
  • Bug Fixes / Reliability

    • LocalStack-aware routing, retry/backoff for cloud calls, backend-agnostic workspace locking, proactive disk-space safeguards, improved pod diagnostics, and more robust multi-channel log capture.
  • Tests

    • CI-focused Jest config with fetch polyfill, environment-gated unit/e2e tests, and explicit BuildSucceeded boolean in run results.
  • Documentation

    • Provider loader README added.
  • Chores

    • CI/workflow reorganization, TypeScript include update, new CI script and dev deps.

frostebite avatar Sep 08 '25 14:09 frostebite

[!NOTE]

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Add dynamic provider loading (local/GitHub/npm) with caching and validation; centralize AWS client creation and endpoint overrides with LocalStack detection; migrate AWS usages to a factory with retries; replace filesystem locking with S3/rclone backends; harden runners, hooks, tests, and CI/Jest workflows.
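
The endpoint override and lazy-client behaviour summarized above can be illustrated with a minimal sketch. It assumes per-service endpoints take precedence over a global one and that LocalStack is recognized by its default edge port (4566) or by hostname. Every name here (`EndpointConfig`, `resolveEndpoint`, `isLocalStackEndpoint`, `LazyClientCache`) is hypothetical and is not taken from the PR's `AwsClientFactory`.

```typescript
// Hypothetical sketch; names and precedence rules are assumptions, not the PR's code.
type AwsService = "s3" | "ecs" | "cloudformation" | "kinesis" | "logs";

interface EndpointConfig {
  // Fallback endpoint applied to every service (assumed option).
  globalEndpoint?: string;
  // Per-service overrides, assumed to win over the global endpoint.
  serviceEndpoints?: Partial<Record<AwsService, string>>;
}

// Pick the endpoint for one service: per-service override first, then global.
function resolveEndpoint(service: AwsService, config: EndpointConfig): string | undefined {
  return config.serviceEndpoints?.[service] ?? config.globalEndpoint;
}

// Heuristic LocalStack check: default edge port 4566 or "localstack" in the hostname.
function isLocalStackEndpoint(endpoint: string | undefined): boolean {
  if (!endpoint) return false;
  try {
    const url = new URL(endpoint);
    return url.port === "4566" || url.hostname.includes("localstack");
  } catch {
    return false; // not a parseable URL
  }
}

// Lazy singleton cache: each client is built on first request and reused afterwards.
class LazyClientCache<T> {
  private readonly clients = new Map<AwsService, T>();

  constructor(
    private readonly create: (service: AwsService, endpoint?: string) => T,
    private readonly config: EndpointConfig,
  ) {}

  get(service: AwsService): T {
    let client = this.clients.get(service);
    if (client === undefined) {
      client = this.create(service, resolveEndpoint(service, this.config));
      this.clients.set(service, client);
    }
    return client;
  }
}
```

In a real factory the `create` callback would construct the SDK client for the requested service; here it is left as a parameter so the sketch stays free of SDK dependencies.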

Changes


| Cohort | File(s) | Summary |
|---|---|---|
| Provider loader & parser | `src/model/cloud-runner/providers/provider-loader.ts`, `src/model/cloud-runner/providers/provider-url-parser.ts`, `src/model/cloud-runner/providers/provider-git-manager.ts`, `src/model/cloud-runner/providers/README.md` | Add dynamic provider loading (GitHub/local/npm), URL parsing, git-backed provider cache/clone/update manager, entry-point resolution, cache cleanup, interface validation and docs. |
| Provider tests & fixture | `src/model/cloud-runner/tests/providers/*`, `src/model/cloud-runner/tests/fixtures/invalid-provider.ts` | New unit tests for provider-loader, provider-git-manager, provider-url-parser and an invalid-provider fixture. |
| AWS client factory & wiring | `src/model/cloud-runner/providers/aws/aws-client-factory.ts`, `src/model/build-parameters.ts`, `src/model/cloud-runner/options/cloud-runner-options.ts` | Introduce AwsClientFactory with lazy singleton clients per service, per-service endpoint configuration, credential resolution, and extend BuildParameters/CloudRunnerOptions with AWS endpoint and storage fields. |
| AWS provider refactors | `src/model/cloud-runner/providers/aws/*`, `src/model/cloud-runner/providers/aws/services/*`, `src/model/cloud-runner/providers/aws/index.ts`, `src/model/cloud-runner/providers/aws/aws-task-runner.ts` | Replace inline AWS client instantiation with AwsClientFactory, add retries/backoff/throttling handling, adjust task-runner/service flows and logging, and remove static client fields. |
| Shared workspace & storage backends | `src/model/cloud-runner/services/core/shared-workspace-locking.ts`, `src/model/cloud-runner/services/hooks/container-hook-service.ts` | Replace FS locking with S3/rclone backends, add ensureBucketExists/listObjects abstractions, rclone flows, and conditional tooling/credential gating in hooks. |
| CloudRunner core & exports | `src/model/cloud-runner/cloud-runner.ts`, `src/model/index.ts` | Aggregate AWS endpoints, detect LocalStack and remap provider to local-docker when appropriate, call dynamic loader, and re-export loader APIs. |
| Kubernetes & Docker adjustments | `src/model/cloud-runner/providers/k8s/kubernetes-job-spec-factory.ts`, `src/model/cloud-runner/providers/k8s/kubernetes-pods.ts`, `src/model/cloud-runner/providers/docker/index.ts`, `src/model/docker.ts` | Remap localhost endpoints for k3d/docker, add terminationGracePeriodSeconds, improve pod diagnostics, reduce ephemeral storage request, adjust cache copying, and inject host.docker.internal when required. |
| Local runner Windows handling | `src/model/cloud-runner/providers/local/index.ts`, `types/shell-quote.d.ts` | Add Windows-specific command sanitization/quoting using shell-quote and provide typings. |
| Remote client robustness & logging | `src/model/cloud-runner/remote-client/*`, `src/model/cloud-runner/remote-client/remote-client-logger.ts` | Harden git fetch/checkout/LFS, emit activation markers, guard cache pushes, make log paths Windows-safe, mirror k8s stdout, and improve multi-channel post-job log handling. |
| Log streaming & task params | `src/model/cloud-runner/services/core/follow-log-stream-service.ts`, `src/model/cloud-runner/services/core/task-parameter-serializer.ts` | Always append log lines for tests and inject selected AWS env vars into task environments. |
| Workflows & build automation | `src/model/cloud-runner/workflows/*`, `src/model/cloud-runner/workflows/build-automation-workflow.ts`, `src/model/cloud-runner/workflows/async-workflow.ts` | Container-aware builder paths, resilient branch-aware cloning, conditional toolchain steps, expanded logging/workspace handling; helper signatures updated to accept container flag. |
| Jest & CI test config | `jest.ci.config.js`, `jest.config.js`, `jest.setup.js`, `.eslintrc.json`, `package.json` | Consolidate test setup to jest.setup.js (node-fetch polyfill), add CI Jest config and test:ci script, add shell-quote/node-fetch deps, and ESLint override for jest.setup.js. |
| Integration & e2e tests | `src/model/cloud-runner/tests/**/*.test.ts` | Add environment-gated integration tests, switch to boolean BuildSucceeded assertions, enable cloudRunnerDebug in tests, add gating/cleanup logic for S3/rclone/local stacks. |
| CI workflows & maintenance | `.github/workflows/*` | Remove legacy monolithic CI pipeline; add callable integrity workflows (k8s/localstack), update integrity-check to call them, and comment out/adjust AWS-related steps. |
| Container hook runtime & step failure control | `src/model/cloud-runner/services/hooks/container-hook.ts`, `src/model/cloud-runner/workflows/custom-workflow.ts` | Add allowFailure flag to ContainerHook and per-step try/catch to continue when allowed. |
| Misc. utilities & fixes | `src/model/github.ts`, `src/model/image-environment-factory.ts`, `src/model/versioning.test.ts`, `tsconfig.json` | Wire node-fetch into Octokit, update Actions dispatch calls, unify env var handling and dedupe additionalVariables, skip grep test on Windows, and expand TS includes. |
| Lint tweaks & small changes | `.eslintrc.json`, several `src/model/cloud-runner/providers/aws/*.ts` | ESLint override for jest.setup.js and inline eslint-disable comments in AWS files. |
| Large removals | `.github/workflows/cloud-runner-ci-pipeline.yml` | Removed legacy monolithic Cloud Runner CI pipeline workflow file. |
| Disk-space aware caching & log fallbacks | `src/model/cloud-runner/remote-client/caching.ts`, `src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts` | Add disk-space preflight, cleanup/retry logic for tar creation, and kubectl logs fallback/read-file strategies. |
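
To make the "Provider loader & parser" row more concrete, here is a rough sketch of how a provider source string might be classified as GitHub, local, or npm. The accepted formats, the default ref of `main`, and the names `ProviderSource` and `parseProviderSource` are all assumptions for illustration; the actual grammar in `provider-url-parser.ts` may differ.

```typescript
// Hypothetical classification of a provider source string; not the PR's parser.
type ProviderSource =
  | { kind: "github"; owner: string; repo: string; ref: string }
  | { kind: "local"; path: string }
  | { kind: "npm"; name: string };

function parseProviderSource(source: string): ProviderSource {
  const trimmed = source.trim();

  // Full GitHub URL, optionally ending in ".git" and/or "/tree/<ref>".
  const url = /^https?:\/\/github\.com\/([^/\s]+)\/([^/\s]+?)(?:\.git)?(?:\/tree\/([^\s]+))?\/?$/.exec(trimmed);
  if (url) {
    return { kind: "github", owner: url[1], repo: url[2], ref: url[3] ?? "main" };
  }

  // Relative or absolute filesystem path (POSIX, or a Windows drive letter).
  if (/^(\.{1,2}[\\/]|\/|[A-Za-z]:[\\/])/.test(trimmed)) {
    return { kind: "local", path: trimmed };
  }

  // "owner/repo" shorthand with an optional "@ref" suffix (assumed convention).
  const short = /^([\w.-]+)\/([\w.-]+)(?:@([\w./-]+))?$/.exec(trimmed);
  if (short) {
    return { kind: "github", owner: short[1], repo: short[2], ref: short[3] ?? "main" };
  }

  // Anything else, including scoped names like "@scope/pkg", is treated as an npm package.
  return { kind: "npm", name: trimmed };
}
```

Returning a discriminated union lets the loader switch on `kind` and hand only GitHub sources to a git manager, which matches the flow in the walkthrough while keeping the parsing step free of side effects.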

Sequence Diagram(s)

sequenceDiagram
    participant Caller as CloudRunner.setup()
    participant CR as CloudRunner
    participant Loader as ProviderLoader
    participant GitMgr as ProviderGitManager
    participant Provider as Provider
    participant Factory as AwsClientFactory

    Caller->>CR: setup(buildParameters)
    CR->>CR: aggregate endpoints, detect LocalStack
    alt LocalStack detected
        CR->>CR: remap provider -> local-docker
    end
    CR->>Loader: loadProvider(providerSource, buildParameters)
    Loader->>Loader: parseProviderSource()
    alt GitHub source
        Loader->>GitMgr: ensureRepositoryAvailable(urlInfo)
        GitMgr-->>Loader: localPath
        Loader->>Loader: resolve provider module path
    end
    Loader->>Loader: dynamic import & validate ProviderInterface
    opt provider uses AWS
        Loader->>Factory: request AWS clients (per-service endpoints)
        Factory-->>Loader: client instances
    end
    Loader->>Provider: instantiate Provider(buildParameters)
    Loader-->>CR: provider instance
    CR->>Provider: setupWorkflow()

sequenceDiagram
    participant Build as Build Job
    participant Lock as SharedWorkspaceLocking
    participant S3 as AWS S3
    participant Rclone as Rclone Remote

    Build->>Lock: CreateWorkspace(cacheKey)
    alt storageProvider = s3
        Lock->>S3: HeadBucketCommand
        alt bucket missing
            Lock->>S3: CreateBucketCommand
        end
        Lock->>S3: PutObjectCommand (workspace entry)
    else storageProvider = rclone
        Lock->>Rclone: rclone touch (workspace entry)
    end
    Lock-->>Build: workspace ready
    Build->>Lock: ListObjects(prefix)
    alt s3 backend
        Lock->>S3: ListObjectsV2Command
        S3-->>Lock: object list
    else rclone backend
        Lock->>Rclone: rclone lsf
        Rclone-->>Lock: file list
    end
    Lock-->>Build: contents
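
The backend-agnostic locking flow in the second diagram can be sketched with a small storage interface. This is an illustration only: the interface, the `locks/<cacheKey>/<name>_workspace` key layout, and the in-memory backend are hypothetical, and the methods are synchronous for brevity, whereas real S3 or rclone calls would be asynchronous.

```typescript
// Hypothetical storage abstraction; an S3 or rclone backend would implement the same shape.
interface LockStore {
  ensureContainer(): void;         // e.g. HeadBucket + CreateBucket, or a no-op for rclone
  put(key: string): void;          // e.g. PutObject, or "rclone touch"
  list(prefix: string): string[];  // e.g. ListObjectsV2, or "rclone lsf"
}

// In-memory stand-in so the flow can be exercised without any cloud service.
class InMemoryLockStore implements LockStore {
  private created = false;
  private readonly keys = new Set<string>();

  ensureContainer(): void {
    this.created = true; // idempotent: calling it repeatedly is harmless
  }

  put(key: string): void {
    if (!this.created) throw new Error("container does not exist");
    this.keys.add(key);
  }

  list(prefix: string): string[] {
    return Array.from(this.keys)
      .filter((key) => key.startsWith(prefix))
      .sort();
  }
}

// Assumed key layout: locks/<cacheKey>/<workspace>_workspace
function createWorkspace(store: LockStore, cacheKey: string, workspace: string): void {
  store.ensureContainer();
  store.put(`locks/${cacheKey}/${workspace}_workspace`);
}

// List workspace entries for a cache key, with the shared prefix stripped.
function listWorkspaces(store: LockStore, cacheKey: string): string[] {
  const prefix = `locks/${cacheKey}/`;
  return store.list(prefix).map((key) => key.slice(prefix.length));
}
```

Because callers only see `LockStore`, parity between the S3 and rclone paths comes down to both backends honouring the same key conventions, which is one of the review areas flagged for this PR.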

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Files/areas needing extra attention:

  • Provider loader, provider-url-parser, ProviderGitManager (regex parsing, dynamic import safety, cache lifecycle).
  • AwsClientFactory and AWS provider changes (endpoint precedence, credential resolution, retry/backoff, throttling).
  • SharedWorkspaceLocking and container-hook-service (S3 vs rclone parity, ensureBucketExists, key conventions).
  • AWSTaskRunner / TaskService (retry logic, type generalization, Kinesis/ECS interactions).
  • Remote-client caching disk-space logic and Kubernetes/docker log-stream fallbacks and recovery flows.
  • Integration tests and CI workflow changes that gate execution and change BuildSucceeded semantics.
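
As a reference point when reviewing the retry/backoff changes, the following sketch shows one common shape for capped exponential backoff around a cloud call. It is not the PR's implementation: the helper names, the default delays, the attempt limit, and the choice of which error names count as throttling are assumptions, and no jitter is applied.

```typescript
// Capped exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... up to capMs.
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 5000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Assumed throttling check based on the error's `name` field.
function isThrottlingError(error: unknown): boolean {
  const name = (error as { name?: string } | null)?.name;
  return name === "ThrottlingException" || name === "TooManyRequestsException";
}

// Retry an async operation while the error is retryable and attempts remain.
async function withRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean = isThrottlingError,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on the last attempt or on errors that are not worth retrying.
      if (attempt + 1 >= maxAttempts || !isRetryable(error)) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Passing `sleep` as a parameter makes the retry loop testable without real delays, which is relevant given the low patch coverage reported for the AWS task runner and task service.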

Possibly related PRs

  • game-ci/unity-builder#698 — overlapping AWS provider/client refactors and aws-sdk-related changes.
  • game-ci/unity-builder#724 — related Jest fetch polyfill and CI/test adjustments.

Suggested labels

codex

Suggested reviewers

  • webbertakken
  • AndrewKahr
  • cloudymax

Poem

🐇 I hopped through branches, cloned each provider name,
Cached them snug, and fetched the clients all the same,
Locks now whisper S3 or rclone in the night,
Tests run with fetch and workflows called just right,
A cheerful rabbit nods — the repo hums bright.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main changes: Cloud Runner improvements including S3 locking, LocalStack support, testing improvements, rclone storage, and a provider plugin system. |
| Description check | ✅ Passed | The description follows the template with sections for Changes, Related Issues, Related PRs, Successful Workflow Run Link, and Checklist. Changes are well-organized with subsections for features, improvements, documentation, and chores. |
coderabbitai[bot] avatar Sep 08 '25 14:09 coderabbitai[bot]

Cat Gif

github-actions[bot] avatar Sep 08 '25 14:09 github-actions[bot]

Codecov Report

:x: Patch coverage is 16.68122% with 954 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 33.38%. Comparing base (0c82a58) to head (eee8b4c).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...oud-runner/providers/k8s/kubernetes-task-runner.ts | 0.00% | 181 Missing and 20 partials :warning: |
| src/model/cloud-runner/remote-client/index.ts | 0.00% | 135 Missing and 12 partials :warning: |
| src/model/cloud-runner/remote-client/caching.ts | 0.00% | 112 Missing and 12 partials :warning: |
| ...del/cloud-runner/providers/provider-git-manager.ts | 9.90% | 81 Missing and 10 partials :warning: |
| ...d-runner/services/core/shared-workspace-locking.ts | 6.81% | 77 Missing and 5 partials :warning: |
| ...odel/cloud-runner/providers/k8s/kubernetes-pods.ts | 0.00% | 67 Missing and 7 partials :warning: |
| ...odel/cloud-runner/providers/aws/aws-task-runner.ts | 6.81% | 38 Missing and 3 partials :warning: |
| ...loud-runner/providers/aws/services/task-service.ts | 13.88% | 29 Missing and 2 partials :warning: |
| ...loud-runner/workflows/build-automation-workflow.ts | 0.00% | 28 Missing and 3 partials :warning: |
| ...l/cloud-runner/providers/aws/aws-client-factory.ts | 23.52% | 21 Missing and 5 partials :warning: |

... and 16 more
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #731      +/-   ##
==========================================
- Coverage   38.37%   33.38%   -4.99%     
==========================================
  Files          78       83       +5     
  Lines        3171     4151     +980     
  Branches      665      965     +300     
==========================================
+ Hits         1217     1386     +169     
- Misses       1809     2538     +729     
- Partials      145      227      +82     

| Files with missing lines | Coverage Δ |
|---|---|
| src/model/build-parameters.ts | 90.00% <ø> (ø) |
| ...model/cloud-runner/options/cloud-runner-options.ts | 91.42% <100.00%> (+1.10%) :arrow_up: |
| ...model/cloud-runner/providers/aws/aws-base-stack.ts | 10.71% <ø> (ø) |
| .../model/cloud-runner/providers/aws/aws-job-stack.ts | 13.63% <ø> (ø) |
| ...el/cloud-runner/tests/fixtures/invalid-provider.ts | 100.00% <100.00%> (ø) |
| src/model/cloud-runner/workflows/async-workflow.ts | 27.77% <ø> (ø) |
| src/model/index.ts | 100.00% <100.00%> (ø) |
| ...odel/cloud-runner/providers/provider-url-parser.ts | 97.87% <97.87%> (ø) |
| ...-runner/services/core/follow-log-stream-service.ts | 22.72% <0.00%> (+0.50%) :arrow_up: |
| ...-runner/services/core/task-parameter-serializer.ts | 73.80% <66.66%> (-0.55%) :arrow_down: |

... and 23 more
codecov[bot] avatar Sep 09 '25 20:09 codecov[bot]