
Runtime: Support persistent storage for distributed workers

Open begelundmuller opened this issue 2 years ago • 0 comments

As we look toward separating ETL and serving, and running ETL on worker nodes, the runtime will need the ability to persist ETL state in GCS.

Requirements:

  • Use a system-wide GCS connection for ETL state for all instances (to make configuration easier and to prevent corruption).
  • Ensure complete isolation of data between instances.
  • Provide the ability to track and limit storage usage per instance.
  • When running locally, allow a sub-directory of the tmp directory to be used for ETL state instead of GCS.

Tasks:

  • [ ] Design proposal (one idea is to leverage SystemConnectors, configured to use a local file driver when running locally and GCS in the cloud)

begelundmuller, Jan 16 '24 12:01