Runtime: Support persistent storage for distributed workers
As we look toward separating ETL and serving, and running ETL on worker nodes, the runtime will need the ability to persist ETL state in GCS.
Requirements:
- Use a system-wide GCS connection for ETL state for all instances (to make configuration easier and to prevent corruption).
- Ensure complete isolation of data between instances
- Ability to track/limit usage per instance
- On local, ability to use a sub-directory of the `tmp` directory for ETL state instead of GCS
Tasks:
- [ ] Design proposal (one idea is to leverage `SystemConnectors` and configure it to a local file driver on local and GCS on cloud)