jetstream
Support for running confidential experiments
Some types of experiments require that as much information as possible is kept confidential, or that the analysis accesses confidential data, or both. Jetstream currently does not support these types of experiments.
What considerations do we need to make on the jetstream side?
- jetstream-config-private repository, for config files that encode confidential information
  - New config files will require a review before they can be merged
- Config changes to indicate that an experiment is accessing or producing restricted data
- Separate Airflow instance [?]
  - Might not be strictly necessary because the analysis is run through Argo
  - Also unlikely to happen
- Separate Argo instance
  - With a service account that has access to restricted data
  - Requires SRE input
- Separate BigQuery location to write results to
  - Access managed through workgroups
  - Requires SRE input
- Results and metadata are not published to GCS and not displayed in Experimenter/partybal
  - Visualizations/reports will need to be made manually, e.g. in Looker
  - Make datasets available in Looker
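
To make the considerations above more concrete, here is a minimal sketch of how a per-experiment flag could drive where results go and whether anything is published. It is not Jetstream's actual API: the `restricted` flag, the project and dataset names, and the `plan_outputs` helper are all hypothetical and only illustrate the routing idea.

```python
from dataclasses import dataclass

# All names below are hypothetical placeholders, not real Jetstream settings.
DEFAULT_PROJECT = "example-analysis-project"
DEFAULT_DATASET = "example_results"
RESTRICTED_PROJECT = "example-restricted-project"
RESTRICTED_DATASET = "example_restricted_results"


@dataclass(frozen=True)
class ExperimentConfig:
    """Subset of an experiment config relevant to confidentiality."""

    slug: str
    # Hypothetical flag: the experiment accesses or produces restricted data.
    restricted: bool = False


@dataclass(frozen=True)
class OutputPlan:
    """Where results are written and which publishing steps run."""

    bigquery_project: str
    bigquery_dataset: str
    export_to_gcs: bool
    show_in_experimenter: bool


def plan_outputs(config: ExperimentConfig) -> OutputPlan:
    """Route restricted experiments to a separate location and skip publishing."""
    if config.restricted:
        return OutputPlan(
            bigquery_project=RESTRICTED_PROJECT,
            bigquery_dataset=RESTRICTED_DATASET,
            export_to_gcs=False,
            show_in_experimenter=False,
        )
    return OutputPlan(
        bigquery_project=DEFAULT_PROJECT,
        bigquery_dataset=DEFAULT_DATASET,
        export_to_gcs=True,
        show_in_experimenter=True,
    )


if __name__ == "__main__":
    print(plan_outputs(ExperimentConfig(slug="example-confidential", restricted=True)))
    print(plan_outputs(ExperimentConfig(slug="example-regular")))
```

Keeping the decision in one function would mean the export and display steps only need to check the plan, rather than each re-deriving whether an experiment is restricted.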
Short-term workaround
- Running jetstream locally and storing results in a private sandbox project