Support for running confidential experiments

Open scholtzan opened this issue 2 years ago • 1 comments

For certain types of experiments, as much information as possible needs to be kept confidential, and/or the analysis needs access to confidential data. Jetstream currently does not support these types of experiments.

What considerations do we need to make on the jetstream side?

  • jetstream-config-private repository, for config files that encode confidential information
    • New config files will require a review before they can be merged
  • Config changes to indicate that an experiment is accessing or producing restricted data
  • Separate Airflow instance [?]
    • Might not be strictly necessary because analysis is run through Argo
    • Also unlikely to happen
  • Separate Argo instance
    • With service account having access to restricted data
    • Requires SRE input
  • Separate BigQuery location to write results to
    • Access managed through workgroups
    • Requires SRE input
  • Results and metadata must not be published to GCS or displayed in Experimenter/partybal
    • Visualizations/reports will need to be made manually, e.g. in Looker
    • Make datasets available in Looker
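
To make the config-flag and separate-BigQuery-location points above more concrete, here is a rough sketch of how a confidentiality flag could drive where results are written and whether they are exported. Everything in it is an assumption for illustration: the `is_private` key, the `Destination` type, and the project and dataset names are hypothetical and are not part of jetstream today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Destination:
    """Where analysis results are written and whether they are exported."""
    project: str
    dataset: str
    export_to_gcs: bool


# Hypothetical destinations; real project and dataset names would come from SRE.
PUBLIC = Destination("example-public-project", "experiment_results", True)
RESTRICTED = Destination(
    "example-restricted-project", "experiment_results_private", False
)


def resolve_destination(config: dict) -> Destination:
    """Pick a destination based on a hypothetical `is_private` config flag.

    A missing flag is treated as a regular (non-confidential) experiment.
    """
    experiment = config.get("experiment", {})
    return RESTRICTED if experiment.get("is_private", False) else PUBLIC


if __name__ == "__main__":
    confidential = {"experiment": {"is_private": True}}
    print(resolve_destination(confidential))
```

One possible variation would be to default to the restricted destination for any config that lives in jetstream-config-private, so that a forgotten flag cannot leak results.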

Short term workaround

  • Running jetstream locally and storing results in a private sandbox project

┆Issue is synchronized with this Jira Task

scholtzan, Jul 29 '22 16:07