
Design approach for storing SaaS-config parameterizations (e.g. external dataset references)

Open adamsachs opened this issue 3 years ago • 0 comments

Is your feature request related to a specific problem?

Implementation of dataset references (https://github.com/ethyca/fides/pull/1269) brought up questions about how we maintain and leverage use-case-specific parameterizations of generic SaaS configurations. The initial implementation essentially bolted some support onto existing constructs and our existing framework. We should think more holistically about a sustainable, general approach for maintaining and leveraging these parameterizations moving forward. Ideally, they should be cleanly separated from the generic pieces of the SaaS configuration, i.e. the pieces that apply to all use cases.

Describe the solution you'd like

  • A sustainable way to support use-case-specific parameterizations of certain SaaS configuration elements.
    • Users should be able to easily make these parameterizations for their SaaS config instance, whether through the UI or through the API.
    • It should be clear, at least to the SaaS framework, which aspects of a given SaaS config instance have been parameterized per use case, rather than being part of the generic SaaS template. Keeping this separation will help keep upgrade and update logic straightforward. It may also be helpful from a user perspective, so that users are clear on which pieces of their config are "safe" to touch, and which are meant to be generic/system-maintained.
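
To make the separation described above concrete, here is a minimal sketch of one possible shape for it. Everything in it is hypothetical and not part of fides: the class names `SaaSTemplate` and `SaaSConfigInstance`, the `param_slots` field, and the `upgrade` method are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, FrozenSet


@dataclass(frozen=True)
class SaaSTemplate:
    """Hypothetical generic, system-maintained part of a SaaS config."""

    connector_type: str
    version: str
    # Names of the elements that a user is allowed to parameterize.
    param_slots: FrozenSet[str]


@dataclass
class SaaSConfigInstance:
    """Hypothetical per-use-case instance: a template plus user parameterizations."""

    template: SaaSTemplate
    parameterizations: Dict[str, Any] = field(default_factory=dict)

    def set_param(self, name: str, value: Any) -> None:
        # Only declared slots are "safe" to touch; everything else is template-owned.
        if name not in self.template.param_slots:
            raise KeyError(f"'{name}' is not a parameterizable element")
        self.parameterizations[name] = value

    def upgrade(self, new_template: SaaSTemplate) -> "SaaSConfigInstance":
        # Swap the generic part; carry over only parameterizations the new
        # template still declares.
        kept = {
            name: value
            for name, value in self.parameterizations.items()
            if name in new_template.param_slots
        }
        return SaaSConfigInstance(template=new_template, parameterizations=kept)
```

Because the user-supplied values live in their own mapping, a template upgrade in this sketch never has to diff or merge a user-edited config; it only re-validates the slot names.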

Describe alternatives you've considered, if any

  • Injecting the external dataset reference (or any new parameterized info that comes up) at connector instantiation time, similar to how we already parameterize the instance_fides_key for SaaS configs.

Additional context

Our initial implementation for supporting parameterized "external dataset references" in https://github.com/ethyca/fides/pull/1269 introduced some tech debt into our core SaaS execution workflow. Namely, references in ParamValues can now either be actual dataset references, or point to a parameterized pointer that is dereferenced from the connector's secrets at execution time. Having to account for both of those possibilities has led to some suboptimal code, as described in @pattisdr's comment on the PR. We decided to move forward with this approach for now because the functionality is time-sensitive, and refactoring would have been more of an endeavor than we could afford at the time.
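
The two-way branching described above can be illustrated with a small sketch. This is not the code from the PR: the function name `resolve_reference`, the `DatasetRef` shape, and the convention that a plain string names a key in the connector's secrets are all assumptions chosen for illustration.

```python
from typing import Dict, Union

# Hypothetical shape of a concrete dataset reference.
DatasetRef = Dict[str, str]  # e.g. {"dataset": "...", "field": "..."}


def resolve_reference(
    reference: Union[str, DatasetRef],
    secrets: Dict[str, DatasetRef],
) -> DatasetRef:
    """Return a concrete dataset reference for a ParamValue reference.

    Assumption: a string is a pointer to be looked up in the connector's
    secrets, while a dict is already a concrete dataset reference.
    """
    if isinstance(reference, str):
        # Pointer case: dereference from the secrets at execution time.
        if reference not in secrets:
            raise ValueError(f"No external reference named '{reference}' in secrets")
        return secrets[reference]
    # Direct case: the reference is usable as-is.
    return reference
```

Every piece of execution code that consumes a reference has to go through a branch like this one, which is the kind of dual handling the tech debt refers to.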

An alternative was proposed in the above comment to inject the parameterization at the time of connector instantiation. That would eliminate some of the tech debt introduced by the approach we chose, but it comes with some additional concerns, as described in @adamsachs's response comment.
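
As a rough illustration of the instantiation-time alternative, the sketch below substitutes parameter values into a config once, up front, so that later execution code only ever sees concrete values. The `<name>` placeholder syntax, the function name `instantiate_config`, and the dict-based config shape are assumptions for this example, not the existing fides mechanism.

```python
import copy
from typing import Any, Dict


def instantiate_config(template: Dict[str, Any], params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `template` with "<name>" placeholders replaced.

    Assumption: a string value of the exact form "<name>" is a placeholder
    for params["name"]; all other values are copied through unchanged.
    """

    def substitute(node: Any) -> Any:
        if isinstance(node, dict):
            return {key: substitute(value) for key, value in node.items()}
        if isinstance(node, list):
            return [substitute(item) for item in node]
        if isinstance(node, str) and len(node) > 2 and node[0] == "<" and node[-1] == ">":
            name = node[1:-1]
            if name not in params:
                raise KeyError(f"No value supplied for placeholder '{name}'")
            # Deep-copy so the instance does not share state with `params`.
            return copy.deepcopy(params[name])
        return node

    return substitute(template)
```

With this shape the execution workflow needs no pointer-versus-reference branch, but the instantiated config no longer records which values came from the user, which is one form the concerns about this approach could take.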

Here, we should evaluate the pros and cons of these approaches and consider additional ones. The scope should not be limited to "external dataset references": other pieces of the SaaS config framework may also need to be parameterized per use case moving forward.

adamsachs • Oct 12 '22 17:10