
Alembic Integration

Open vtumbleshack opened this issue 3 years ago • 2 comments

Is your feature request related to a specific problem?

Our team goes through the following process when changing or creating new db objects:

  1. Change the SQLAlchemy files and the associated application code
  2. Test changes locally by using Alembic to create the db migration files
  3. Push the changes into a PR & merge

Fides adds another command that developers must execute to generate the annotations yaml file.

Describe the solution you'd like

If Fides integrated with Alembic, one command could generate both the db migration files and generate or edit the privacy annotation templates.

Describe alternatives you've considered, if any

We're considering writing our own alembic wrapper which also runs Fides.

Additional context

Before deploying Fides, we're concerned developers will simply forget the privacy annotation step. We can add a reminder to our CI pipeline, but that's an extra feedback cycle (slows dev pace). If template generation happened automatically with db migration generation, developers would see the empty file when choosing which files to stage.

vtumbleshack avatar Aug 09 '22 20:08 vtumbleshack

Hi @tumblekada , thank you for the feature suggestion!

I wrote up a design doc for this a while ago, but instead of wrapping around Alembic specifically, I assumed it would ingest the SQLAlchemy models directly. It sounds like you want it to wrap Alembic so that only one command is needed?

Can I ask you what the general flow is for your usage of Fides?

In this project (it's very meta, I know) we use the fides scan db command to automatically catch when a developer has made a database update but forgotten to update the manifest files. Like you said though, at that point it would still require running an extra fides generate command to get it updated and passing.

ThomasLaPiana avatar Aug 11 '22 03:08 ThomasLaPiana

Hi @ThomasLaPiana , thanks for the quick reply.

Yes, there are two features here.

  1. Optionally reading from SQLAlchemy instead of the database (would be awesome!)
  2. Removing the burden of an extra fides command.

Unless Fides reads from Alchemy, it would have to wait for the migrations to actually execute.

As far as how this interface looks, I can think of a few strategies:

  • Fides adds a config where executables can hook into the various events (much like git hooks)
  • Fides forks Alembic and integrates with Alembic's internal tracking of SQL changes
  • Fides adds a CLI flag to run alembic under the hood (and associated arguments which get passed to alembic)
  • A new CLI executable wraps both Fides (with Alchemy support) and Alembic
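
The last strategy in this list, a separate executable that runs both tools, could be sketched roughly as below. This is only a minimal illustration, not anything Fides or Alembic ships: `build_commands` and `run_all` are hypothetical helper names, and the Fides command is passed in as a string by the caller so that the sketch does not assume any particular Fides subcommand or flags. The only real CLI invocation it hard-codes is `alembic revision --autogenerate -m <message>`.

```python
import shlex
import subprocess


def build_commands(message, fides_command):
    """Return the argv lists to run, in order: Alembic first, then Fides."""
    alembic = ["alembic", "revision", "--autogenerate", "-m", message]
    # fides_command is supplied by the caller (assumption: whatever Fides
    # generation command the team already runs by hand).
    return [alembic, shlex.split(fides_command)]


def run_all(commands, runner=subprocess.call):
    """Run each command in order and stop at the first non-zero exit code."""
    for argv in commands:
        code = runner(argv)
        if code != 0:
            # Skip the annotation step if migration generation failed.
            return code
    return 0
```

A team script would call `run_all(build_commands(message, fides_command))` and exit with the returned code. Note that without SQLAlchemy support in Fides, the Fides step would still read from the database, so it would only see the new objects after the migration has been applied.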

Our team does not use Fides yet; this idea came up while we were evaluating how much developer buy-in (or lack thereof) to expect if we do deploy Fides. We think that giving developers another command to run as part of their workflow will reduce participation.

vtumbleshack avatar Aug 11 '22 19:08 vtumbleshack

Thank you for the clarification @tumblekada!

For option 1, I'm in total agreement. I think ingesting the SQLAlchemy models directly is a great idea and one we're going to get on the roadmap!

For option 2, I understand the pain here. Your mention of git hooks got me thinking: maybe adding a git hook to do the generation would ease the engineers' burden? I see the other 3 options as relatively heavy, requiring a deeper (and therefore more fragile) integration between the two tools.
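
As a rough illustration of the git hook idea, a pre-commit hook written in Python might look something like the sketch below. It only detects the "models changed, manifest not staged" case and leaves the actual generation command to the team. The directory names `app/models/` and `.fides/` and the function names are assumptions made for the example, not project conventions; `git diff --cached --name-only` is the standard way to list staged paths.

```python
import subprocess


def staged_files():
    """List the paths staged for the current commit."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def needs_generation(paths, models_dir="app/models/", manifest_dir=".fides/"):
    """True when model files are staged but no manifest file is.

    Both directory defaults are hypothetical; adjust them to the repo layout.
    """
    touched_models = any(p.startswith(models_dir) for p in paths)
    touched_manifest = any(p.startswith(manifest_dir) for p in paths)
    return touched_models and not touched_manifest


def hook():
    """Return a non-zero code to block the commit when annotations are missing."""
    if needs_generation(staged_files()):
        print("Model files changed but no Fides manifest is staged.")
        return 1
    return 0
```

Saved as `.git/hooks/pre-commit` with a call to `sys.exit(hook())` at the bottom, this would stop the commit locally instead of waiting for CI; the same check could instead trigger the generation command rather than just printing a reminder.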

Does your team, by chance, use any kind of build tool to simplify other commands? For instance, a stop-gap solution might be adding a make target that wraps these commands directly without them needing to know about each other (assuming you use Make or something similar).

ThomasLaPiana avatar Aug 16 '22 07:08 ThomasLaPiana