web-monitoring-db

Add support for archiving DB to SQLite

Open Mr0grog opened this issue 1 year ago • 0 comments

⚠️ Work in progress! ⚠️

This adds a rake command that exports the contents of the DB to a SQLite file for public archiving. It's mostly a straightforward copy of every table and row, but it skips tables that are irrelevant to a public data set (administrative things like GoodJob tables, users, imports, etc.), drops columns with user data, and does some basic conversions.
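To make the skip/drop/convert steps concrete, here is a minimal sketch of the filtering logic in plain Ruby, working on in-memory hashes rather than a real database. The table list, the `author_id` column, the specific conversions, and the `archive_rows` and `sqlite_value` helpers are all assumptions for illustration, not the names or rules this PR actually uses.

```ruby
require 'json'
require 'time'

# Hypothetical configuration: illustrative names, not the PR's actual lists.
SKIPPED_TABLES = %w[good_jobs users imports].freeze
DROPPED_COLUMNS = { 'annotations' => %w[author_id] }.freeze

# Convert a value into something SQLite can store directly. SQLite has no
# native boolean, timestamp, or JSON column types, so this sketch maps them
# to integers and text.
def sqlite_value(value)
  case value
  when true then 1
  when false then 0
  when Time then value.utc.iso8601
  when Hash, Array then JSON.generate(value)
  else value
  end
end

# Takes { table_name => [row_hash, ...] } and returns the same shape with
# skipped tables removed, dropped columns stripped, and values converted.
def archive_rows(tables)
  tables.each_with_object({}) do |(table, rows), archive|
    next if SKIPPED_TABLES.include?(table)

    dropped = DROPPED_COLUMNS.fetch(table, [])
    archive[table] = rows.map do |row|
      row.reject { |column, _| dropped.include?(column) }
         .transform_values { |value| sqlite_value(value) }
    end
  end
end
```

In a real rake task the converted rows would be written to the SQLite file in batches instead of being collected in memory; that part is left out here.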

Part of edgi-govdata-archiving/web-monitoring#170

For changes/annotations, we probably want to select only the relevant annotations, like the important changes (make sure we have them all in the DB first, see https://github.com/edgi-govdata-archiving/web-monitoring-processing/blob/main/web_monitoring/cli/annotations_import.py), and import only those and the changes they apply to.
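One way this selection might look, as a rough sketch: pick the annotations that count as relevant, then keep only the changes they point at. The `significance` field, the threshold, the `change_id` key, and the `select_for_archive` helper are hypothetical; the actual criteria for "important" are not settled here.

```ruby
# Hypothetical threshold and field names, for illustration only.
SIGNIFICANCE_THRESHOLD = 0.5

# Returns [relevant_annotations, changes_they_apply_to].
def select_for_archive(annotations, changes)
  relevant = annotations.select do |annotation|
    annotation.fetch('significance', 0) >= SIGNIFICANCE_THRESHOLD
  end

  # Only keep changes that at least one relevant annotation refers to.
  change_ids = relevant.map { |annotation| annotation['change_id'] }.uniq
  kept_changes = changes.select { |change| change_ids.include?(change['id']) }

  [relevant, kept_changes]
end
```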

Mr0grog, Aug 10 '23 21:08