web-monitoring-db

Ingest the legacy uuids from the old versionista outputter

Open danielballan opened this issue 7 years ago • 8 comments

It would be nice to have a way to associate legacy Annotations, which I assume will be subjected to a lot of analysis, with Versions in our app. Somehow getting the UUIDs generated by the old outputter (and now stored only in Google Sheets, I think) sounds slightly painful but possible and useful.

danielballan May 31 '17 23:05

Some notes for this:

Since rows in those spreadsheets are really versions, this should probably match on Versionista version IDs. We can extract them from:

  • In the spreadsheets: the Last Two - Side by Side column is always https://versionista.com/{site}/{page}/{version_id}:0 (version_id is universally unique within Versionista, so all the other fields can be ignored)
  • In the DB: version_record.source_metadata.version_id
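For the spreadsheet side, a minimal sketch of pulling the version ID out of that column might look like the following. It assumes the URL shape is exactly the one described above; the function name and the regular expression are illustrative and not part of any existing tooling.

```python
import re
from typing import Optional

# Assumed URL shape, taken from the note above:
#   https://versionista.com/{site}/{page}/{version_id}:0
_SIDE_BY_SIDE = re.compile(
    r"^https?://versionista\.com/[^/]+/[^/]+/(?P<version_id>[^/:]+):\d+/?$"
)


def versionista_version_id(url: str) -> Optional[str]:
    """Return the version ID from a side-by-side URL, or None if it does not match.

    Hypothetical helper: the site and page segments are ignored because the
    version ID is said to be unique across Versionista.
    """
    match = _SIDE_BY_SIDE.match(url.strip())
    return match.group("version_id") if match else None
```

Rows whose URL does not fit the pattern come back as `None`, so they can be set aside for manual review rather than silently mismatched.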

The naive way someone could do this now would be to page through all the results of https://web-monitoring-db.herokuapp.com/api/v0/versions?source_type=versionista
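A rough sketch of that naive approach, for anyone who wants to try it. The response shape is an assumption here: a `data` list of versions, a `links.next` URL for the following page, and a `uuid` plus `source_metadata.version_id` on each version. The helper names are hypothetical.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, Optional
from urllib.request import urlopen


def fetch_json(url: str) -> dict:
    """Fetch one page of results and decode it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_versions(first_url: str, fetch: Callable[[str], dict] = fetch_json) -> Iterator[dict]:
    """Yield every version record, following next-page links until none remain.

    Assumes each page looks like {"data": [...], "links": {"next": url_or_null}}.
    """
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        yield from page.get("data", [])
        url = (page.get("links") or {}).get("next")


def index_by_versionista_id(versions: Iterable[dict]) -> Dict[str, str]:
    """Map Versionista version ID -> version UUID (both field names are assumed)."""
    index: Dict[str, str] = {}
    for version in versions:
        version_id = (version.get("source_metadata") or {}).get("version_id")
        if version_id is not None:
            index[str(version_id)] = version["uuid"]
    return index
```

Passing the URL above to `iter_versions` and the result to `index_by_versionista_id` would give a lookup table that spreadsheet rows can be joined against. It walks every page, which is why better server-side support would be preferable.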

If we wanted to better support this, we could:

  • Add some indexing on source_metadata.versionista_id and allow querying by that field or
  • Make public DB exports available (#45)

Alternatively, a different, probably easier approach might be to create an API endpoint for uploading analyst annotation CSVs. It’s kinda messy, but might be the easiest and quickest way to achieve this.

Mr0grog Jun 01 '17 00:06

Using the versionista ID is good enough to support the ad hoc analysis I want to do right now. Once we transition away from versionista, perhaps we should do a one-time update to the database to ingest all these legacy UUIDs and associated Annotations.

danielballan Jun 01 '17 21:06

:+1:

Mr0grog Jun 01 '17 21:06

As we move forward with having different differs as well, an annotation imported from sheets should have a field indicating where it came from (Versionista, Scanner, possibly others in the future).

weatherpattern Feb 15 '18 22:02

I don't think we'll need to worry about that when importing sheets of annotations. Each annotation (row in the sheets) already refers to a Version in our database, either by its Versionista ID or by web-monitoring-db UUID or both, and each Version already knows where it came from.
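To make that concrete, here is a small sketch of resolving a sheet row to a Version. The column names `version_uuid` and `versionista_version_id` are hypothetical stand-ins for whatever the sheets actually use, and the lookup table is assumed to map Versionista IDs to web-monitoring-db UUIDs.

```python
from typing import Mapping, Optional


def resolve_version_uuid(
    row: Mapping[str, str],
    uuid_by_versionista_id: Mapping[str, str],
) -> Optional[str]:
    """Return the Version UUID an annotation row refers to, or None if unknown.

    Prefers the UUID recorded in the row; otherwise falls back to looking up
    the row's Versionista ID. Column names here are assumptions.
    """
    uuid = (row.get("version_uuid") or "").strip()
    if uuid:
        return uuid
    versionista_id = (row.get("versionista_version_id") or "").strip()
    if versionista_id:
        return uuid_by_versionista_id.get(versionista_id)
    return None
```

Because the source is a property of the resolved Version, the annotation itself does not need to carry a separate field for it.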

danielballan Feb 16 '18 13:02

Note: the tooling for this was added in #233. Solving this is mainly a matter of executing that rake task regularly (or, more complex: setting up a job that does that work on a schedule).

Mr0grog Mar 02 '18 01:03

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in seven days if no further activity occurs. If it should not be closed, please comment! Thank you for your contributions.

stale[bot] Jan 10 '19 00:01

This will be a requirement for migrating away from Google Sheets for important changes.

Mr0grog Jan 10 '19 05:01