
Evaluate catalog-next harvest state

Open jbrown-xentity opened this issue 9 months ago • 0 comments

User Story

In order to know whether catalog-next has enough data to go live, data.gov owners want a spreadsheet detailing the state of the current catalog, with its stats compared against catalog-next.

Acceptance Criteria

  • [ ] GIVEN a Google Sheet exists detailing the catalog state (organizations and total datasets)
    WHEN a harvest is completed in harvesting2.0 into catalog-next
    AND an update job is run
    THEN the spreadsheet is updated
    AND the differences are analyzed/highlighted

Background

We did this when moving the original catalog-next back in the day. It was very helpful for combining stats with business knowledge from Hyon and others to decide what was important to address and what could be left behind.

Security Considerations (required)

None

Sketch

  • Build a sheet detailing the current catalog, with columns for the organization and the dataset count
    • Build this using a dynamic API call; this should be straightforward, but it needs to use paging
  • Add a column to the sheet with the dataset count for the same organization on catalog-next
    • Build this using the same dynamic API call, but match on organization names
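
A minimal sketch of the two steps above, for discussion only. It assumes the current catalog exposes the standard CKAN `organization_list` action (with `all_fields`, `limit`, and `offset`, returning a `package_count` per organization). Whether catalog-next exposes the same endpoint is not established here, so the page fetcher is passed in as a callable and can be swapped. The function names and the lowercase name matching are hypothetical choices, and writing the rows to the Google Sheet is left out.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, List, Optional, Tuple

Page = List[dict]


def fetch_org_page(base_url: str, limit: int, offset: int) -> Page:
    """Fetch one page of organizations (assumes a CKAN-style API at base_url)."""
    query = urllib.parse.urlencode(
        {"all_fields": "true", "limit": limit, "offset": offset}
    )
    url = f"{base_url}/api/3/action/organization_list?{query}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)["result"]


def collect_counts(
    fetch_page: Callable[[int, int], Page], page_size: int = 25
) -> Dict[str, int]:
    """Page through all organizations and return {organization name: dataset count}."""
    counts: Dict[str, int] = {}
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        if not page:  # an empty page means there is nothing left to read
            break
        for org in page:
            # Assumption: names are matched case-insensitively across catalogs.
            key = org["name"].strip().lower()
            counts[key] = org.get("package_count", 0)
        # Advance by what was actually returned, in case the server caps the limit.
        offset += len(page)
    return counts


def compare_counts(
    current: Dict[str, int], nxt: Dict[str, int]
) -> List[Tuple[str, Optional[int], Optional[int], int]]:
    """Build sheet rows: (organization, current count, catalog-next count, difference)."""
    rows = []
    for name in sorted(set(current) | set(nxt)):
        cur, new = current.get(name), nxt.get(name)
        # None marks an organization that is missing from one of the catalogs.
        rows.append((name, cur, new, (new or 0) - (cur or 0)))
    return rows
```

To try it against the current catalog, something like `collect_counts(functools.partial(fetch_org_page, "https://catalog.data.gov"))` should work if the endpoint behaves as assumed. Rows with a `None` count or a non-zero difference are the ones to highlight in the sheet.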

Final step: get QA from Hyon (or equivalent) for any ticket(s) that need to be created from this analysis.

jbrown-xentity, May 20 '24 18:05