data.gov
Evaluate catalog-next harvest state
User Story
In order to know if catalog-next has enough data to go live, data.gov owners want a spreadsheet detailing the state of the current catalog, with the same stats from catalog-next alongside for comparison.
Acceptance Criteria
- [ ] GIVEN a Google Sheet exists detailing the catalog state (organizations and total datasets)
WHEN a harvest is completed in harvesting2.0 into catalog-next
AND an update job is run
THEN the spreadsheet is updated
AND the differences are analyzed/highlighted
Background
We did this when moving the original catalog-next back in the day. It was very helpful for combining stats with business knowledge from Hyon and others to decide what was important to address and what could be left behind.
Security Considerations (required)
None
Sketch
- Build a sheet detailing the current catalog, with columns for the organization and the dataset count
  - Build this using a dynamic API call; this should be straightforward, but it needs to use paging
- Add a column to the sheet with the dataset count for the same organization on catalog-next
  - Build this using the same dynamic API call, but match on organization names
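
The paging and name-matching steps above could look roughly like the following. This is a minimal sketch, not the actual update job: `fetch_page` is a hypothetical callable standing in for whatever API call each catalog exposes, and the `name` and `dataset_count` record keys are assumptions rather than confirmed field names.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical page fetcher: takes (offset, limit) and returns a list of
# records shaped like {"name": str, "dataset_count": int}. The real API
# call and field names for each catalog are assumptions here.
PageFetcher = Callable[[int, int], List[dict]]


def normalize(name: str) -> str:
    """Collapse whitespace and ignore case so names match across catalogs."""
    return " ".join(name.split()).casefold()


def fetch_all_orgs(fetch_page: PageFetcher, page_size: int = 100) -> Dict[str, int]:
    """Page through an organization listing until a short or empty page."""
    counts: Dict[str, int] = {}
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        for org in page:
            counts[normalize(org["name"])] = org["dataset_count"]
        if len(page) < page_size:
            break  # last page reached
        offset += page_size
    return counts


def compare_counts(
    current: Dict[str, int], catalog_next: Dict[str, int]
) -> List[Dict[str, Optional[object]]]:
    """Build one row per organization, with None where a catalog lacks it."""
    rows = []
    for name in sorted(set(current) | set(catalog_next)):
        cur = current.get(name)
        nxt = catalog_next.get(name)
        rows.append(
            {
                "organization": name,
                "catalog": cur,
                "catalog_next": nxt,
                # Negative values mean catalog-next has fewer datasets.
                "difference": (nxt or 0) - (cur or 0),
            }
        )
    return rows
```

Each row maps to one spreadsheet line. Rows where `catalog_next` is `None` or `difference` is negative are the ones to highlight for review.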
Final step: get QA from Hyon (or equivalent) for any ticket(s) that need to be created from this analysis.