design media sitemap manual approval API support

Open rahulbot opened this issue 4 years ago • 0 comments

We've come up with a short-term idea for shifting the sitemap ingest process to researchers. The idea is that web users could request that a source's sitemaps be fetched (via ultimate-sitemap-parser), download 100 random stories to review, and then, after reviewing them, mark whether or not sitemaps should be fetched for that source.

Here's my first pass at what this would need at the API level. I think this could be modeled after the way the RSS scraping API methods work (i.e., requesting and reviewing jobs to be done on a media source). In my head the API would need:

  • a way to request that sitemaps from a particular media_id be fetched and parsed for review (POST media/<media_id>/sitemaps/scrape ?)
  • a way to see the status of any previous requests to fetch sitemaps for review (GET media/<media_id>/sitemaps/status ?)
  • a way to see ~100 random pages from the latest completed fetch of sitemaps (GET media/<media_id>/sitemaps/preview ?)
  • a way to mark a media source as approved to import stories from sitemaps or not (just add a boolean use_sitemaps property to PUT media/update ?)
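
To make the four operations above concrete, here is a minimal in-memory sketch of how they might fit together. Everything in it is an assumption for illustration: the `SitemapScrapeJob` and `SitemapReviewStore` names, the `queued`/`completed` job states, and the `complete()` helper that stands in for whatever worker would actually run ultimate-sitemap-parser. It does not reflect the existing RSS scraping code or any agreed back-end design.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SitemapScrapeJob:
    """One request to fetch and parse a source's sitemaps (hypothetical shape)."""
    job_id: int
    media_id: int
    state: str = "queued"  # assumed states: "queued" -> "completed"
    urls: List[str] = field(default_factory=list)


class SitemapReviewStore:
    """In-memory stand-in for the four proposed API operations."""

    def __init__(self) -> None:
        self._jobs: Dict[int, List[SitemapScrapeJob]] = {}
        self._use_sitemaps: Dict[int, bool] = {}
        self._next_id = 1

    def request_scrape(self, media_id: int) -> SitemapScrapeJob:
        # Would back: POST media/<media_id>/sitemaps/scrape
        job = SitemapScrapeJob(job_id=self._next_id, media_id=media_id)
        self._next_id += 1
        self._jobs.setdefault(media_id, []).append(job)
        return job

    def complete(self, job: SitemapScrapeJob, urls: List[str]) -> None:
        # Hypothetical hook for the worker that fetches and parses sitemaps.
        job.urls = list(urls)
        job.state = "completed"

    def status(self, media_id: int) -> List[dict]:
        # Would back: GET media/<media_id>/sitemaps/status
        return [
            {"job_id": j.job_id, "state": j.state}
            for j in self._jobs.get(media_id, [])
        ]

    def preview(
        self, media_id: int, sample_size: int = 100, seed: Optional[int] = None
    ) -> List[str]:
        # Would back: GET media/<media_id>/sitemaps/preview
        completed = [
            j for j in self._jobs.get(media_id, []) if j.state == "completed"
        ]
        if not completed:
            return []
        latest = completed[-1]  # jobs are stored in request order
        rng = random.Random(seed)
        # Sample without replacement, capped at the number of URLs found.
        return rng.sample(latest.urls, min(sample_size, len(latest.urls)))

    def set_use_sitemaps(self, media_id: int, approved: bool) -> None:
        # Would back: the use_sitemaps boolean on PUT media/update
        self._use_sitemaps[media_id] = approved

    def uses_sitemaps(self, media_id: int) -> bool:
        # Sources are treated as not approved until a reviewer says otherwise.
        return self._use_sitemaps.get(media_id, False)
```

A reviewer's flow would then be `request_scrape(media_id)`, poll `status(media_id)` until a job is `completed`, read `preview(media_id)`, and finish with `set_use_sitemaps(media_id, True)` or `False`. Defaulting `uses_sitemaps` to `False` matches the manual-approval intent, but that default is also an assumption.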

Does that seem right? If so, what would the back-end implementation look like?

rahulbot, Mar 27 '20 17:03