rweekly.org
GH Action to gather content
The curation process as it currently stands involves:
- collecting the last 10 days of RSS entries via get_rss_posts()
- collecting roughly the last week of CRANberries new and updated packages via process_cranberries()
- de-duplicating (within the draft and against the last 20 issues)
- adding content found elsewhere
- curating posts - filtering out irrelevant/low-quality content and categorising
I believe the first three of these can be automated, potentially with a GitHub Action running on a weekly schedule. Getting that content into the draft itself is a minor addition, but collecting it in the first place, even into a committed plaintext file, could help editors get closer to a draft, faster.
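For the sake of discussion, the collection step might look roughly like the sketch below. This is only an outline: the `days` arguments, the `title`/`url` columns, and the idea that get_rss_posts() and process_cranberries() return data frames are all assumptions, not their actual signatures.

```r
## Rough sketch of the automated collection step -- argument names and
## return shapes here are assumptions, not the real implementations.

# collect the last 10 days of RSS entries (columns assumed)
posts <- get_rss_posts(days = 10)

# collect roughly the last week of CRANberries new/updated packages
cran <- process_cranberries(days = 7)

# combine and de-duplicate within the collected content by URL
entries <- rbind(posts[, c("title", "url")], cran[, c("title", "url")])
entries <- entries[!duplicated(entries$url), ]

# save a plaintext/markdown file an editor can copy into the draft
writeLines(sprintf("- [%s](%s)", entries$title, entries$url),
           "curatinator_latest.md")
```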
I think I'm able to prototype this myself, but this issue can serve as a place for discussion about improvements or concerns.
The prototype works! https://github.com/rweekly/rweekly.org/blob/gh-pages/curatinator_latest.md?plain=1 (I forgot to add linebreaks, but the concept is sound).
I'll add the collection of CRANberries and the de-duplication to this. It's set to run at 9am Saturday UTC each week, but it can also be triggered manually in the Actions tab on GitHub.
I'm quite happy with that! This now fetches the RSS feeds and CRANberries, de-duplicates, and saves to curatinator_latest.md for copying over to the draft. It still requires the dedup against past issues, but I wasn't sure how to easily excise those.
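One idea I haven't tried yet would be to scrape the URLs out of the last 20 issue files, roughly like the untested sketch below (it assumes past issues live as markdown files under `_posts/`, which may not match the actual layout):

```r
## Untested idea -- assumes past issues are markdown files under `_posts/`
## (path assumed) and that links appear as plain http(s) URLs in the text.

issue_files <- sort(list.files("_posts", pattern = "\\.md$", full.names = TRUE),
                    decreasing = TRUE)
recent <- head(issue_files, 20)  # the last 20 issues

url_rx <- "https?://[^\\s)\\]\"]+"
past_urls <- unique(unlist(lapply(recent, function(f) {
  txt <- readLines(f, warn = FALSE)
  regmatches(txt, gregexpr(url_rx, txt, perl = TRUE))
})))

# drop anything already published recently (`entries` as in the earlier sketch)
entries <- entries[!entries$url %in% past_urls, ]
```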
This is really nice @jonocarroll! I've felt a sense of discontent each time I've curated since the loss of our infrastructure but lacked the initiative to do something about it, so I'm really appreciative of this step to remove some of the inefficiency in our process.
Doing it via a GH Action is also really nice, since it keeps things transparent for everyone and facilitates maintenance / collaboration / iteration.
Looks great to me! This will save me some time during my curation weeks.