Create a curation tool

Open booyaa opened this issue 7 years ago • 23 comments

Specification

A tool is required to extract links from a specific tracking issue for a campaign.

An example of such a tracking issue can be seen in #6.

The tool should be able to extract the links, de-dupe and save as an RSS file (XML). At a minimum it should extract the blog post page title, blog post link, publication date.
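The extract-and-de-dupe step could look roughly like the sketch below. It is not the eventual tool's code: the function name `extract_links` is hypothetical, it uses only the standard library, and it assumes links appear as plain `http://` or `https://` URLs separated by whitespace (including inside Markdown link syntax). URLs that differ only by a trailing slash are treated as distinct, and a URL that legitimately ends in `)` or `.` would be truncated.

```rust
use std::collections::HashSet;

/// Collect http(s) URLs from free-form text such as an issue body or comment.
/// Each URL is returned once, in order of first appearance.
/// (Hypothetical helper; the name and trimming rules are assumptions.)
fn extract_links(text: &str) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut links = Vec::new();

    for token in text.split_whitespace() {
        // Locate the scheme inside the token so that Markdown links like
        // `[title](https://example.com)` are handled as well as bare URLs.
        let start = match token.find("https://").or_else(|| token.find("http://")) {
            Some(pos) => pos,
            None => continue,
        };

        // Drop punctuation that commonly trails a URL in prose or Markdown.
        let url = token[start..]
            .trim_end_matches(|c: char| matches!(c, ')' | '>' | ']' | ',' | '.' | ';'));

        // `insert` returns false when the URL was already seen (the de-dupe step).
        if seen.insert(url.to_string()) {
            links.push(url.to_string());
        }
    }

    links
}
```

Calling `extract_links` on the concatenated issue body and comments would yield the candidate list of posts; fetching each page for its title and date is a separate step.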

We expect the tool to be run repeatedly; if the RSS file already exists, it should append new items to the end of the RSS file.
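One way to read the append requirement is sketched below, using plain string handling rather than an XML library. Everything named here (`BlogPost`, `append_items`, `EMPTY_FEED`, the placeholder channel metadata) is an assumption made for illustration. The sketch assumes the existing file is a single-channel RSS 2.0 feed and that the caller supplies `pub_date` already formatted as an RFC 822 date.

```rust
/// Minimal post record (hypothetical; fields follow the spec's minimum).
struct BlogPost {
    title: String,
    link: String,
    pub_date: String, // assumed to be pre-formatted, e.g. "Thu, 10 May 2018 15:00:00 GMT"
}

/// Skeleton used when no feed exists yet. Channel metadata is placeholder text.
const EMPTY_FEED: &str = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<rss version=\"2.0\">\n  <channel>\n    <title>Curated posts</title>\n    <link>https://example.com/</link>\n    <description>Posts collected from a tracking issue</description>\n  </channel>\n</rss>\n";

/// Escape the characters that are unsafe inside XML element text.
fn escape_xml(s: &str) -> String {
    s.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

/// Render one <item>; the leading/trailing spaces keep the indentation of EMPTY_FEED.
fn render_item(post: &BlogPost) -> String {
    format!(
        "  <item>\n      <title>{}</title>\n      <link>{}</link>\n      <pubDate>{}</pubDate>\n    </item>\n  ",
        escape_xml(&post.title),
        escape_xml(&post.link),
        escape_xml(&post.pub_date)
    )
}

/// Return the feed with any not-yet-present posts appended after the existing items.
/// `existing` is the current file content, or None on the first run.
fn append_items(existing: Option<&str>, posts: &[BlogPost]) -> String {
    let feed = existing.unwrap_or(EMPTY_FEED).to_string();

    let mut new_items = String::new();
    for post in posts {
        // A post counts as a duplicate if its <link> element is already in the feed
        // or was added earlier in this run.
        let link_tag = format!("<link>{}</link>", escape_xml(&post.link));
        if feed.contains(&link_tag) || new_items.contains(&link_tag) {
            continue;
        }
        new_items.push_str(&render_item(post));
    }

    // New items go immediately before the closing channel tag, i.e. at the end.
    match feed.rfind("</channel>") {
        Some(pos) => format!("{}{}{}", &feed[..pos], new_items, &feed[pos..]),
        None => feed, // not a recognisable feed: leave it untouched
    }
}
```

Running `append_items` twice with the same posts leaves the feed unchanged on the second run, which is the behaviour a repeatedly scheduled job needs. A real implementation would more likely parse the XML properly than search for tags in a string.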

Additional information

Bonus if written in Rust, but any language that can be added to an integration service like travis-ci would be considered.

Mentor: @skade


Mentoring can be provided, just ask. Please add a comment if you are claiming this issue so we can assign it to you.

booyaa avatar Apr 19 '18 11:04 booyaa

I would be willing to mentor this. This is a good first issue for someone wanting to write a whole (small) project in Rust.

skade avatar May 10 '18 13:05 skade

Would be very interested in writing this tool 😃 in rust

fourplusone avatar May 10 '18 14:05 fourplusone

This sounds like lots of fun, could I give it a try ?

I'd probably pick an HTTP client such as Actix's or Hyper's to crawl the page asynchronously and extract links into a BlogPost struct, then use serde to write / append the XML.

I'll have a look at the RSS spec, could I try to work on this ? :)

o0Ignition0o avatar May 10 '18 15:05 o0Ignition0o

Oh sorry @fourplusone, please go ahead if you want to :)

o0Ignition0o avatar May 10 '18 15:05 o0Ignition0o

I would actually recommend just to use reqwest. :+1:

Can you please join our gitter channel? That might be easiest.

If you have any questions on the specification, please post them here, so that everyone sees them.

skade avatar May 10 '18 15:05 skade

I just joined. Should this tool go into a separate repo or should it be part of this one?

fourplusone avatar May 10 '18 15:05 fourplusone

@fourplusone a separate repo is probably the easiest

skade avatar May 10 '18 15:05 skade

My WIP implementation for this can be found here: https://github.com/fourplusone/curate-issue

fourplusone avatar May 10 '18 20:05 fourplusone

Here is a status update of the curate-issue tool.

  • [x] Extracts links from GitHub Issues + Comments
  • [x] Is able to extend existing RSS Feeds
  • [x] Detects duplicates
  • [x] Extracts Post Date & Title from (most) blog posts
  • [x] Compiles without warnings
  • [x] Has a few test cases
  • [x] Documentation of the code & what it does
  • [x] Moving out more stuff from main.rs
  • [ ] Unit Tests which do not rely on GitHub / Example blog posts
  • [ ] Some sort of caching to avoid visiting every page being linked
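
As an illustration of the title-extraction item in the list above, here is a minimal sketch that pulls the text of the first `<title>` element out of an HTML page. It is not taken from curate-issue; the function name `extract_title` is hypothetical, no HTML parser is used, and HTML entities in the title are left undecoded. Because it only searches strings, it can be unit-tested without any network access.

```rust
/// Return the whitespace-normalised text of the first <title> element, if any.
/// (Hypothetical helper; a real tool would likely use an HTML parser.)
fn extract_title(html: &str) -> Option<String> {
    // ASCII lowercasing keeps byte offsets identical, so indices found in
    // `lower` can be used to slice the original `html`.
    let lower = html.to_ascii_lowercase();

    let open = lower.find("<title")?;
    // Skip to the end of the opening tag, which may carry attributes.
    let content_start = open + lower[open..].find('>')? + 1;
    let content_end = content_start + lower[content_start..].find("</title")?;

    // Collapse runs of whitespace and newlines into single spaces.
    let title = html[content_start..content_end]
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ");

    if title.is_empty() {
        None
    } else {
        Some(title)
    }
}
```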

If you are missing any point, please let me know

fourplusone avatar May 12 '18 12:05 fourplusone

Awesome ! If you need help or would like me to review it, please let me know :)

o0Ignition0o avatar May 12 '18 12:05 o0Ignition0o

@o0Ignition0o I'd be glad if you would review some code or improve it

Thanks for your help 👍

fourplusone avatar May 12 '18 12:05 fourplusone

I think this is ready for testing. Can someone create a repo on /rust-community where the code will find its new home?

fourplusone avatar May 19 '18 11:05 fourplusone

If you need any help in adding this tool to a .travis.yml file, just let me know

fourplusone avatar May 19 '18 11:05 fourplusone

I'll do some testing and will also raise transferring the repo with the community team, as this has implications for ongoing maintenance.

Thanks again for your work!

booyaa avatar May 22 '18 13:05 booyaa

As discussed on IRC yesterday, @adityac8 will be testing the tool and providing feedback to @fourplusone. cc @wezm

17:47 <@booyaa> adityac8: do you want to try the curation tool against the posts you've collected for rustreach?
17:48 < adityac8> Sure. I would love to give that a try.
17:48 <@booyaa> we should raise an issue with readrust's author @wezm just to let him know we're going to do this. he might be able to make our curated posts stand out or create
                a category like "content-o-tron"?

booyaa avatar May 23 '18 08:05 booyaa

Might be a bit late given the state of the tool, but some of the work I did for Read Rust might be relevant. Specifically, the add-url tool and the feed finder crate.

I’ll give the tool a look when I have a moment and work out a good way to surface posts that are part of a campaign.

wezm avatar May 23 '18 22:05 wezm

The feed finder crate looks very useful. I think I will integrate this in an upcoming release.

fourplusone avatar May 24 '18 17:05 fourplusone

@fourplusone I discussed this with the whole community team: we're happy for you to transfer ownership of the repo if you still want to. Just let us know when it's been done. Thanks!

booyaa avatar Jun 20 '18 07:06 booyaa

@badboy I think we should be transferring this one to rust-community as well 😄 cc @fourplusone @booyaa

adityac8 avatar Jul 06 '18 13:07 adityac8

👋 @badboy is there anything you need from me in order to transfer the repo?

fourplusone avatar Jul 18 '18 15:07 fourplusone

@fourplusone Simply transfer the repository to me and I will transfer it to the organization.

badboy avatar Jul 18 '18 15:07 badboy

Done: https://github.com/rust-community/curate-issue

badboy avatar Jul 18 '18 16:07 badboy

And also enabled Travis now: https://travis-ci.org/rust-community/curate-issue

badboy avatar Jul 18 '18 16:07 badboy