
Better Blog Aggregation

Open eddelbuettel opened this issue 11 years ago • 41 comments

A few of us have exchanged comments or notes about the need for a better aggregation of blog activity for the R community: higher quality, proper formatting, advertisement-free, possibly curated, ... Might be worthwhile to put our heads together while we are in one place.

eddelbuettel avatar Feb 11 '15 02:02 eddelbuettel

I didn't have the bandwidth to respond to that email thread, but I'll upvote this here (it's better suited for an in-person discussion anyway). I think we can easily put together a minimalist, readable, ad-free aggregator.

Instead of manually adding blogs (and sometimes waiting for months for a single lax maintainer to respond) we could automate some of this. We could easily ask users to place a text file on their server with boilerplate text granting permission for us to aggregate (then letting us know via a form). As long as the file is there, we continue to aggregate and can stop anytime they pull the plug.
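A minimal sketch of the opt-in check described above, in Python. The file path, the boilerplate wording, and all function names are hypothetical; the thread does not specify any of them. Fetching is left out on purpose: the sketch assumes something else has already downloaded each blog's file (or recorded `None` when it was missing).

```python
from typing import Dict, List, Optional
from urllib.parse import urljoin

# Hypothetical conventions, not taken from the thread.
PERMISSION_PATH = "/r-aggregator-permission.txt"
PERMISSION_TEXT = "I grant permission to aggregate posts from this blog."


def permission_url(blog_url: str) -> str:
    """Return the URL where the opt-in file is expected to live."""
    return urljoin(blog_url, PERMISSION_PATH)


def grants_permission(file_body: str) -> bool:
    """True if the file contains the boilerplate, ignoring case and spacing."""
    normalized = " ".join(file_body.split()).lower()
    return PERMISSION_TEXT.lower() in normalized


def blogs_to_aggregate(fetched: Dict[str, Optional[str]]) -> List[str]:
    """Keep only blogs whose permission file was found and is valid.

    `fetched` maps a blog URL to the body of its permission file,
    or to None if the file could not be retrieved.
    """
    return sorted(
        url for url, body in fetched.items()
        if body is not None and grants_permission(body)
    )
```

Running this on every aggregation pass means a blog drops out as soon as its owner deletes the file, with no maintainer in the loop.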

If we do this, it would be great to improve upon the tagging so one does not have to endure every mundane R post (or perhaps only keep tabs on say Shiny posts).

@ironholds has offered to host for us.

karthik avatar Feb 11 '15 02:02 karthik

Nice idea re the poke for permissions file and continue aggregation while present. I like that.

Pinging @elijah, who had poked around GitHub as well. Also, as a footnote to myself since I keep forgetting: it was rawdog that I like as a simple (rss in, static files out) aggregator with plugins.

eddelbuettel avatar Feb 11 '15 02:02 eddelbuettel

+1 to...well, everything Karthik said. I don't have the time bandwidth, but I'll happily provide the hosting and cover the costs.

Ironholds avatar Feb 11 '15 03:02 Ironholds

Hey - finally getting around to replying to this thread of thoughts.

A few weeks ago, I was tinkering with https://github.com/planetr/planetr.github.io -- I even got as far as making a planetr project to fiddle with. Happy to add folks to that project so they can commit, or take pull requests, or whatever.

My first pass at it was to just use Jekyll to publish posts pulled down by the planet.rb gem. You can see my experiment from back in December at planetr.github.io.

It should be totally possible for someone to hack on the templates, or add features, etc, and then just make pull requests to the repo, and have other people review and check it in. Want your blog added? Make a pull request.

Dirk likes rawdog, and I get that -- it looks pretty awesome -- you could easily implement this same thing with it, I suspect. [I just haven't tried - a parallel implementation checked into the same or a parallel repo would be fun. :) ]

Democratizing the heck out of this is a thing that should happen. I was a blocker on the old implementation of planetr, and I don't want to be that. :)

What should probably happen is that a fairly regular cron job on a host somewhere should do this kind of thing:

git checkout master (or HEAD...); git pull; planet generate; git commit -a; git branch posts_$(date +'%m%d%Y'); git checkout posts_$(date +'%m%d%Y'); git push origin posts_$(date +'%m%d%Y')

Just so we can roll back if things get mangled, or something. [It's happened many times with the existing old planet-venus based planetr site -- usually when a disk gets full, or something.]
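The dated-branch snapshot idea above could be sketched in Python roughly as follows. This only builds the command list and prints it as a dry run; it does not execute anything. The function names and the commit message are assumptions (a `-m` message is added because a cron job has no editor to open), while `planet generate` and the `posts_MMDDYYYY` naming come from the thread.

```python
import shlex
from datetime import date


def snapshot_branch(today: date) -> str:
    """Branch name in the thread's posts_MMDDYYYY form."""
    return "posts_" + today.strftime("%m%d%Y")


def snapshot_commands(today: date) -> list:
    """Commands for one aggregation run, ending with a pushed dated branch."""
    branch = snapshot_branch(today)
    return [
        ["git", "checkout", "master"],
        ["git", "pull"],
        ["planet", "generate"],
        # Assumed commit message; the original one-liner had none.
        ["git", "commit", "-a", "-m", "Update aggregated posts"],
        ["git", "branch", branch],
        ["git", "checkout", branch],
        ["git", "push", "origin", branch],
    ]


if __name__ == "__main__":
    # Dry run: print what a cron job would execute today.
    for cmd in snapshot_commands(date.today()):
        print(shlex.join(cmd))
```

Each run leaves behind a branch named after its date, so rolling back after a mangled run is a matter of checking out the previous day's branch.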

Bonus activity here would be to use the freebie infrastructure at https://travis-ci.org/ to do this work, and push it all back up to github.io....

elijah avatar Feb 11 '15 20:02 elijah

This is one of the threads that I followed when digging into this stuff in December:

http://blog.nilenso.com/blog/2013/09/16/octopress-planet-dot-rb-and-the-nilenso-blog/

elijah avatar Feb 11 '15 20:02 elijah

Want to encourage people to eyeball http://offog.org/code/rawdog/ - the rawdog homepage - and look at the list of plugins at the bottom. Those are the sorts of features that we want to have available -- it's not unusual for people to use something like Vellum (http://www.kryogenix.org/code/vellum/) in such an installation to rewrite bits and pieces of posts. As desired. ;-)

elijah avatar Feb 11 '15 20:02 elijah

Very nice stuff, @elijah. I also lean towards using GitHub "because it's there" and nobody needs to foot any bills. Spreading the effort wide is something I also think is desirable and can work well -- my prime example is how @jeroenooms made us all edit the useR! 2014 page that way.

Also not religious re rawdog but just like you impressed by the wide range of plugins. I speak next to no Ruby but can do some (modest) Python hacking. We'll chat some more...

eddelbuettel avatar Feb 11 '15 20:02 eddelbuettel

One possibility for blog aggregation is PressForward (GitHub), a WordPress plugin that aggregates content and creates a workflow for republishing posts. PressForward would lead to a more curated than automated approach, however.

lmullen avatar Feb 23 '15 16:02 lmullen

Hm. I am not sure we would want to live within a WordPress environment. Otherwise the review and curation idea is pretty close to one possible approach I entertained. But as @karthik outlines above, fully automatic robot mode is nice too ... as we all are bloody busy already.

eddelbuettel avatar Feb 23 '15 17:02 eddelbuettel

This is completely apropos and I just can't help myself:

https://twitter.com/sadserver/status/570260809399410688

"Statistically speaking it's more likely for you to be mauled by a bear than for you to properly secure WordPress."

;-)

--e


elijah avatar Feb 26 '15 01:02 elijah

That is priceless :) Any chance you'll swing by SF for the unconf? Or do we have to G+ hangout/skype you in if we get going?

eddelbuettel avatar Feb 26 '15 01:02 eddelbuettel

I'd prefer something light like hacker news or reddit to aggregate blogs, but also potentially interesting articles, gists, SO questions, etc....

I think the key ingredient is a good algorithm to make good/fresh stuff float to the top... preferably with personalized weighting to counter the bias towards the always popular newbie/commercial crap.
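As a rough illustration of "fresh stuff floats to the top, with personalized weighting", here is a small Python sketch. It is loosely modeled on the vote-over-age decay commonly attributed to Hacker News; the `Post` fields, the `gravity` default, and the per-tag weights are all assumptions made for the example, not something proposed in the thread.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    votes: int
    age_hours: float
    tag: str


def score(post, tag_weights=None, gravity=1.8):
    """Time-decayed score: more votes raise it, age lowers it.

    tag_weights is a hypothetical per-reader mapping from tag to a
    multiplier; tags that are not listed get a weight of 1.0.
    """
    weight = (tag_weights or {}).get(post.tag, 1.0)
    return weight * post.votes / (post.age_hours + 2.0) ** gravity


def rank(posts, tag_weights=None):
    """Return posts ordered from highest to lowest score."""
    return sorted(posts, key=lambda p: score(p, tag_weights), reverse=True)


posts = [
    Post("Intro to R", votes=200, age_hours=5, tag="beginner"),
    Post("Shiny tips", votes=10, age_hours=3, tag="shiny"),
]
default_order = [p.title for p in rank(posts)]
personal_order = [p.title for p in rank(posts, {"beginner": 0.05})]
```

With no weights the heavily upvoted beginner post ranks first; a reader who down-weights the `beginner` tag sees the Shiny post first instead.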

jeroen avatar Feb 26 '15 03:02 jeroen

Just want to +1 this idea. It would be great to have one site to point to for R-related blog posts. Also, whatever is implemented here could be a useful strategy for other communities. I like @karthik's fully automated robot strategy, as 'no curation' is the lowest weight implementation. Maybe something like Planet Python (http://planetpython.org). @elijah's 'want your blog added, submit a PR' might be an intermediate strategy. If it seemed like low quality was an issue, a curation or upvote system could be implemented later. For upvotes, etc., I've liked the Advogato trust metric, but I know it has its issues.

tracykteal avatar Mar 04 '15 15:03 tracykteal

Yup. I think everybody likes the basic idea of Planet $WHATEVER and @elijah already used a Planet implementation, but possibly an older one than the Python one mentioned by @tracykteal.

eddelbuettel avatar Mar 04 '15 15:03 eddelbuettel

I'm a fan of this idea and would be interested in joining any discussions about it at the unconf.

hafen avatar Mar 20 '15 21:03 hafen

I still want this done and am hoping to work a little on a minimal solution, possibly GitHub based.

@elijah had done some more work poking around and probing some more (with email to). Anybody still got appetite for this?

eddelbuettel avatar Jul 09 '15 13:07 eddelbuettel

Something is brewing in the back of my mind... it's next up after mongo and gpg projects...

jeroen avatar Jul 09 '15 13:07 jeroen

Any desire to integrate this with www.r-pkg.org at some level? :) E.g. news.r-pkg.org? Just asking, no problem if not. :)

I am willing to help, even if we don't integrate. :)

gaborcsardi avatar Jul 09 '15 16:07 gaborcsardi

:+1:

I would so luuuv to have both of you on this as there is obviously so much js goodness around this.

As for r-pkg.org: "too visible" :) This is skunkworks for now, but when we have something we like we can surely make it more visible.

eddelbuettel avatar Jul 09 '15 16:07 eddelbuettel

:+1:

tracykteal avatar Jul 09 '15 16:07 tracykteal

Am also very interested in this and happy to help in any way.

timelyportfolio avatar Jul 09 '15 17:07 timelyportfolio

@eddelbuettel r-pkg.org has very little traffic actually. :)

But that's fine, I can certainly put it somewhere else as well. I'll listen to what Jeroen has to say.

If you have a framework in mind, I can put that on Dokku/DigitalOcean. Dokku is great, containerized microservices with a git-based workflow. Do-it-yourself Heroku essentially.

gaborcsardi avatar Jul 10 '15 08:07 gaborcsardi

> Any desire to integrate this with www.r-pkg.org at some level? :)

That would be great and I'm very supportive of this.

karthik avatar Jul 10 '15 08:07 karthik

The old planetr.stderr.org was a planetplanet implementation - from about 2007, if memory serves. It wasn't very great but it has worked for a LONG time without any real maintenance work or time spent on it by me.

It'd be super cool to have things like customizable weightings and such, but that's a lot more work, and I've never actually seen an open implementation that was simple enough (i.e., 'boneheaded' enough) to actually continue working over time.

Minimal dependencies are REALLY important for this sort of thing, btw. If it needs a RDBMS or something, it will bitrot and people will be sad.
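To make the "minimal dependencies" point concrete, here is a sketch of an "RSS in, static file out" step that uses only the Python standard library: no database and no third-party packages. It handles plain RSS 2.0 `item` elements only and is an illustration, not how rawdog or any Planet implementation actually works; the function names are made up for the example.

```python
import html
import xml.etree.ElementTree as ET


def parse_rss(xml_text):
    """Extract (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        link = item.findtext("link", default="")
        items.append((title.strip(), link.strip()))
    return items


def render_page(items):
    """Render entries as a bare static HTML list, escaping all feed text."""
    rows = "\n".join(
        '<li><a href="{}">{}</a></li>'.format(
            html.escape(link, quote=True), html.escape(title)
        )
        for title, link in items
    )
    return "<!DOCTYPE html>\n<html><body><ul>\n" + rows + "\n</ul></body></html>\n"


SAMPLE = (
    '<rss version="2.0"><channel><title>Example</title>'
    "<item><title>Rcpp &amp; friends</title>"
    "<link>https://example.org/post1</link></item>"
    "</channel></rss>"
)
page = render_page(parse_rss(SAMPLE))
```

The output is a single HTML string that can be written to disk and served by any static host, so there is nothing to keep running between aggregation passes.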

--e


elijah avatar Jul 10 '15 16:07 elijah

I missed the unconfy thing - not being very well plugged into R these last few years - but am happy to talk to folks / play with things anyone wants to grind on.

--e


elijah avatar Jul 10 '15 16:07 elijah

I'm having a chance to hack on the idea I had of using travis-ci to noodle with the articles -- I have the freebie-implementation at travis-ci.org pulling the repo down and running jekyll tests on all of the bits that are in the _posts dir; I'm working now on having the build/test process actually do the pulling itself, so it can promote if all the content renders properly (and such) -- I got spurred into working on this when I ran the manual process to update posts (planet generate ...) and broke the build on github.io.
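One cheap pre-promotion check a CI job could run, sketched in Python: confirm that every file in the posts directory starts with a front-matter block delimited by `---` lines, which Jekyll requires before it will process a post. This is only a proxy for "renders properly" and is not the test setup described above; the function names are hypothetical.

```python
from pathlib import Path


def has_front_matter(text):
    """True if text starts with a '---' line and has a closing '---' line."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False
    return any(line.strip() == "---" for line in lines[1:])


def broken_posts(posts_dir):
    """Return names of files in posts_dir that lack a front-matter block."""
    return sorted(
        p.name
        for p in Path(posts_dir).iterdir()
        if p.is_file() and not has_front_matter(p.read_text(encoding="utf-8"))
    )


def safe_to_promote(posts_dir):
    """Promote only when no post in posts_dir fails the check."""
    return not broken_posts(posts_dir)
```

A CI job could call `safe_to_promote("_posts")` after pulling new content and skip the push to github.io whenever it returns `False`.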

elijah avatar Jul 10 '15 18:07 elijah

@jeroenooms Can't we just do this on reddit? As a subreddit, where we automatically submit aggregated content to? I am not a big reddit user, so I have no idea.....

gaborcsardi avatar Jul 12 '15 01:07 gaborcsardi

@abresler knows reddit well. Sounds like I need to know it.

timelyportfolio avatar Jul 12 '15 01:07 timelyportfolio

:-1:

The whole point is to not depend on someone/something else.

Remember: minimal implementation, self-contained, possibly on GitHub (or a cheap hosted site). See discussion above, notably posts by @elijah

eddelbuettel avatar Jul 12 '15 01:07 eddelbuettel

In that case I suggest a custom server, not Travis CI. I am not a big fan of Jekyll, either. :) A static site would not work anyway, if we want dynamic ranking.

gaborcsardi avatar Jul 12 '15 01:07 gaborcsardi