js-metacpan-org

Integrate CPAN testers report through their JSON interface?

Open ranguard opened this issue 14 years ago • 14 comments

http://www.cpantesters.org/distro/D/Data-Pageset.json for example
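A rough sketch of what consuming a feed like this could look like, in Python. It assumes the JSON is a flat list of report objects with `version` and `status` fields; that layout is an assumption for illustration and should be checked against the real feed. The `summarize_reports` helper and the sample data are hypothetical, and the network fetch is left out so the example runs offline.

```python
import json
from collections import Counter


def summarize_reports(raw_json):
    """Tally report grades (PASS/FAIL/NA/...) per distribution version.

    Assumes each report is an object with "version" and "status" keys;
    this layout is an assumption, not a documented format.
    """
    reports = json.loads(raw_json)
    summary = {}
    for report in reports:
        version = str(report.get("version", "unknown"))
        # Normalise case so "pass" and "PASS" land in the same bucket.
        status = str(report.get("status", "UNKNOWN")).upper()
        summary.setdefault(version, Counter())[status] += 1
    return {version: dict(counts) for version, counts in summary.items()}


# Hypothetical sample standing in for the downloaded feed.
SAMPLE_FEED = json.dumps([
    {"version": "1.05", "status": "PASS"},
    {"version": "1.05", "status": "FAIL"},
    {"version": "1.06", "status": "PASS"},
    {"version": "1.06", "status": "pass"},
    {"version": "1.06", "status": "NA"},
])

if __name__ == "__main__":
    print(summarize_reports(SAMPLE_FEED))
```

Keeping the parsing separate from the download means the same function could be pointed at a cached copy of the feed.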

ranguard avatar Mar 30 '11 17:03 ranguard

http://search.metacpan.org/#/showpod/CPAN::Testers::Reports::Query::JSON

oalders avatar Mar 30 '11 20:03 oalders

Thinking about this more - it would be nicer to have this via the API - but maybe that's what you meant :)

ranguard avatar Mar 31 '11 17:03 ranguard

Yes, that would actually be really helpful. There is this cryptic issue already open, but there's no real plan around it:

https://github.com/CPAN-API/cpan-api/issues/#issue/38

Getting all of the tester data in there might be a massive job and there's the question of how often to update, but I think it's very much worth looking at -- even if it's only summary data in the API. MetaCPAN could be one API to rule them all. Would be nice if you could get X different kinds of data without having to learn X different APIs, feeds etc. :)

oalders avatar Mar 31 '11 21:03 oalders

It depends what summary information you want, but I already store a summary in the DB to make loading the web pages a little faster. The only downside is that it isn't necessarily up to date, as the builder is currently 36 hours behind the oldest outstanding report submission. I could make this available via a dedicated request though.

barbie avatar Apr 01 '11 13:04 barbie

I can certainly live with a 36-hour lag, and a dedicated request would be splendid. :) Would be great to get this information into the index. I can see that it would be very useful to a lot of people.

oalders avatar Apr 01 '11 15:04 oalders

The summary is stored as JSON, so it should be simple enough to query an author or a distro and get the block of JSON returned quite quickly. I'll sort a prototype out next week for you. If the summary needs anything additional for you, I can look at sorting that out too.

barbie avatar Apr 03 '11 09:04 barbie

That sounds very good. Looking forward to it!

oalders avatar Apr 04 '11 18:04 oalders

There used to be an IRC channel where each test result was propagated. With CPAN Testers 2.0 this has been removed due to too much traffic, I guess. This kind of stream would have been ideal for our purposes. We could have joined the IRC channel with a bot and updated the test count (failing/pass/na) in real time.

Are there any plans to add something like this to the current CPAN Testers API? E.g. long-polling HTTP requests or something. I'd be happy to help with that!

monken avatar Apr 08 '11 10:04 monken

There are about 5,000 reports a day usually, so a stream would be difficult to manage. There is a tail log of the submissions, but it is probably more appropriate to look at the specific distros and authors that you are interested in.

Sorry, haven't finished the summary API yet. My Vodafone dongle broke this week, so I can't work on the server while on the bus at the moment :( Will try and get something up and running tomorrow while watching the Grand Prix ;)

barbie avatar Apr 09 '11 11:04 barbie

Hi!

I was planning to use the test results for the distribution ranking algorithm (i.e. rank releases with bad results lower). For this to work properly I need all results in the ElasticSearch instance. I'm not sure where search.cpan.org gets its data from, but it seems to be pretty up to date (a few hours of delay).

Real-time updates would be perfect, but I do see the technical challenge on the cpan testers side.

monken avatar Apr 11 '11 20:04 monken

search.cpan.org gets them from the cpanstats SQLite database available from the development site: http://devel.cpantesters.org. The DB is updated every 6 hours, and typically search.cpan.org takes a copy once a day.
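For a sense of what working from a local copy of that SQLite database might involve, here is a minimal sketch. It assumes a `cpanstats` table with `dist`, `version` and `state` columns; the real schema may differ and should be inspected first. The in-memory database and its rows are hypothetical stand-ins so the example runs without the download.

```python
import sqlite3


def grade_counts(conn, dist):
    """Return {version: {state: count}} for one distribution.

    Assumes a table cpanstats(dist, version, state); verify the column
    names against the downloaded database before relying on this.
    """
    rows = conn.execute(
        "SELECT version, state, COUNT(*) FROM cpanstats "
        "WHERE dist = ? GROUP BY version, state",
        (dist,),
    ).fetchall()
    counts = {}
    for version, state, n in rows:
        counts.setdefault(version, {})[state] = n
    return counts


def demo_connection():
    """Build a tiny in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cpanstats (dist TEXT, version TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO cpanstats VALUES (?, ?, ?)",
        [
            ("Data-Pageset", "1.05", "pass"),
            ("Data-Pageset", "1.05", "fail"),
            ("Data-Pageset", "1.06", "pass"),
            ("Data-Pageset", "1.06", "pass"),
            ("Other-Dist", "0.01", "na"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(grade_counts(demo_connection(), "Data-Pageset"))
```

Against the real file, `sqlite3.connect` would take the path of the downloaded copy instead of `":memory:"`.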

barbie avatar Apr 11 '11 21:04 barbie

Cool thanks! That should be enough for our purposes!

monken avatar Apr 12 '11 14:04 monken

Barbie, does the SQLite database get us to the same place? If the JSON feed has more interesting info, I don't want to exclude that possibility. Having said that, I also don't want to create extra work for you. :) What do you think?

oalders avatar Apr 12 '11 15:04 oalders

Yes, although it means slightly more work on your side. I have the JSON summary for the author working, but the distro and totals (as requested by Ranguard) summaries didn't contain enough info, so I'm having to rework the Generator to establish and update that info first. The JSON will still be useful for snapshots, but the SQLite route is probably better if you're expecting to process hundreds of queries a minute.

barbie avatar Apr 12 '11 15:04 barbie