cranlogs

Distinguish NA and 0

Open hadley opened this issue 10 years ago • 5 comments

I have no idea how hard this would be, but it would be nice to distinguish "package was not on CRAN" (e.g. NA) from "package was on CRAN but was not downloaded" (e.g. 0).

hadley, May 07 '15 15:05

Yeah, I was thinking about this, but then did not do it, because it is somewhat hard.

Whether the package was on CRAN is recorded in my other DB about packages, in JSON. So it is not impossible; I just need to add another table to the cranlogs DB about package availability (essentially just the date and time of the first submission), and update it from crandb periodically.
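A minimal sketch of that idea in Python, not the actual cranlogs implementation: it assumes the availability table boils down to one first-release date per package, and the function name `daily_counts`, the sample dates, and the counts are all hypothetical. Days before the first release map to `None` (standing in for NA), and days on CRAN with no logged downloads map to 0.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional


def daily_counts(
    counts: Dict[date, int],
    first_release: date,
    start: date,
    end: date,
) -> List[Optional[int]]:
    """Return one value per day in [start, end].

    None  -> the package was not yet on CRAN (the "NA" case)
    0     -> the package was on CRAN but no download was logged
    n > 0 -> the logged download count
    """
    result: List[Optional[int]] = []
    day = start
    while day <= end:
        if day < first_release:
            result.append(None)
        else:
            result.append(counts.get(day, 0))
        day += timedelta(days=1)
    return result


# Hypothetical package first released on 2015-05-03, with downloads on two days.
logged = {date(2015, 5, 3): 12, date(2015, 5, 5): 7}
series = daily_counts(logged, date(2015, 5, 3), date(2015, 5, 1), date(2015, 5, 5))
print(series)  # [None, None, 12, 0, 7]
```

Handling temporarily archived packages would need a list of availability intervals rather than a single date; the sketch leaves that out.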

The thing is, I am running so many small services and updaters now that I need to create some check and notification system, so that I can be sure that everything is working properly. I am focusing on this (and on improving www.r-pkg.org) right now.

So I am a little reluctant to add more updater scripts before this dashboard is up.

But soonish.

gaborcsardi, May 07 '15 15:05

Yeah, and if you really want to be thorough, you'd also want NAs if the package was temporarily archived. This isn't a big deal for me, just a nice-to-have.

hadley, May 07 '15 15:05

As for temporarily archived packages, you can still download them from the archives, and downloads of old package versions are actually counted currently.

gaborcsardi, May 07 '15 15:05

Oh hmmm, I didn't think about that. In that case, it would also be nice to expose some information about which package versions are being downloaded.

hadley, May 07 '15 18:05

Yeah, first I would need to put it in the DB. :)

When I first started, I wanted to put everything in a DB, and have a rich API. But then it turned out that this would require a much bigger machine, in terms of disk, memory and CPU.

If the daily download log is about 15MB, then I need ~5GB per year in the DB. That is not huge, but it is not something for a tiny DigitalOcean instance, especially if the download numbers are growing fast.
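The arithmetic behind that estimate, as a rough Python sketch. It assumes the DB needs about as much space as the raw logs, ignoring indexes, compression, and growth; only the 15MB figure comes from the comment, and the names are hypothetical.

```python
DAILY_LOG_MB = 15     # approximate daily log size quoted above
DAYS_PER_YEAR = 365


def yearly_storage_gb(daily_mb: float, days: int = DAYS_PER_YEAR) -> float:
    """Raw yearly storage in GB, assuming a constant daily log size."""
    return daily_mb * days / 1024


yearly_gb = yearly_storage_gb(DAILY_LOG_MB)
print(f"{yearly_gb:.1f} GB per year")  # 5.3 GB per year
```

With growing download numbers the daily size would not stay constant, so this is closer to a lower bound than a forecast.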

Of course, something in the middle is also possible, I mean something between the current simple DB and including all the data.

But do downloads of old versions matter much, anyway? Given R's dependency handling, they should not happen very often.

gaborcsardi, May 07 '15 19:05