miniCRAN
Implement the archive
If a new version of a package gets added to the repo, move the old version to the archive.
Based on CRAN, the directory structure of the archive should look like /src/contrib/Archive/pkgName/pkgName_x.y.z.tar.gz. So each package gets its own subdirectory in Archive, and each package subdirectory holds the source tarballs of previous versions.
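The layout described above can be sketched in a few lines of R. This is only an illustration under assumptions: `archive_tarball_path()` and `move_to_archive()` are hypothetical helpers, not functions that miniCRAN provides, and the demo repo lives in a temporary directory.

```r
# Hypothetical helper (not part of miniCRAN): path of an archived tarball,
# following the CRAN layout src/contrib/Archive/<pkg>/<pkg>_<version>.tar.gz.
archive_tarball_path <- function(repo, pkg, version) {
  file.path(repo, "src", "contrib", "Archive", pkg,
            paste0(pkg, "_", version, ".tar.gz"))
}

# Hypothetical helper: move a superseded source tarball out of src/contrib
# into the package's own Archive subdirectory.
move_to_archive <- function(repo, pkg, version) {
  from <- file.path(repo, "src", "contrib", paste0(pkg, "_", version, ".tar.gz"))
  to <- archive_tarball_path(repo, pkg, version)
  dir.create(dirname(to), showWarnings = FALSE, recursive = TRUE)
  file.rename(from, to)
}

# Usage on a throwaway repo in a temporary directory
repo <- file.path(tempdir(), "minicran-demo")
dir.create(file.path(repo, "src", "contrib"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(repo, "src", "contrib", "foo_1.0.0.tar.gz"))
moved <- move_to_archive(repo, "foo", "1.0.0")
```

After moving tarballs like this, the PACKAGES index in src/contrib would presumably need to be regenerated (for example with tools::write_PACKAGES) so that it only lists the current versions.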
Implementation considerations:
- When adding a package to the miniCRAN repo, should any/all old versions be added?
- addOldPackage currently puts an old version of a downloaded package into the "current" version slot in the miniCRAN repo. Once an archive is implemented, this doesn't seem like the correct behaviour.
There are probably other points that need to be resolved before going ahead with implementation.
One point worth noting is that it all boils down to how devtools actually looks for archived information. I checked the source code and apparently it's not only about creating an Archive/package/package_version.tar.gz. There's also a directory "Meta" that contains the following files:
aliases.rds
archive.rds
current.rds
rdxrefs.rds
I am using a very old version of devtools to be fair, but the problem seems to be here
https://rdrr.io/cran/remotes/src/R/install-version.R
package_find_repo <- function(package, repos) {
  for (repo in repos) {
    if (length(repos) > 1)
      message("Trying ", repo)

    archive <-
      tryCatch({
        con <- gzcon(url(sprintf("%s/src/contrib/Meta/archive.rds", repo), "rb"))
        on.exit(close(con))
        readRDS(con)
      },
      warning = function(e) list(),
      error = function(e) list())

    info <- archive[[package]]
    if (!is.null(info)) {
      info$repo <- repo
      return(info)
    }
  }

  stop(sprintf("couldn't find package '%s'", package))
}
So what devtools is doing (at least the version I currently have, 1.13.6, which is ancient) is to look for archive.rds and use it as a source of metainfo. Then, the calling routine (devtools::install_version) does the Archive/ dance to retrieve the package.
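To make the second half of that lookup concrete, here is a rough sketch of how an entry from archive.rds could be turned into a download URL. It is a simplified illustration, not the devtools/remotes code: `archive_url()` is a hypothetical helper, the example repo URL is made up, and it assumes each archive.rds entry is a file.info() data frame whose row names look like "pkg/pkg_1.0.0.tar.gz".

```r
# Hypothetical helper: a simplified take on the lookup described above,
# not the code from devtools/remotes.
# `info` is assumed to be one element of archive.rds: a file.info() data
# frame whose row names look like "pkg/pkg_1.0.0.tar.gz".
archive_url <- function(info, repo, package, version = NULL) {
  paths <- rownames(info)
  if (is.null(version)) {
    # Assumption: the last row is the most recent archived version
    path <- paths[length(paths)]
  } else {
    path <- paste0(package, "/", package, "_", version, ".tar.gz")
    if (!path %in% paths) {
      stop(sprintf("version '%s' of '%s' is not in the archive", version, package))
    }
  }
  paste0(repo, "/src/contrib/Archive/", path)
}

# Fake archive entry, for illustration only
info <- data.frame(size = c(1000, 1200),
                   row.names = c("foo/foo_1.0.0.tar.gz", "foo/foo_1.1.0.tar.gz"))
url <- archive_url(info, "https://example.org/minicran", "foo", "1.0.0")
```

The point is that both pieces are needed: the metadata in Meta/archive.rds to discover which versions exist, and the Archive/pkg/ layout to actually serve the tarball.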
Thanks for that @stefanoborini. I'll also add that the CRAN servers appear to be using Archive as a symlink(?) to 00Archive.
I guess it's just a trick to have it as first entry to ease finding it.
It makes sense to grab the metadata file to avoid recursing the contents of the directories to find packages. So that will be an additional detail to consider.
Yes, 00Archive is likely done to make it the first directory in /src/contrib/ so humans can find it easily.
Note that the current version of devtools has moved the above code to the package remotes. The code is unchanged.
This is something we do after running miniCRAN::addLocalPackage(basename(pkg), "..", "/var/www/minicran"):
contrib <- "/var/www/minicran/src/contrib"
archive_dir <- file.path(contrib, "Archive")
# Meta sits next to Archive (src/contrib/Meta); that is where archive.rds is read from
dir.create(archive_dir, showWarnings = FALSE, recursive = TRUE)
dir.create(file.path(contrib, "Meta"), showWarnings = FALSE)
# Tarballs that are not listed in PACKAGES.rds are superseded versions
pkgs <- readRDS(file.path(contrib, "PACKAGES.rds"))
pkgs <- paste0(pkgs[, 1], "_", pkgs[, 2], ".tar.gz")
files <- dir(path = contrib, pattern = "\\.tar\\.gz$")
archives <- files[!files %in% pkgs]
file.rename(file.path(contrib, archives), file.path(archive_dir, archives))
# Move archives in individual folders so renv can restore older versions
f <- list.files(archive_dir, pattern = "\\.tar\\.gz$")
pkg_names <- sort(unique(vapply(strsplit(f, "_"), `[`, character(1), 1)))
for (d in pkg_names) {
  dir.create(file.path(archive_dir, d), showWarnings = FALSE)
  files <- f[startsWith(f, paste0(d, "_"))]
  file.rename(file.path(archive_dir, files), file.path(archive_dir, d, files))
}
# Create or update archive.rds: one file.info() data frame per package,
# with row names like "pkg/pkg_x.y.z.tar.gz"
wd <- getwd()
setwd(archive_dir)
archive <- lapply(setNames(nm = list.files(".")), function(x) {
  file.info(list.files(x, recursive = TRUE, full.names = TRUE))
})
saveRDS(archive, "../Meta/archive.rds")
setwd(wd)
In case it helps someone dealing with an automated process using minicran.
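For an automated process it may also be worth checking that the generated index and the files on disk agree. The following is a minimal sketch, assuming archive.rds is a named list of data frames whose row names are paths relative to Archive; `missing_from_archive()` is a hypothetical helper, and the demo uses a temporary directory rather than the real repo.

```r
# Hypothetical helper: list archive.rds entries whose tarball is missing on disk.
missing_from_archive <- function(contrib) {
  archive <- readRDS(file.path(contrib, "Meta", "archive.rds"))
  listed <- as.character(unlist(lapply(archive, rownames), use.names = FALSE))
  listed[!file.exists(file.path(contrib, "Archive", listed))]
}

# Demo on a throwaway repo: one tarball exists, one is only in the index
contrib <- file.path(tempdir(), "check-demo", "src", "contrib")
dir.create(file.path(contrib, "Archive", "foo"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(contrib, "Meta"), showWarnings = FALSE)
file.create(file.path(contrib, "Archive", "foo", "foo_1.0.0.tar.gz"))
index <- list(foo = data.frame(
  size = c(0, 0),
  row.names = c("foo/foo_1.0.0.tar.gz", "foo/foo_0.9.0.tar.gz")
))
saveRDS(index, file.path(contrib, "Meta", "archive.rds"))
stale <- missing_from_archive(contrib)
```

An empty result would suggest the index is consistent with the Archive directory; anything returned points at an entry that clients would fail to download.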