
Implement the archive

Open andrie opened this issue 5 years ago • 7 comments

If a new version of a package gets added to the repo, move the old version to the archive

andrie avatar Jun 29 '19 14:06 andrie

Based on CRAN, the directory structure of the archive should look like /src/contrib/Archive/pkgName/pkgName_x.y.z.tar.gz. So each package gets its own subdirectory in Archive, and each package subdirectory holds the source tarballs of previous versions.
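To make the layout concrete, here is a minimal sketch of how the path of one archived tarball could be built. `archive_path` is a hypothetical helper used only for illustration; it is not a miniCRAN function.

```r
# Hypothetical helper (not part of miniCRAN): build the CRAN-style archive
# path for one source tarball of `pkg` at `version` inside `repo`.
archive_path <- function(repo, pkg, version) {
  file.path(repo, "src", "contrib", "Archive", pkg,
            paste0(pkg, "_", version, ".tar.gz"))
}

archive_path("/path/to/repo", "pkgName", "1.2.3")
```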

Implementation considerations:

  1. When adding a package to the miniCRAN repo, should any/all old versions be added?
  2. addOldPackage currently puts an old version of a downloaded package where the "current" version lives in the miniCRAN repo. Once an archive is implemented, this doesn't seem like the correct behaviour.

There are probably other points that need to be resolved before going ahead with implementation.
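Leaving both questions open, the core operation from the issue description (move superseded tarballs into the archive) might look roughly like the sketch below. `archive_old_versions` is a hypothetical name, not an existing miniCRAN function, and the sketch assumes source tarballs sit directly in src/contrib. It does not touch the PACKAGES index files, which would still need to be regenerated separately.

```r
# Hypothetical sketch, not miniCRAN API: move superseded source tarballs of
# `pkg` out of src/contrib and into src/contrib/Archive/<pkg>/.
archive_old_versions <- function(repo, pkg, current_version) {
  contrib <- file.path(repo, "src", "contrib")

  # Package names cannot contain "_", so the "<pkg>_" prefix is unambiguous
  tarballs <- list.files(contrib, pattern = "\\.tar\\.gz$")
  tarballs <- tarballs[startsWith(tarballs, paste0(pkg, "_"))]

  keep <- paste0(pkg, "_", current_version, ".tar.gz")
  old <- setdiff(tarballs, keep)
  if (length(old) == 0L) return(invisible(character(0)))

  dest <- file.path(contrib, "Archive", pkg)
  dir.create(dest, showWarnings = FALSE, recursive = TRUE)
  file.rename(file.path(contrib, old), file.path(dest, old))

  # Return the new locations of the archived tarballs
  invisible(file.path(dest, old))
}
```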

achubaty avatar Dec 17 '19 16:12 achubaty

One point worth noting is that it all boils down to how devtools actually looks for archived information. I checked the source code and apparently it's not only about creating an Archive/package/package_version.tar.gz. There's also a directory "Meta" that contains the following files:

aliases.rds
archive.rds
current.rds
rdxrefs.rds

I am using a very old version of devtools to be fair, but the problem seems to be here

https://rdrr.io/cran/remotes/src/R/install-version.R

package_find_repo <- function(package, repos) {
  for (repo in repos) {
    if (length(repos) > 1)
      message("Trying ", repo)

    archive <-
      tryCatch({
        con <- gzcon(url(sprintf("%s/src/contrib/Meta/archive.rds", repo), "rb"))
        on.exit(close(con))
        readRDS(con)
      },
      warning = function(e) list(),
      error = function(e) list())

    info <- archive[[package]]
    if (!is.null(info)) {
      info$repo <- repo
      return(info)
    }
  }

  stop(sprintf("couldn't find package '%s'", package))
}

So what devtools does (at least the version I currently have, 1.13.6, which is ancient) is look for archive.rds and use it as a source of metainfo. Then the calling routine (devtools::install_version) does the Archive/ dance to retrieve the package.
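For illustration, the same two steps can be mimicked against a local repository directory instead of a URL. This is only a sketch: `find_archived` is a hypothetical name, and the assumed shape of archive.rds (a named list with one data frame per package, whose row names look like "pkg/pkg_1.0.tar.gz") is an assumption rather than something shown in the code above.

```r
# Sketch only: look a package version up in a local Meta/archive.rds and
# return the path of the matching tarball under Archive/.
# Assumption: archive.rds is a named list of data frames, one per package,
# with row names of the form "pkg/pkg_1.0.tar.gz".
find_archived <- function(repo_dir, package, version) {
  contrib <- file.path(repo_dir, "src", "contrib")
  archive <- readRDS(file.path(contrib, "Meta", "archive.rds"))

  info <- archive[[package]]
  if (is.null(info)) {
    stop(sprintf("couldn't find package '%s'", package))
  }

  wanted <- sprintf("%s/%s_%s.tar.gz", package, package, version)
  if (!wanted %in% row.names(info)) {
    stop(sprintf("version '%s' of '%s' is not archived", version, package))
  }

  # The "Archive/ dance": the row name doubles as the path below Archive/
  file.path(contrib, "Archive", wanted)
}
```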

stefanoborini avatar Dec 17 '19 16:12 stefanoborini

Thanks for that @stefanoborini. I'll also add that the CRAN servers appear to be using Archive as a symlink(?) to 00Archive:

image

achubaty avatar Dec 17 '19 16:12 achubaty

I guess it's just a trick to have it as first entry to ease finding it.

stefanoborini avatar Dec 17 '19 16:12 stefanoborini

It makes sense to grab the metadata file to avoid recursing through the contents of the directories to find packages. So that will be an additional detail to consider.

Yes, 00Archive is likely done to make it the first directory in /src/contrib/ so humans can find it easily.

achubaty avatar Dec 17 '19 16:12 achubaty

Note that the current version of devtools has moved the above code to the package remotes. The code is unchanged.

stefanoborini avatar Dec 17 '19 16:12 stefanoborini

This is what we do after running miniCRAN::addLocalPackage(basename(pkg), "..", "/var/www/minicran"):

contrib <- "/var/www/minicran/src/contrib"
archive_dir <- file.path(contrib, "Archive")

# Tarballs not listed in PACKAGES.rds are superseded versions
pkgs <- readRDS(file.path(contrib, "PACKAGES.rds"))
current <- paste0(pkgs[, "Package"], "_", pkgs[, "Version"], ".tar.gz")
files <- dir(path = contrib, pattern = "\\.tar\\.gz$")
archives <- files[!files %in% current]

# Meta/ must sit next to Archive/ (src/contrib/Meta), not inside it:
# that is where archive.rds is saved below and where remotes looks for it
dir.create(archive_dir, showWarnings = FALSE, recursive = TRUE)
dir.create(file.path(contrib, "Meta"), showWarnings = FALSE)

# Move each old tarball into Archive/<pkg>/ so renv can restore older versions
for (f in archives) {
  pkg_dir <- file.path(archive_dir, sub("_.*$", "", f))
  dir.create(pkg_dir, showWarnings = FALSE)
  file.rename(file.path(contrib, f), file.path(pkg_dir, f))
}

# Create or update archive.rds; row names are paths relative to Archive/
wd <- setwd(archive_dir)
archive <- lapply(setNames(nm = list.files(".")), function(x) {
  file.info(list.files(x, recursive = TRUE, full.names = TRUE))
})
saveRDS(archive, "../Meta/archive.rds")
setwd(wd)

In case it helps someone dealing with an automated process using minicran.

meztez avatar Dec 15 '22 19:12 meztez