Add support for installing packages from authenticated repositories, with `keyring` support
We're expecting to add an authenticated repositories feature to Posit Package Manager soon. Because we're hoping to support both Python and R, we're limited by the fact that pip and friends only support HTTP Basic Auth. So we know there won't be e.g. complex OAuth flows involved.
Now, it's technically possible to install packages from repos that use basic authentication today, e.g. with
options(repos = c(CRAN = "https://username:password@ppm.internal/cran/latest"))
but I'm really not a fan of this approach because I think it will lead users to copy around and embed plaintext credentials in configuration files. We've also found that it can cause issues on Windows if the password is too long -- which it will be in the case of a JWT.
Instead, I think we should follow the example of tools like pip and uv here, too, by checking the system keyring for passwords automatically.
(renv also has a very flexible mechanism for configuring auth headers that could be wired up to the system keyring, too.)
For example, when pak is installing packages from a repository URL like https://username@ppm.internal/cran/latest, we could automatically check whether there is a corresponding "password" (or more likely, a token or API key of some kind) in the system keyring and use that to construct an Authorization header:
repo_auth <- function(repo_url) {
  # Without keyring there is nothing to look up.
  if (!requireNamespace("keyring", quietly = TRUE)) {
    return(NULL)
  }
  # Pull the username out of the repo URL.
  parsed <- httr2::url_parse(repo_url)
  username <- parsed$username
  if (is.null(username)) {
    return(NULL)
  }
  # Reconstitute the repo URL without the username.
  parsed$username <- NULL
  base_url <- httr2::url_build(parsed)
  tryCatch(
    {
      # The keyring "service" is the repo URL; any error (e.g. no stored key)
      # falls through to NULL, meaning no auth header.
      pwd <- keyring::key_get(base_url, username)
      auth <- paste(username, pwd, sep = ":")
      c("Authorization" = paste("Basic", openssl::base64_encode(auth)))
    },
    error = function(e) NULL
  )
}
The above should work with renv, too, via options(renv.download.headers = repo_auth).
To help users out, we could also have a utility function to set this "password". I'm imagining an API something like the following:
pak::repo_set_auth("https://username@ppm.internal/cran/latest")
This would call the equivalent of
keyring::key_set(
  "https://ppm.internal/cran/latest",
  "username",
  prompt = "Password, Token, or API Key: "
)
under the hood.
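To make the proposed behaviour concrete, here is a minimal sketch of what such a helper could look like. The names `split_repo_url()` and `repo_set_auth_sketch()` are hypothetical and not part of pak; the URL splitting is a simplified base R regex that only handles a bare username (no embedded password), and the only external call is `keyring::key_set()`.

```r
# Hypothetical helper: split "https://user@host/path" into the bare URL and
# the username. URLs without a username are returned unchanged.
split_repo_url <- function(repo_url) {
  pattern <- "^([A-Za-z][A-Za-z0-9+.-]*://)([^/@:]+)@(.*)$"
  m <- regmatches(repo_url, regexec(pattern, repo_url))[[1]]
  if (length(m) == 0) {
    return(list(url = repo_url, username = NULL))
  }
  list(url = paste0(m[2], m[4]), username = m[3])
}

# Hypothetical sketch of the proposed pak::repo_set_auth().
repo_set_auth_sketch <- function(repo_url) {
  parts <- split_repo_url(repo_url)
  if (is.null(parts$username)) {
    stop("No username found in repository URL")
  }
  if (!requireNamespace("keyring", quietly = TRUE)) {
    stop("The keyring package is required to store credentials")
  }
  # Prompts interactively and stores the secret under (service, username).
  keyring::key_set(
    parts$url,
    parts$username,
    prompt = "Password, Token, or API Key: "
  )
  invisible(parts)
}
```

The service name deliberately matches what the lookup side would use (the repo URL without the username), so that a key stored here is the one found at install time.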
A more advanced implementation might also prompt the user for credentials when a repo returns an HTTP 401 in an interactive session, and offer to save them in the system keyring. (This is impossible with install.packages() because that function swallows 401 responses.)
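A rough sketch of that retry-on-401 flow, with the HTTP layer and the prompt injected as functions so the control flow is visible on its own. Everything here is hypothetical: `fetch` stands in for whatever performs the request and returns a list with a `status` element, and `ask` stands in for the code that prompts the user (and could offer to save to the keyring) and returns a named header vector, or NULL if the user declines.

```r
fetch_with_auth_prompt <- function(url, fetch, ask,
                                   can_prompt = interactive()) {
  # First attempt without credentials.
  resp <- fetch(url, headers = character())
  # Only prompt on 401, and never in non-interactive sessions.
  if (resp$status != 401 || !can_prompt) {
    return(resp)
  }
  headers <- ask(url)
  if (is.null(headers)) {
    return(resp)
  }
  # Retry once with the headers the user supplied.
  fetch(url, headers = headers)
}
```

Retrying only once keeps a wrong password from turning into a prompt loop; a real implementation would probably want to report the second failure clearly.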
Note: one could test basic auth support in pak by running a local NGINX with basic auth enabled proxying to https://p3m.dev.
I'm happy to help out with this, but I need some pointers on where the relevant changes would need to be made.
My thoughts about this, initially a brain dump, to be edited and extended, here or elsewhere.
Must have
- We need a way to store credentials that works with renv, Python, etc. as well, preferably OOTB.
- Credentials must be specific to host names and/or URLs. (We can probably do the same as git here.)
- We need to handle credentials in non-interactive sessions, preferably better than pip does. We could run keyring in a subprocess, with a timeout, or (much better) we could introduce a timeout into the keyring package, if possible. If we use a subprocess, then we'll probably need to cache credentials in env vars.
- We need to keep pak self-contained, i.e. embed the keyring package or the oskeyring package into pak, probably. Also, preferably have a static pak build on Linux that is able to use the secret service API via dbus.
- We need a config option to add arbitrary headers to pak HTTP queries.
- Caching. We probably need to cache credentials in (host or URL specific) env vars, like the gitcreds package does.
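For the caching item, a host-specific env var cache could look roughly like the sketch below. The `PKG_AUTH_<HOST>` naming scheme and the function names are assumptions made up for illustration, only loosely in the spirit of what gitcreds does; they are not an existing pak or pkgcache API.

```r
# Hypothetical naming scheme: PKG_AUTH_<HOST>, with every non-alphanumeric
# character of the host replaced by an underscore.
auth_cache_envvar <- function(url) {
  # Drop the scheme and any "user@" / "user:pw@" part, keep the host only.
  host <- sub("^[A-Za-z][A-Za-z0-9+.-]*://([^/@]*@)?([^/:]+).*$", "\\2", url)
  paste0("PKG_AUTH_", toupper(gsub("[^A-Za-z0-9]", "_", host)))
}

auth_cache_set <- function(url, value) {
  args <- list(value)
  names(args) <- auth_cache_envvar(url)
  do.call(Sys.setenv, args)
  invisible(TRUE)
}

# Returns NULL when nothing is cached for this host.
auth_cache_get <- function(url) {
  value <- Sys.getenv(auth_cache_envvar(url), unset = NA_character_)
  if (is.na(value)) NULL else value
}
```

Because child processes inherit the environment, a value cached this way in the main session would also be visible to a subprocess started afterwards, which is the reason env vars are attractive here.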
Nice to have
- Better credential storage on server Linux, where there is effectively no system credential store.
- Pluggable auth.
- netrc file support, like pip.
- We could probably use the same credential format that the git credential helpers use.
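As an illustration of the netrc item, here is a deliberately simplified lookup in base R. The function name `netrc_lookup()` is hypothetical, and the parser is a sketch: it only understands `machine`, `login` and `password` tokens, ignores `default` and `macdef` entries, and would be confused by a secret that happens to equal one of those keywords.

```r
netrc_lookup <- function(path, host) {
  # netrc is whitespace-separated tokens, so line breaks do not matter.
  tokens <- scan(path, what = character(), quote = "", quiet = TRUE)
  starts <- which(tokens == "machine")
  for (i in seq_along(starts)) {
    from <- starts[i]
    to <- if (i < length(starts)) starts[i + 1] - 1 else length(tokens)
    entry <- tokens[from:to]
    if (length(entry) < 2 || entry[2] != host) next
    # Value is the token that follows the key, if there is one.
    get_field <- function(key) {
      pos <- match(key, entry)
      if (is.na(pos) || pos == length(entry)) NULL else entry[pos + 1]
    }
    return(list(login = get_field("login"), password = get_field("password")))
  }
  NULL
}
```

Usage would be along the lines of `netrc_lookup("~/.netrc", "ppm.internal")`, returning NULL when there is no entry for the host.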
UI for users
I am not a big fan of the renv UI where the user needs to set an option to a function for a couple of reasons:
- You need to edit your profile(s) to set options, and the profile does not run in --vanilla sessions, so there is no auth there.
- The function might refer to packages (e.g. keyring), but that's problematic if keyring is the package that pak/renv is installing, especially on Windows.
- It does not work well with pak doing things in a subprocess, because pak would need to copy the function to another process, and that's always error-prone.
I like the idea of having utility functions to get/set credentials:
pak::repo_auth("https://ppm.internal/cran/latest")
pak::repo_set_auth("https://ppm.internal/cran/latest")
I would not store the username in the repo URL, or at least that should not be required. If the username is not there, then the admin can configure repo URLs for all users more easily. Again, we could follow what git does here when it looks up credentials (including usernames) from the credential store.
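To show what looking up the username together with the secret could mean, here is a sketch of one possible rule: pick the stored entry whose URL is the longest prefix of the repo URL. This is an assumption for illustration, not a description of git's actual matching algorithm, and the `store` data frame stands in for whatever the real credential store would be.

```r
# `store` is a hypothetical data frame with columns url, username, password.
# Stored URLs should end in "/" so that "https://ppm.internal/" cannot match
# a different host such as "https://ppm.internal.example.com/".
find_credential <- function(repo_url, store) {
  hit <- which(startsWith(repo_url, store$url))
  if (length(hit) == 0) {
    return(NULL)
  }
  # The most specific (longest) matching prefix wins.
  best <- hit[which.max(nchar(store$url[hit]))]
  as.list(store[best, c("username", "password")])
}
```

With a rule like this, an admin could set a plain repo URL for everyone, and each user's own store would supply both the username and the secret.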
Issues with the keyring and oskeyring packages
We need to solve these at some point; some seem urgent, some not.
- keyring has a lot of dependencies that we would need to get rid of before embedding it into pak. Alternatively, embedding a simplified version should also be possible.
- oskeyring has no dependencies, but it is also not pluggable.
- oskeyring does not support the secret service API on Linux, AFAIR. Could be easily added, though.
- On macOS keyring and oskeyring both use a deprecated macOS API. (https://github.com/r-lib/keyring/issues/160)
Implementation
A significant part of this should go into the pkgcache package. pkgcache handles all HTTP for downloading metadata and packages. So I think the first step would be to implement everything in pkgcache, and then solve the issues with embedding the new pkgcache into pak.
The HTTP client functions in https://github.com/r-lib/pkgcache/blob/main/R/async-http.R have a headers argument; that is where the additional headers need to be passed in.
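Before the headers reach that argument, the auth header (and any user-configured extra headers) would need to be combined with whatever headers the request already carries. A minimal sketch of such a merge step follows; `merge_headers()` is a hypothetical helper, not an existing pkgcache function, and it assumes headers are named character vectors.

```r
# Combine existing request headers with extra ones (e.g. Authorization).
# Header names are compared case-insensitively; `extra` wins on conflict.
merge_headers <- function(headers, extra) {
  if (length(extra) == 0) {
    return(headers)
  }
  keep <- !(tolower(names(headers)) %in% tolower(names(extra)))
  c(headers[keep], extra)
}
```

The result would then be passed as the headers argument of the HTTP client functions. Letting the extra headers override existing ones is a design choice, so that a user-supplied Authorization header is never silently ignored.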