cargo icon indicating copy to clipboard operation
cargo copied to clipboard

Implement RFC 3139: alternative registry authentication support

Open arlosi opened this issue 2 years ago • 24 comments

Allows registries to request Cargo to send the authentication token for all requests, rather than just publish/yank, implementing RFC 3139.

Items from the tracking issue

Do registries need a more fine-grained switch for which API commands require authentication?

This PR uses the auth_required boolean as described in the RFC.

The RFC mentions adding --token to additional commands like install and search

These flags are not added by this PR.

Consider changing the name and form of the X- header

Changed to the www-authenticate header as suggested by the comments.

Will there be any concerns with the interaction with https://github.com/rust-lang/rfcs/pull/3231

Not that I know of.


Adds a new field "auth-required": true to config.json that indicates Cargo should include the token in all requests to a registry.

For HTTP registries, Cargo first attempts an un-authenticated request, then if that fails with HTTP 401, an authenticated request is attempted. The registry server may include a www-authenticate header with the HTTP 401 to instruct Cargo with URL the user can visit to acquire a token (crates.io/me).

Since the API URL is not known (because it's stored in the index), the unstable credential provider feature is modified to key off the index url, and the registry name is no longer provided.

To handle the case where an alternative registry's name is not known (such as coming from a lock file, or via --index), Cargo can now look up the token in the configuration by matching on the index URL. This introduces a new error if two alternative registries are configured with the same index URL.

Several operations, such as cargo install could have had a --token argument added, however it appears that Cargo would like to move away from passing the token on the command line for security reasons. In this case, users would need to configure the registry via the config file (or environment variables) when using cargo install --index ... or similar.

arlosi avatar Apr 22 '22 17:04 arlosi

r? @ehuss

(rust-highfive has picked a reviewer for you, use r? to override)

rust-highfive avatar Apr 22 '22 17:04 rust-highfive

@Eh2406 I updated this PR to address your comments. I've also cleaned up the test framework to combine the two test http servers (index and api) into one, and added additional tests.

arlosi avatar May 11 '22 20:05 arlosi

This is looking good.

Some points for discussion at the next available Cargo meeting:

  1. Since the API URL is not known (because it's stored in the index), the unstable credential provider feature is modified to key off the index url, and the registry name is no longer provided.
  2. It may be better to just remove CARGO_REGISTRY_NAME_OPT
  3. This introduces a new error if two alternative registries are configured with the same index URL.
  4. With this change, cargo always reads the credentials file when it loads the other configuration.
  5. New dep on https://github.com/scottlamb/http-auth with default=false

Thanks for all the hard work!

Eh2406 avatar May 16 '22 22:05 Eh2406

We were able to discuss this PR briefly at yesterday's meeting. I am trying to recall what we were able to discuss and what we weren't, and try and assign people to follow up.

We did not particularly talk about:

Since the API URL is not known (because it's stored in the index), the unstable credential provider feature is modified to key off the index url, and the registry name is no longer provided.

It may be better to just remove CARGO_REGISTRY_NAME_OPT

Some more context: This is an environment variable that is provided to a credential provider when a registry name is known. Given that it's never technically necessary, and not consistently available, should we just remove it?

This introduces a new error if two alternative registries are configured with the same index URL.

Some more context: The RFC authorized us to make this an error. This PR implements the error eagerly. This has the advantage that the moment you corrupt your configuration you will get an error message. And the implementation is pretty straightforward, load all configuration and look for conflicts. It has the disadvantage that ms-configuring registry foo will prevent you from building a project that depends only on crates.io.

@rust-lang/cargo any thoughts on the above questions?

We were able to briefly discuss:

With this change, cargo always reads the credentials file when it loads the other configuration.

Some more context: This is a consequence of the eager evaluation above. We need to load all registry configurations in order to figure out if any of them conflict, to fully load a registry we need to look at credentials to see if there's a token associated with it.

In the meeting: We would like to not read credentials if we don't have to, but are not 100% committed to it. we were wondering what the alternative designs would be here, and what their disadvantages would be? Could we eagerly load config's but lazily load credentials? @arlosi what are your thoughts?

New dep on https://github.com/scottlamb/http-auth with default=false

Some more context: Cargo needs some way for a server to tell it how a user can get a token. It would be nice to follow the www-authenticate standard, but it is a bit of a mess. The PR originally had 200 lines of parsing code, but I was not confident in this being completely correct. We found a dependency that does this parsing and switch to using that.

In the meeting: We would like to do a little more research on alternatives to this dependency. One alternative is to parse ourselves using one of the libraries we already use. (@epage you got the most experience with parsing, does this make any sense?) Alternatively we would like to shop around to see how this crate compares to other crates for this purpose. Specifically, What crates are there? And for each one:

  • How actively maintained as it / does the code look up to snuff?
  • Is it widely used? Reverse dependencies / download counts.
  • Is the API plausible for our use?
  • Is the license reasonable for our use?
  • Are the dependencies reasonable?

We may decide that the one I picked originally is correct, but someone should document the due diligence. @arlosi what are your thoughts, would you be up to collecting this data? (If not I can redo the work, and this time document it.)

Eh2406 avatar May 18 '22 19:05 Eh2406

Could we eagerly load config's but lazily load credentials?

Cargo currently works by eagerly loading credentials for operations that are known to require them (publish, etc). The credentials are then merged into the Config.

This change makes it so the credentials may be required for pretty much anything hitting the network. If we wanted to continue to lazy-load the credentials, we'd probably want to do it in auth::registry_credential_config.

Unfortunately, the Config is not mutable, which makes it difficult to merge additional values into it at that point.

Should we use the http-auth crate to parse the www-authenticate header?

  • License is OK (Apache 2, or MIT)
  • Reverse dependencies are minimal (only 3 small crate)
  • Download count is small (3k)
  • Code seems actively maintained and good quality
  • Has no new transitive dependencies (with default-features = false)
  • We're only using a subset of crate functionality. It can do other things such as respond to Basic and Digest auth (if those features are enabled).

If we're going to use an existing external crate, http-auth seems like the best one. Other crates considered:

  • www-authenticate has unsafe mem::transmute and has two new transitive deps (hyperx, unicase). It could work, but we'd need to understand the mem::transmutes.
  • authentic uses http-auth
  • actix-web-httpauth has dependencies on actix
  • http-auth-basic supports only "Basic" auth

Another alternative is the parser I wrote within cargo.

  • Less well tested
  • Avoids a dependency
  • Makes best effort to parse
  • 69 lines of code (excluding tests)

This PR currently uses the http-auth crate.

arlosi avatar May 18 '22 21:05 arlosi

Thanks so much for working on this, it looks great! I particularly like the updates to the tests.

Some notes:

  • I think it would be good to keep the CARGO_REGISTRY_NAME_OPT option, and to use that to store as an informational element in the key store. That way, if a user looks at the key store they can see something that has a little more meaning than just a URL.
  • It looks like the 1password credential program is broken. It looks like the search code isn't working correctly, as it is looking at the title, but this was changed to remove the title. I think it needs some kind of title. I'm not sure if the search code can search by URL. Can you make sure that all of the credential programs continue to work?
  • -Z registry_auth needs to be documented in unstable.md.
  • Can you make sure to update the version of crates-io to 0.35.0, and cargo-credential to 0.2.0?
  • I would be much more comfortable if credentials were loaded only as needed. I imagine there are several ways that could be done, but one idea is to have two values tables (the normal Config.values and a new one for credentials). The get_cv function could then consult the credentials one first, and then the normal one. In the common case where the credentials aren't loaded, I don't think it should impact performance noticeably. I'm not sure I fully follow what Config.credential_cache is currently doing, but perhaps it could use that?

ehuss avatar May 23 '22 20:05 ehuss

:umbrella: The latest upstream changes (presumably #10698) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar May 24 '22 16:05 bors

Thanks for all the feedback. I'll get started on figuring out how to lazy load the credentials file.

For 1password:

  • The op tool requires a paid 1password account, which I don't have.
  • They have recently released a v2 of the op tool. Do we want to switch to that? I can't find documentation for the v1 protocol. https://developer.1password.com/docs/cli/reference/management-commands/item

I can test the other credential providers.

arlosi avatar May 25 '22 20:05 arlosi

Hm, I didn't know there was a new version of the 1password client. It would probably be good to update it, but that can probably be done separately.

The key issue is that there needs to be a way to uniquely find the credential. The search function currently filters based on the title which was the registry name. I think perhaps that can be changed to search for the url, something like:

diff --git a/crates/credential/cargo-credential-1password/src/main.rs b/crates/credential/cargo-credential-1password/src/main.rs
index 86fc9fed8..bd713fd0e 100644
--- a/crates/credential/cargo-credential-1password/src/main.rs
+++ b/crates/credential/cargo-credential-1password/src/main.rs
@@ -41,7 +41,7 @@ struct ListItem {

 #[derive(Deserialize)]
 struct Overview {
-    title: String,
+    url: String,
 }

 impl OnePasswordKeychain {
@@ -175,11 +175,7 @@ impl OnePasswordKeychain {
         Ok(buffer)
     }

-    fn search(
-        &self,
-        session: &Option<String>,
-        registry_name: &str,
-    ) -> Result<Option<String>, Error> {
+    fn search(&self, session: &Option<String>, index_url: &str) -> Result<Option<String>, Error> {
         let cmd = self.make_cmd(
             session,
             &[
@@ -196,15 +192,15 @@ impl OnePasswordKeychain {
             .map_err(|e| format!("failed to deserialize JSON from 1password list: {}", e))?;
         let mut matches = items
             .into_iter()
-            .filter(|item| item.overview.title == registry_name);
+            .filter(|item| item.overview.url == index_url);
         match matches.next() {
             Some(login) => {
                 // Should this maybe just sort on `updatedAt` and return the newest one?
                 if matches.next().is_some() {
                     return Err(format!(
-                        "too many 1password logins match registry name {}, \
+                        "too many 1password logins match registry URL {}, \
                         consider deleting the excess entries",
-                        registry_name
+                        index_url
                     )
                     .into());
                 }
@@ -232,6 +228,8 @@ impl OnePasswordKeychain {
                 "Login",
                 &format!("password={}", token),
                 &format!("url={}", index_url),
+                "--title",
+                "Cargo registry token",
                 "--tags",
                 CARGO_TAG,
             ],

ehuss avatar May 27 '22 19:05 ehuss

Looking good to me.

Eh2406 avatar Jun 01 '22 16:06 Eh2406

@ehuss

I think it would be good to keep the CARGO_REGISTRY_NAME_OPT option

I've added it back, and used it in the credential process crates

It looks like the 1password credential program is broken

I implemented your suggestion to fix it, but I don't have a 1password account to test it. I've manually verified the wincred and gnome-secret providers.

-Z registry_auth needs to be documented in unstable.md.

Done.

Can you make sure to update the version of crates-io to 0.35.0, and cargo-credential to 0.2.0?

Yes, good idea. I updated the versions of all the credential providers as well.

I would be much more comfortable if credentials were loaded only as needed.

I implemented this as you suggested by having an additional LazyCell on the Config that allows the credentials to be loaded only when needed and re-added the test.

arlosi avatar Jun 01 '22 17:06 arlosi

:umbrella: The latest upstream changes (presumably #10738) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Jun 11 '22 01:06 bors

:umbrella: The latest upstream changes (presumably #10753) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Jun 15 '22 00:06 bors

:umbrella: The latest upstream changes (presumably #10764) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Jun 17 '22 17:06 bors

@ehuss I'd appreciate your review when you have a chance. It should be easier now that the test code has already been merged.

Eh2406 avatar Jun 17 '22 22:06 Eh2406

It seems like this PR is falling behind. My organization requires this functionality for its registry and I have been trying to sell them on Rust for a while.

Is there anything I can do to help this move along?

AngusWaller avatar Jul 27 '22 05:07 AngusWaller

It's not abandoned. I've been working on writing and implementing RFC 3289 to clean up how source replacement works around authentication. Will be getting back to this soon.

arlosi avatar Jul 27 '22 15:07 arlosi

I updated this PR to be based on top of #10907 as two commits.

arlosi avatar Jul 27 '22 20:07 arlosi

Oh awesome news! I hope it gets in soon

AngusWaller avatar Jul 28 '22 00:07 AngusWaller

:umbrella: The latest upstream changes (presumably #10977) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Aug 12 '22 02:08 bors

I noticed when experimenting with this that sometimes outgoing requests redact the authenticate header, and other requests do not. We should probably consistently redact the header.

Eh2406 avatar Aug 25 '22 16:08 Eh2406

Do you mean the Authorization header? What do you mean by "redact the header"? Are you talking about log messages?

arlosi avatar Aug 25 '22 17:08 arlosi

Sorry. I should have spent more time drafting that message. Yes I was talking about the Authorization header in log messages. When running cargo/target/release/cargo publish --registry=myregistry -Z sparse-registry -Z registry-auth --allow-dirty with export CARGO_HTTP_DEBUG=true and CARGO_LOG=cargo::ops::registry=trace. I see that the second request for cargo-test/index/config.json has a line DEBUG cargo::ops::registry] http-debug: > authorization: Bearer not-yet-set, which should not be logging the authorization. For comparison, the request to PUT cargo-test/api/v1/crates/new logs DEBUG cargo::ops::registry] http-debug: > Authorization: [REDACTED]

Eh2406 avatar Aug 25 '22 17:08 Eh2406

:umbrella: The latest upstream changes (presumably #11068) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Sep 09 '22 22:09 bors

hello, and thanks for your hard work on this RFC/PR - very excited about this feature!

In testing how this will work, I came across a small sticking point: the current options for passing an auth token via env var or command line flag are quite limited and have some usability issues.

My interest in this is for the purpose of auto-generating cargo doc output for crate versions published at a private registry that requires authentication, and wanting to avoid storing the registry auth token in a file (i.e. ~/.cargo/credentials) in the build environment.

I found the CARGO_REGISTRIES_<name>_TOKEN env var, and was able to use it. However, it took some effort to discover how to format the <name> portion of the env var. In my case, I was testing using a private registry configured under the name "shipyard-rs". The hyphen/dash in the name made things difficult. My first attempt was CARGO_REGISTRIES_shipyard-rs_TOKEN=xxx and that did not work, as variable names with dashes don't work in shell. I tried using the env program (env "CARGO_REGISTRIES_shipyard-rs_TOKEN=xxx" cargo ... before realizing that formatting the registry <name> in screaming snake case would be converted and matched (i.e. CARGO_REGISTRIES_SHIPYARD_RS_TOKEN=xxx works).

I also tried:

  • cargo fetch -Z registry-auth --token xxx - failed with argument '--token' wasn't expected
  • cargo fetch -Z registry-auth --config 'registries.shipyard-rs.token=xxx' - failed with error: registries.shipyard-rs.token cannot be set through --config for security reasons (is there some other way to pass this which would not trigger this error?)

cargo login is great for terminal usage, but there are other contexts where it will be important to be able to pass the auth token to cargo either via env var or command line flag such as containers, CI, sandboxed build environments, etc.

Perhaps I am missing some method which would make this easier. If not, it might be worth discussing if there are any options which could make this less tricky for future users to solve. Thoughts?

jonathanstrong avatar Sep 22 '22 02:09 jonathanstrong

@jonathanstrong thank you very much for trying this out! The registry token field is already stable, this PR only makes it more useful. I'm worried that your feedback will get lost in this PR, can you move the conversation to a separate issue. In that issue could you elaborate on what would work well for you? At a minimum it sounds like we need improvements to the documentation for setting environment variables.

Eh2406 avatar Sep 22 '22 15:09 Eh2406

:umbrella: The latest upstream changes (presumably #11205) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Oct 12 '22 21:10 bors

:umbrella: The latest upstream changes (presumably #11307) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Oct 28 '22 02:10 bors

:umbrella: The latest upstream changes (presumably #11285) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Nov 02 '22 13:11 bors

:umbrella: The latest upstream changes (presumably #11369) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Nov 14 '22 14:11 bors