webauthn
Proposal for password-only authentication using ES256
I would like to propose there being a standardized format for password-only (that is, effectively single-factor authentication) keys using WebAuthn.
I understand there may be some reluctance among the WebAuthn community to such low-security uses of WebAuthn, so to make my case for it, I'd like to assert that there are, for the foreseeable future, cases where one might want to use password-only authentication instead of implementation-bound keys for a variety of reasons (I think this assertion should be uncontroversial enough to not have to bring up examples), and in such cases, strictly if you are going to use a password anyway:
- It is better to use a challenge-response-based protocol where the password never has to leave the user-agent; and
- Password authentication would stand to benefit a lot from the strong anti-phishing protections that WebAuthn offers.
My proposal, such as it is, then, wouldn't extend the WebAuthn protocol as such, but rather offer a standardized format for such keys that can be used across independent implementations, and which wouldn't require any changes on the part of RPs. The basic idea is to run the password through some sort of PRF with 32 bytes of output, and derive an ES256 key from said output, and to identify keys thus generated with a standardized format for the credential ID, which would also specify parameters for the PRF.
I propose a credential-ID format consisting of the following parts, in order:
- The first 8 bytes are a checksum of the rest of the ID data. It is calculated by taking the SHA-256 hash of the ASCII string "pwkey", concatenated with the rest of the credential-ID data, and keeping the first 8 bytes thereof. As far as I can tell, the checksum shouldn't be security-critical, so it's mostly just a tag for recognizing a password-based key.
- A NUL-terminated string naming the PRF to be used.
- Any parameters for the PRF, in a format specific to each PRF.
To start out with, I only propose one PRF, its name being "pbkdf2-sha256". Obviously, this would be using PBKDF2 with the SHA-256 hash function, except that to preserve the anti-phishing strength, the RPID needs to be combined with the password. The parameters for this PRF would consist of the following parts, in order:
- A single unsigned byte, n, specifying the base-2 logarithm of the iteration count, and
- Any number of bytes (32 being the suggested number), salt, to be used for the PBKDF2 salt.
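The credential-ID layout described above can be sketched as follows. This is only an illustration of my reading of the format, not a reference implementation: the function names are hypothetical, and the sketch assumes the checksum covers exactly the bytes that follow it.

```python
import hashlib

TAG_PREFIX = b"pwkey"        # ASCII string mixed into the checksum
PRF_NAME = b"pbkdf2-sha256"  # the only PRF named in the proposal


def checksum(tail: bytes) -> bytes:
    """First 8 bytes of SHA-256("pwkey" || rest of the credential ID)."""
    return hashlib.sha256(TAG_PREFIX + tail).digest()[:8]


def build_credential_id(n: int, salt: bytes) -> bytes:
    """Assemble checksum || PRF name || NUL || n || salt."""
    if not 0 <= n <= 255:
        raise ValueError("n must fit in a single unsigned byte")
    tail = PRF_NAME + b"\x00" + bytes([n]) + salt
    return checksum(tail) + tail


def parse_credential_id(cred_id: bytes):
    """Return (prf_name, n, salt), or None if this is not a pwkey credential ID."""
    if len(cred_id) < 8:
        return None
    tag, tail = cred_id[:8], cred_id[8:]
    if checksum(tail) != tag:
        return None
    name, sep, params = tail.partition(b"\x00")
    if not sep or name != PRF_NAME or len(params) < 1:
        return None
    return name.decode("ascii"), params[0], params[1:]
```

With a 32-byte salt this yields a 55-byte credential ID (8 + 13 + 1 + 1 + 32).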
To incorporate the RPID, the actual password used as input for PBKDF2 should be the concatenation of the RPID hash with the password as input by the user (the hash is used instead of the text RPID to avoid any potential extension attacks). Thus, the output of this PRF, given a password to hash, would be PBKDF2(SHA-256, SHA-256(RPID) | password, salt, 1 << n, 32). As an example, then, a pbkdf2-sha256 credential might be, in hex-encoded form:
92F9795026B3B7A570626B6466322D7368613235360014ED3035C6AB2A060BF3C73D96678084F5FDDD0C419B5D89415ABE8D2E1BE273F9
When used with the password "123" on RP "example.com", it would produce the following output:
75E7E2F671172E740D45BA1CFC7B316C58878444D0EA151A6B5888CCF896AB51
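A minimal sketch of the PRF as described, using Python's standard `hashlib`. Two details are assumptions on my part, since the text does not pin them down: the RPID hash is taken as the raw 32-byte digest (not hex), and both the RPID and the password are encoded as UTF-8. The function name `pwkey_prf` is hypothetical, and I have not verified that this sketch reproduces the example output above.

```python
import hashlib


def pwkey_prf(rp_id: str, password: str, salt: bytes, n: int) -> bytes:
    """PBKDF2(SHA-256, SHA-256(RPID) | password, salt, 1 << n, 32)."""
    # Assumption: raw digest bytes, UTF-8 encoding for both strings.
    rp_id_hash = hashlib.sha256(rp_id.encode("utf-8")).digest()
    pbkdf2_password = rp_id_hash + password.encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", pbkdf2_password, salt, 1 << n, dklen=32)
```

Because the RPID hash is part of the PBKDF2 input, the same password and salt give unrelated outputs on different RPs, which is what ties the derived key to the RP.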
In order to generate an ES256 key from the PRF output, then, as far as my knowledge of ECDSA goes, it should be fine to more or less use the PRF output as it is as the d value for a P-256 key (the x and y coordinates would of course just be calculated from the d value), the only caveat being that its numerical value cannot be greater than the generator order of the curve (the n value in standardized curve parameters). As how to deal with this limitation, that is where the limits of my knowledge of elliptic curve mathematics are hit:
- The obvious proposition would be to simply use the remainder of the PRF output with the n number of the curve. However, a small subset of PRF outputs (those that overflow n) would then result in a low numerical value and low Hamming weight for d, which I don't know if it should be considered a problem. If it is, then;
- The slightly more complex proposition would be to take a PRF-output number that overflows n and simply rotate the bits left until it no longer does.
If anyone is interested in this proposal, I would appreciate input on which method to use, but my inclination (not knowing better) would be to go the perhaps somewhat safer route and use the latter.
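The two candidate reductions discussed above can be compared with a short sketch. The function names are hypothetical, and interpreting the PRF output as a big-endian integer is an assumption. Note two edge cases the sketch makes visible: a d value of zero is not a valid private key under either method, and an all-ones input can never be rotated below n, so the rotation loop has to be bounded.

```python
# Order of the P-256 base point (the "n" curve parameter).
P256_N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551
MASK_256 = (1 << 256) - 1


def reduce_mod_n(prf_output: bytes) -> int:
    """Option 1: take the remainder modulo n (may yield 0, which is invalid)."""
    return int.from_bytes(prf_output, "big") % P256_N


def rotate_until_below_n(prf_output: bytes) -> int:
    """Option 2: rotate the 256-bit value left until it is below n."""
    value = int.from_bytes(prf_output, "big")
    for _ in range(256):
        if value < P256_N:
            return value  # still needs a separate check for 0
        value = ((value << 1) | (value >> 255)) & MASK_256
    # Only the all-ones value is >= n under every rotation.
    raise ValueError("input never drops below n under rotation")
```

Since n begins with 32 one bits, a value only overflows n when its top 32 bits are all ones, so the rotation method usually stops after a single step.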
Finally, as for the various metadata produced for the WebAuthn protocol:
- Decent user-agents should probably set the UV bit if the password was freshly input, and clear it if a saved password was used.
- The `rk` property should not be set.
- The `BS` and `BE` bits should probably reflect whether the password is stored in a password-manager with backup capabilities, though I don't know why anyone would use these keys in favor of properly randomly generated keys with a WebAuthn-aware password-manager.
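As a rough illustration of the flag handling suggested above, here is how a user agent might compose the authenticator-data flags byte. The bit positions are those defined by WebAuthn; the function and parameter names are hypothetical, and always setting UP is my assumption rather than part of the proposal.

```python
# Authenticator data flag bits as defined by WebAuthn.
UP = 0x01  # user present
UV = 0x04  # user verified
BE = 0x08  # backup eligible
BS = 0x10  # backup state


def authenticator_flags(password_freshly_entered: bool,
                        stored_in_syncing_manager: bool) -> int:
    """Compose the flags byte following the suggestions in the list above."""
    flags = UP  # assumption: the user interacted with the dialog
    if password_freshly_entered:
        flags |= UV
    if stored_in_syncing_manager:
        flags |= BE | BS  # BS may only be set together with BE
    return flags
```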
I think that should be all that is relevant for the proposal.
Is anyone interested in this, and if so, any thoughts?
Not sure how this really offers protection against phishing. If a bad actor can present a phishing site with a dialog into which a user supplies their password, then the bad actor can use that password against the legitimate site later.
> If a bad actor can present a phishing site with a dialog into which a user supplies their password
In my experience, most current WebAuthn user-agent implementations use a dialog that a site can't really spoof since it's partly outside the page viewport.
Sure, I guess a naïve user could be fooled into entering their password into a non-standard dialog, but having that distinction I would think makes a big difference, at least.
Interesting idea, but like @sbweeden I'm skeptical of this.
First, some surface-level thoughts:
- `hash_to_field` (RFC 9380) can be used to derive an EC private key with minimal distribution bias.
- I'd say the `UV` should probably be set to 0 most, or maybe all, of the time. WebAuthn keys are normally assumed to be primarily possession factors, and the UV bit represents use of also a second factor. But a password-derived key inherently is primarily a knowledge factor, so setting the UV bit would misrepresent these credentials as multi-factor credentials when they are in fact always single-factor credentials.
- Similarly, these credentials should always set `BE=1` because a password-derived credential is always trivially exportable, no matter how you store it.
On a deeper level, I'm not sure I fully understand the user interaction model here. Is the expectation that (1) the user would enter the password during each registration/authentication ceremony? Or (2) does the user import the password once (possibly in a browser settings UI) and then only authorize its use as (key derivation material for) a credential?
I agree with @sbweeden that (1) seems too dangerous because it would be fairly easy to trick many users into entering that password into a malicious site, which could then of course follow the spec to perform the key derivation and impersonate the user. Yes, dedicated browser UI would make this detectable for the most observant users, but this would likely succeed against most people as they may not know to look out for this.
(2) would be a bit less dangerous, but still open for abuse as malicious actors can still ask victims to enter their master password, or direct them on how to open their browser settings to retrieve the master password and enter it on the site. This would probably have a far lower success rate than (1), but I still think it would work in too many cases.
But whatever the UI, there's an even bigger problem with this: there isn't really any meaningful brute-force protection, and the same password would result in the same credential ID for different users. High iteration counts help, but only for nontrivial passwords and almost not at all against dictionary attacks. Many people would continue to reuse a single password everywhere, perhaps with minor variations, and many would use a very simple one that would be cracked almost immediately by any offline attack. "Salting" the KDF with a username would help a little bit to differentiate hashes between users, but again this is only really meaningful for fairly complex passwords. And once you crack an RPID-username-password combination, you can still easily try that same username-password combination on other RPIDs. Salting with any cryptographically random value would invalidate the point of deriving everything from a password alone.
So any way you twist it, the problem is that there just isn't enough entropy in most human-chosen passwords, so we end up back at still needing a password manager anyway - be it to embed more entropy in the password itself, or to store and manage auxiliary salts or keys to mix in with a low-entropy password. At that point, why wouldn't you just use a cryptographically random passkey stored in that password manager?
Thanks for the thoughtful reply!
> the same password would result in the same credential ID for different users
I wonder if you did not misunderstand the credential ID construction; the password is not part of it at all. Rather, most of the credential ID is just a randomly generated salt, so with 32 randomly generated bytes, I would indeed expect all credential IDs to be different.
Since the public key (that the server stores) is only derived from the salt, RPID and password, however, a malicious server could certainly bruteforce the user's password over time, but I believe this is intrinsic to any possible authentication protocol based on passwords alone (if a system can validate a password for login, then of course it can also bruteforce it). I don't think a third party would be able to do that, however.
> So any way you twist it, the problem is that there just isn't enough entropy in most human-chosen passwords
Naturally, this is the problem of any authentication scheme involving passwords only. My main point is that there are many situations where it is impractical to use any other factor than a password alone, and specifically in such situations, for the reasons I started out with, WebAuthn would significantly alleviate (even if certainly not eliminate) many of the problems of password-only authentication.
As for hash_to_field, thank you very much for the link. I had tried finding prior art for deriving EC keys from passwords, but was unable to find any. However, while I haven't read and understood RFC 9380 fully yet, it does not necessarily sound as though it generates an ECDSA private key. It's not immediately obvious to me what the output of hash_to_field is when applied to the right parameters, but the abstract talks about generating points on an elliptic curve (and the u outputs being vectors seems to reinforce that, though I'm not sure what m would be in this case), whereas a private ECDSA key is just a scalar. I'll continue reading it, though.
> I'm not sure I fully understand the user interaction model here
The expectation would certainly be your first case. At the risk of repeating myself, the whole point of the proposal is for cases where a second factor is impractical. If the password is just to be imported into a browser, then I feel one might just use properly random keys instead.
> I wonder if you did not misunderstand the credential ID construction; [...] most of the credential ID is just a randomly generated salt
Ah, thanks, I did indeed miss that part. That certainly does at least limit the scalability of a brute force attack as you'd need to crack each credential ID individually rather than all at once.
I don't think it helps all that much, though: since this salt is stored on the server (in the credential ID), it must be sent to the user before the user can derive their key from it. Since this is intended to be the only authentication step, as I understand the proposal, anyone can query the server for a user's credential IDs without needing to authenticate first - so the attacker doesn't need to compromise the server-side database to get at the credential IDs, they can just request them via the server's public API. From there, any low-entropy password will be easy to crack since the attacker has the salt. Granted, the individual salts means they can't attack all users' credential IDs at once, but they don't really affect the difficulty of a targeted attack.
> The expectation would certainly be your first case.
Which is:
> (1) the user would enter the password during each registration/authentication ceremony
Ok. Like I said, I think this seems a bit too susceptible to tricking users into entering their password into a malicious webpage rather than the browser UI.
Off-topic hash_to_field discussion
> it does not necessarily sound as though [hash_to_field] generates an ECDSA private key. [...] (and the _u_ outputs being vectors seems to reinforce that, though I'm not sure what _m_ would be in this case)
It is a bit obtuse, but the field to hash to is a free parameter of the function. The primary objective of the RFC is the hash_to_curve function, which uses hash_to_field to hash to the coordinate field of an elliptic curve, but you can also use hash_to_field to hash to the scalar field instead:
The hash_to_field function is also suitable for securely hashing to scalars. For example, when hashing to the scalar field for an elliptic curve (sub)group with prime order r, it suffices to instantiate hash_to_field with target field GF(r).
The u vector just holds count results, but you can set count=1. m is 1 when hashing to a prime order field F (because F has order q = p^m, and q = p for a prime order field). So for hashing to an EC private key you'd set m = 1 and p = n where n is the order of the curve generator (AKA base point) (i.e., the order of the prime order subgroup of the curve (which is the only (or any, depending on how you see it) subgroup of a prime order curve like P-256)).
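To make the parameter choices above concrete, here is a sketch of `hash_to_field` specialized to the P-256 scalar field (m = 1, count = 1, p = n), using the `expand_message_xmd` construction from RFC 9380 with SHA-256. The domain separation tag is a made-up placeholder, not something the proposal or the RFC defines, and L = 48 follows from ceil((256 + 128) / 8) for a 128-bit security target. I have not checked this sketch against the RFC's test vectors.

```python
import hashlib

P256_N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551
L = 48                            # bytes of uniform output per field element
DST = b"pwkey-P256-scalar-DRAFT"  # hypothetical domain separation tag


def expand_message_xmd(msg: bytes, dst: bytes, len_in_bytes: int) -> bytes:
    """expand_message_xmd from RFC 9380, instantiated with SHA-256."""
    b_in_bytes, s_in_bytes = 32, 64  # SHA-256 output and block sizes
    ell = -(-len_in_bytes // b_in_bytes)  # ceiling division
    if ell > 255 or len_in_bytes > 65535 or len(dst) > 255:
        raise ValueError("parameters out of range")
    dst_prime = dst + bytes([len(dst)])
    msg_prime = (bytes(s_in_bytes) + msg + len_in_bytes.to_bytes(2, "big")
                 + b"\x00" + dst_prime)
    b_0 = hashlib.sha256(msg_prime).digest()
    b_i = hashlib.sha256(b_0 + b"\x01" + dst_prime).digest()
    uniform = b_i
    for i in range(2, ell + 1):
        xored = bytes(x ^ y for x, y in zip(b_0, b_i))
        b_i = hashlib.sha256(xored + bytes([i]) + dst_prime).digest()
        uniform += b_i
    return uniform[:len_in_bytes]


def hash_to_scalar(msg: bytes, dst: bytes = DST) -> int:
    """hash_to_field with m = 1, count = 1 and modulus n (the group order)."""
    uniform = expand_message_xmd(msg, dst, L)
    return int.from_bytes(uniform, "big") % P256_N
```

Reducing 48 uniform bytes modulo a 256-bit n is what keeps the bias negligible, in contrast to reducing a 32-byte value directly. A result of zero is astronomically unlikely but would still have to be rejected as a private key.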
I think you still misunderstand, as you shouldn't be able to crack a credential ID at all, given that the password has no part in constructing it. You would be able to brute-force a (credential-ID, public-key) pair, but that would at least require the server-side data getting leaked. The credential ID only contains parameters for the KDF, not any results of it.
> For example, when hashing to the scalar field for an elliptic curve (sub)group with prime order r, it suffices to instantiate hash_to_field with target field GF(r).
Ah, nice. I hadn't read that part, but in that case, it certainly seems only suitable to reuse it.
Ah, I see now. I think I assumed the MAC would involve the password, but indeed it does not. Here's how I understand the proposal in pseudocode:
password = input()
salt = random()
n = config.n
cred_id_tail = "pbkdf2-sha256" | 0x00 | encode(n, salt)
cred_id = LEFT(8, SHA256("pwkey" | cred_id_tail)) | cred_id_tail
prk = PBKDF2(SHA-256, SHA-256(RPID) | password, salt, 1 << n, 32)
pri_key = hash_to_field(prk, ...)
pub_key = pri_key * P256.G
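For readers who want to see the last pseudocode line (`pub_key = pri_key * P256.G`) spelled out, here is a plain-Python sketch of P-256 scalar multiplication with affine double-and-add. It is for illustration only: it is not constant-time and should not be used for real keys, where an established cryptography library would be the right choice. The function names are my own.

```python
# NIST P-256 domain parameters.
P = 0xFFFFFFFF00000001000000000000000000000000FFFFFFFFFFFFFFFFFFFFFFFF
A = P - 3
B = 0x5AC635D8AA3A93E7B3EBBD55769886BC651D06B0CC53B0F63BCE3C3E27D2604B
G = (0x6B17D1F2E12C4247F8BCE6E563A440F277037D812DEB33A0F4A13945D898C296,
     0x4FE342E2FE1A7F9B8EE7EB4A7C0F9E162BCE33576B315ECECBB6406837BF51F5)
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def on_curve(point) -> bool:
    """Check y^2 = x^3 + ax + b (mod p); None is the point at infinity."""
    if point is None:
        return True
    x, y = point
    return (y * y - (x * x * x + A * x + B)) % P == 0


def point_add(p1, p2):
    """Affine point addition, covering doubling and inverse points."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        if (y1 + y2) % P == 0:
            return None  # p2 == -p1
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k: int, point):
    """Compute k * point with least-significant-bit-first double-and-add."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def public_key(pri_key: int):
    """Return the (x, y) public point for a private scalar in [1, n - 1]."""
    if not 0 < pri_key < N:
        raise ValueError("private key out of range")
    return scalar_mult(pri_key, G)
```

The modular inverse via `pow(x, -1, P)` requires Python 3.8 or later.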
So ok, the credential ID alone isn't enough to mount an offline brute-force attack since there's nothing to verify attempts against. An online brute-force attack is still possible, but then of course the target server can deploy rate-limiting restrictions or the like to make the attack far less scalable.
> You would be able to brute-force a `(credential-ID, public-key)` pair, but that would at least require the server-side data getting leaked.
Yeah. You could also brute-force a (credential-ID, signature) pair, as that gives the attacker something to test a public key guess against. That would require the attacker to passively eavesdrop on a registration or login ceremony. As far as I can tell the difficulty of both of those attacks is equivalent to breaching a password database or passively eavesdropping on a password entry. So yeah, this proposed approach seems to not introduce any new vulnerabilities compared to traditional password authentication.
On that note, though: I'm not convinced we want to lower the bar for WebAuthn to just "not worse than a password". At least in my opinion, WebAuthn is supposed to be better than passwords, not just equivalent, and I'm not convinced this approach is enough of an improvement over traditional password authentication.
You wrote in the initial comment that:
> - Password authentication would stand to benefit a lot from the strong anti-phishing protections that WebAuthn offers.
But I question if it actually would. The inescapable problem with passwords is that they can be entered into a malicious input field, whereas a WebAuthn credential cannot. The strength of WebAuthn is not that it is possible to use it securely, but that it is very difficult to use it insecurely. Any password-based system inherently struggles to achieve the latter since a password can always be entered anywhere.
I applaud your effort to come up with a fairly good design within the design restrictions, but I don't think WebAuthn is the right home for it. But perhaps some PAKE-based approach could be a candidate for a new credential type in the Credential Management API (which WebAuthn itself is an extension of)?
PAKEs would indeed be better. It doesn't take long looking at past technologies like Kerberos to see how simply using a KDF is open to attacks.
> You could also brute-force a `(credential-ID, signature)` pair
Yes, I also realized that last night, along with the fact that an attacker could choose the salt for a victim, and have precomputed dictionaries for that salt. And most damning of all, that actually makes phishing attacks attractive for an attacker. That realization dampened my enthusiasm for the whole thing a fair bit.
> But perhaps some PAKE-based approach
My intention was to cram something almost as good as a PAKE into the existing framework that WebAuthn provides, but given the above, it is clear to me that it isn't almost as good as a PAKE, so yes, I agree.
Thanks for humoring me!
Thank you for the discussion!
2024-07-17 WG call: discussion seems to be finished, closing.