wordpress-develop #21022 Switch to using bcrypt for hashing passwords and security keys

Latest approach to switching from phpass to bcrypt via the native PHP password hashing functions.

This covers:

User passwords
User password reset keys
User request keys
Application passwords
Post passwords
The recovery mode key

Tickets

Trac ticket: https://core.trac.wordpress.org/ticket/21022 Trac ticket: https://core.trac.wordpress.org/ticket/50027

Notes

wp_check_password() remains compatible with phpass hashes, so password checks will continue to succeed when checking a password against its existing hash. There is no need for users to change or resave their password.
User passwords are rehashed using bcrypt after successful authentication in wp_authenticate_username_password() or wp_authenticate_email_password().
Application passwords are rehashed using bcrypt after successful authentication in wp_authenticate_application_password().
wp_check_password() and the rehashing that occurs in the above functions are forward-compatible with future changes to the default bcrypt cost (the default bcrypt cost may get increased in PHP 8.4), so password checks will continue to succeed when checking a password against its existing hash.

Elsewhere

Todo

[ ] Decide on an approach for the 72 character limit imposed by bcrypt
- Messaging on make/core
  - phpass is secure enough, this is about improving, hardening, and future proofing
- Do we need to inform the user when they have a password longer than 72 chars?
- 72 character limit on the password input field -- need an active warning rather than just the maxlength attribute
[x] Decide whether wp_hash_password() and wp_check_password() need to retain back-compat support for the global $wp_hasher
[x] Fully support changes to the default bcrypt cost
[x] Tests for the hash upgrade routine for a user password
[x] Tests for the hash upgrade routine for an application password
[ ] Tests for handling post password hashes
[ ] Find more todos

Sep 11 '24 18:09 johnbillion

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Unlinked Accounts

The following contributors have not linked their GitHub and WordPress.org accounts: @soatok, @calvinalkan, @joehoyle, @raphaelahrens.

Contributors, please read how to link your accounts to ensure your work is properly credited in WordPress releases.

Core Committers: Use this line as a base for the props when committing in SVN:

Props johnbillion, mslavco, timothyblynjacobs, xknown, haozi, mbijon, paulkevan, synchro, dd32, otto42, desrosj.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Sep 11 '24 18:09 github-actions[bot]

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

The Plugin and Theme Directories cannot be accessed within Playground.
All changes will be lost when closing a tab with a Playground instance.
All changes will be lost when refreshing the page.
A fresh instance is created each time the link below is clicked.
Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance, it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

Sep 11 '24 18:09 github-actions[bot]

I see @johnbillion beat me to the general strategy in his other pull request. I think HMAC-SHA512 is a better approach than SHA384, but both are acceptable.

There are libraries that do SHA384 (not HMAC), but then encrypt the password hash with a key. I think the operational security of both approaches is congruent (you need a key to begin password cracking attempts).

In my opinion, doing SHA384 without HMAC, and without encryption, is less beneficial than the other two approaches. However, al 3 are better than raw bcrypt. And bcrypt is a giant leap in security over phpass.

Nov 20 '24 17:11 soatok

Please refer OWASP's documentation https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html#pre-hashing-passwords-with-bcrypt, they don't recommend pre-hashing passwords when using bcrypt, including use base64 to encoding the sha hash result.

Perhaps we can add input length restrictions to prevent users from using too long passwords?

Nov 21 '24 14:11 devhaozi

I covered that a bit in the FAQ. The main problem is that 72 bytes != 72 characters once you get outside of ASCII, and you'll need active warnings instead of just a maxlength attribute due to the input masking in browsers.

Nov 21 '24 14:11 johnbillion

I covered that a bit in the FAQ. The main problem is that 72 bytes != 72 characters once you get outside of ASCII, and you'll need active warnings instead of just a maxlength attribute due to the input masking in browsers.

In this case, I prefer to ignore them, like Laravel framework. Using SHA for pre-hashing will cause more trouble in future upgrades (assuming we need to upgrade to argon2id one day) and will not benefit the vast majority of users and will reduce performance.

Nov 21 '24 15:11 devhaozi

Using SHA for pre-hashing will cause more trouble in future upgrades (assuming we need to upgrade to argon2id one day) and will not benefit the vast majority of users and will reduce performance.

Can you be specific? What "more trouble" are you envisioning?

Nov 21 '24 15:11 soatok

Using SHA for pre-hashing will cause more trouble in future upgrades (assuming we need to upgrade to argon2id one day) and will not benefit the vast majority of users and will reduce performance.

Can you be specific? What "more trouble" are you envisioning?

If we upgrade to Argon2 one day, all pre-hashing passwords will need to be re-hashed (Argon2 does not have the 72-byte limit), or we can continue to keep base64+sha384, but this is not necessary in argon2.

If we use PHP's default implementation directly, we won't have these troubles (PHP will handle it automatically in future version).

Nov 21 '24 15:11 devhaozi

Using SHA for pre-hashing will cause more trouble in future upgrades (assuming we need to upgrade to argon2id one day) and will not benefit the vast majority of users and will reduce performance.

Can you be specific? What "more trouble" are you envisioning?

If we upgrade to Argon2 one day, all pre-hashing passwords will need to be re-hashed (Argon2 does not have the 72-byte limit), or we can continue to keep base64+sha384, but this is not necessary in argon2.

If we use PHP's default implementation directly, we won't have these troubles (PHP will handle it automatically in future version).

Consider:

if (hash_equals('$2', substr($pwhash, 0, 2)) {
    $prehash = base64_encode( hash_hmac( /* ... */ ) );
    if (password_verify($prehash, $pwhash)) {
        $newHash = password_hash($password, /* ... */).
    }
}

Nov 21 '24 15:11 soatok

Another example for reference is https://github.com/laravel/framework/pull/49752 , where they seem to have decided to do nothing.

Nov 21 '24 15:11 devhaozi

That's correct. Out of the PHP projects that I listed in the PR, only symfony/password-hasher includes any handling for passwords greater than 72 bytes, and even then it still truncates the resulting hash due to the use of sha512+base64 which outputs 88 bytes.

The discussion around hash layering feels dangerously close to rolling your own crypto in order to provide a small amount of additional benefit to an exceptionally small proportion of users.

Nov 21 '24 16:11 johnbillion

Thankfully, this isn't anywhere near "rolling our own crypto". The SHA2+base64 paradigm has been implemented in several proprietary systems I've worked on over the past decade.

I also cited password_lock above as an open source implementation that does a morally equivalent job. If you need a cryptography expert to sign off on this design, ask the developer of that library to review whatever we propose.

and even then it still truncates the resulting hash due to the use of sha512+base64 which outputs 88 bytes.

SHA-384 and SHA-512 are the exact same algorithm, except SHA-384 has a different initialization vector and is truncated to 384 bits. I don't think it makes any sense to obsess over the truncation here.

If you base64-encode a raw SHA-512 hash to get 88 bytes, and then bcrypt truncates it to a "mere" 72, you still have an input keyspace of 6^72 or 486 bits of entropy. This is still in far excess of the 128 bits of entropy necessary to be considered secure.

The SHA-2 family of hash functions is secure in most usages (exception: if you're building a MAC by truncating a key and the message). Using HMAC eliminates length-extension, and you can treat HMAC with a secret key as a random oracle.

If you want to be extra paranoid, instead of base64_encode, prefer sodium_bin2base64 if it's available.

At the risk of being mildly self-promotional, I'd like to refer any curious parties to some of my previous writing on this subject if anything I've said here (or are likely to say in subsequent comments) is a little over your head:

These algorithms, and this specific way of combining these algorithms, is well-understood.

Nov 21 '24 16:11 soatok

Thankfully, this isn't anywhere near "rolling our own crypto". The SHA2+base64 paradigm has been implemented in several proprietary systems I've worked on over the past decade.

I also cited password_lock above as an open source implementation that does a morally equivalent job. If you need a cryptography expert to sign off on this design, ask the developer of that library to review whatever we propose.

and even then it still truncates the resulting hash due to the use of sha512+base64 which outputs 88 bytes.

SHA-384 and SHA-512 are the exact same algorithm, except SHA-384 has a different initialization vector and is truncated to 384 bits. I don't think it makes any sense to obsess over the truncation here.

If you base64-encode a raw SHA-512 hash to get 88 bytes, and then bcrypt truncates it to a "mere" 72, you still have an input keyspace of 6^72 or 486 bits of entropy. This is still in far excess of the 128 bits of entropy necessary to be considered secure.

The SHA-2 family of hash functions is secure in most usages (exception: if you're building a MAC by truncating a key and the message). Using HMAC eliminates length-extension, and you can treat HMAC with a secret key as a random oracle.

If you want to be extra paranoid, instead of base64_encode, prefer sodium_bin2base64 if it's available.

As mentioned before, passwords longer than 72 bytes are rarely used, so it is worth considering whether it is worth doing this.

For sodium_bin2base64, it is impossible, WP usually only accepts PHP's own implementation (which is why Argon2 is not used, Argon2 requires PHP to use additional compilation parameters).

There is still a long time before 6.8, let's see what other developers think.

Nov 21 '24 16:11 devhaozi

For sodium_bin2base64, it is impossible,

https://github.com/WordPress/wordpress-develop/blob/71a52ced57ad8cb9fc52681abb5280c56a0b0a6c/src/wp-includes/sodium_compat/lib/php72compat.php#L146-L162

:)

Nov 21 '24 16:11 soatok

For sodium_bin2base64, it is impossible,

https://github.com/WordPress/wordpress-develop/blob/71a52ced57ad8cb9fc52681abb5280c56a0b0a6c/src/wp-includes/sodium_compat/lib/php72compat.php#L146-L162

:)

If so, is there a pure PHP implementation of Argon2? Using Argon2 directly would not have these annoying problems.

Nov 21 '24 16:11 devhaozi

I've covered argon2 and scrypt in the PR description. Unfortunately there is not.

Nov 21 '24 16:11 johnbillion

As far as I'm aware, sodium_compat does not include scrypt or argon2.

Nov 21 '24 16:11 soatok

Again, I've covered this already in the PR description. sodium_compat still requires the optional libsodium extension in order to support argon2 or scrypt. There is no polyfill for either in sodium_compat.

I think we should put a pin in this conversation about the 72 byte limit of bcrypt. It's an interesting computer science problem but it's so far removed from any real life problem that it distracts from assessing the underlying switch to bcrypt.

A 72 byte password has 576 bits of entropy. As soon as a password is over 16 bytes in length, you're beyond the widely recommended 128 bits of entropy to consider a password secure. There is no practical need to retain a higher entropy for passwords greater than 72 bytes.

Nov 21 '24 16:11 johnbillion

it's so far removed from any real life problem that it distracts from assessing the underlying switch to bcrypt.

I respectfully disagree, for two reasons:

WordPress has historically bent over backwards to accommodate a small minority of users. This is evident in the minimum PHP bump from 5.2.4 to 5.6 taking so many years after the PHP team stopped supporting it. That only a few users will see a tangible difference has, historically, not been a valid reason to make a security change.
Bcrypt + SHA2 + Base64, as discussed above, materially improves the security and raises the bar for the PHP ecosystem, making WordPress a leader rather than a follower.

The 72-byte limit isn't the only footgun of vanilla bcrypt. Truncating on NUL characters is the other (which is why eschewing the base64 encoding is a bad move).

That being said, consider users with long passphrases. but where each word in the passphrase is low-entropy (e.g. diceware). They're well-served by this change.

That being said:

I think we should put a pin in this conversation about the 72 byte limit of bcrypt.

That's fine. This is my final word on this topic, unless explicitly queried by another user.

Nov 21 '24 17:11 soatok

it's so far removed from any real life problem that it distracts from assessing the underlying switch to bcrypt.

I respectfully disagree, for two reasons:

WordPress has historically bent over backwards to accommodate a small minority of users. This is evident in the minimum PHP bump from 5.2.4 to 5.6 taking so many years after the PHP team stopped supporting it. That only a few users will see a tangible difference has, historically, not been a valid reason to make a security change.

Bcrypt + SHA2 + Base64, as discussed above, materially improves the security and raises the bar for the PHP ecosystem, making WordPress a leader rather than a follower.

The 72-byte limit isn't the only footgun of vanilla bcrypt. Truncating on NUL characters is the other (which is why eschewing the base64 encoding is a bad move).

That being said, consider users with long passphrases. but where each word in the passphrase is low-entropy (e.g. diceware). They're well-served by this change.

That being said:

I think we should put a pin in this conversation about the 72 byte limit of bcrypt.

That's fine. This is my final word on this topic, unless explicitly queried by another user.

The criterion for WordPress to stop supporting PHP is the usage rate (as far as I know, this value is less than 5%). I think the percentage of users using passwords longer than 72 bytes is much less than 5%.

Nov 21 '24 17:11 devhaozi

I'm with @johnbillion on straight password->bcrypt.

It's counterintuitive, but it will be more secure than password->sha512->bcrypt. Plus, why are we arguing about 470 vs 500+ bits of entropy when ANY form of bcrypt is >30,000,000 more time-consuming to brute force than md5.

The cryptographers who designed bcrypt were well aware it limited possible entropy to ~470 bits. They compromised there on purpose. At modern cost factors it's 1,000's of years of complexity to brute force a 72 character password. But that limitation prevents resource exhaustion DoS attacks on lower-end servers.

Hashing password->base64(sha512)->bcrypt is less secure in several ways. It enables breach correlation / shucking more easily than raw bcrypt. Plus because of the base64 encoding it LOSES entropy from ALL input passwords. This gets dangerous for the large % of users who have 10 char passwords. They start with a max of ~65 bits of entropy, then clipping from 88 -> 72 characters loses some of that already limited entropy. Progressively more pre-bcrypt steps to deal with this clipping just risks opening DoS vectors.

...meanwhile most users (70-80% per HaveIBeenPwned) are using common, short, and/or recycled passwords. If we're really worried about password security we should raise WP's minimum password length, not bikeshed something cryptographers already considered when designing bcrypt.

Nov 25 '24 20:11 mbijon

@mbijon Your comment is wrong on several levels, including one I had been ignoring in the previous discussion because I didn't want to derail it with pedantry. Unfortunately, this pedantry is now necessary.

It enables breach correlation / shucking more easily than raw bcrypt. Plus because of the base64 encoding it LOSES entropy from ALL input passwords. This gets dangerous for the large % of users who have 10 char passwords. They start with a max of ~65 bits of entropy, then clipping from 88 -> 72 characters loses some of that already limited entropy. Progressively more pre-bcrypt steps to deal with this clipping just risks opening DoS vectors.

The most obvious evidence against this statement is the real-world security impact of bcrypt's truncation.

It enables breach correlation / shucking more easily than raw bcrypt. Plus because of the base64 encoding it LOSES entropy from ALL input passwords. This gets dangerous for the large % of users who have 10 char passwords.

This is nonsense, which I will explain below.

They start with a max of ~65 bits of entropy, then clipping from 88 -> 72 characters loses some of that already limited entropy.

This is also nonsense, which I will explain below.

Progressively more pre-bcrypt steps to deal with this clipping just risks opening DoS vectors.

The performance impact of a SHA512 hash compared to a password hashing algorithm is negligible. You can benchmark it yourself.

<?php
// Initialize up front
$start = $stop = microtime(true);

// reasonably long example "password"
$x = str_repeat('A', 128);
$y = hash('sha512', $x, true); 
$i = 0;


$start = microtime(true);
for ($i = 0; $i < 100_000; ++$i) {
    $y = hash('sha512', $y, true); 
}
$stop = microtime(true);

echo '100,000 iterations of SHA-512 took ', number_format($stop - $start, 3), ' seconds.', PHP_EOL;

$start = microtime(true);
$y = password_hash($x, PASSWORD_BCRYPT);
$stop = microtime(true);

echo '1 iteration of bcrypt took ', number_format($stop - $start, 3), ' seconds.', PHP_EOL;

Here's the output on my machine:

100,000 iterations of SHA-512 took 0.039 seconds.
1 iteration of bcrypt took 0.045 seconds.

We aren't doing 100,000 iterations of SHA-512 here, only 1 or 2 (in the case of HMAC).

The DoS risk for this pre-bcrypt hashing isn't real and can't hurt you.

...meanwhile most users (70-80% per HaveIBeenPwned) are using common, short, and/or recycled passwords. If we're really worried about password security we should raise WP's minimum password length, not bikeshed something cryptographers already considered when designing bcrypt.

You... do realize that a lot of cryptographers don't like bcrypt, right? That's why the whole Password Hashing Competition existed. We settled on Argon2id and yescrypt. The whole cryptography community was there.

EDIT - On My Remark About Cryptographers' Dislike of Bcrypt

I've received feedback that the previous statement wasn't precise (which is fair; I was aiming for clarity, not precision). So here's a bit of a pedantic explanation.

When comparing Bcrypt and Argon2:

Bcrypt is cache-hard, while Argon2 is only cache-hard if you use the non-standard ds variant.
Bcrypt is not suitable as a KDF (unless you do something very nonstandard).

Some of the Password Hashing Competition judges have lamented that they didn't prioritize cache-hardness over memory-hardness, and they recommend bcrypt instead of Argon2 because Argon2 is better as a KDF than a password hash for low latency settings.

I wrote about a lot of these nuances in my blog post, which was linked above.

One of the PHC judges went on to design an algorithm called bscrypt to provide stronger cache-hardness guarantees. But then he found a vulnerability in his own code, and he urged people not to use it until it's fixed. Perhaps in a year or two we can have a successor to the PHC that focuses on password hashing, not password KDFs.

When I said "cryptographers don't like bcrypt", I mean something very specific:

If you walk up to a cryptographer and say, "I'm designing an authentication system. It uses bcrypt for password hashing." you will get one of two responses:

"Is there anything preventing you from using Argon2 or scrypt?"
"Okay. I don't care which you use, as long as it's a password hashing function and not, like, MD5."

Very rarely will you hear, "Okay, yes, thank you for using bcrypt and not scrypt or Argon2."

Separately, OWASP (which is not famously a home for cryptographers but is highly regarded in appsec circles) in particular wants bcrypt to die in a fire precisely because of the 72-char and null-truncation vulnerabilities.

I'm personally in the "as long as it's not, like, SHA256(password), and you're using an actual password hash function" camp, I just wanted to suggest removing the bcrypt footguns.

On Password Shucking

This isn't a real concern for bcrypt(SHA512()) because the previous password hash was phpass, which is based on MD5.

If we were storing the SHA512 hash of the password somewhere, a) that's stupid and b) we should domain-separate our use of SHA512. But I don't think WordPress or any plugins are doing that particular snafu, so it's unlikely to matter.

Revisiting the Misleading Claims About Entropy

The competing proposals looks like this (vaguely):

Hash (or HMAC) the password with SHA-512.
- SHA-384 is morally congruent to SHA-512, since it's the same exact algorithm but with truncation built-in, and a distinct IV for domain separation. The round functions, word size, and operations are preserved between the two. You don't need to care about this distinction.
Base64-encode the hash output from step 1.
Use the output of step 2 with password_hash or password_verify.

SHA-512 is a hash function. Aside from length extension (which isn't relevant to our threat model), you can model any of the SHA-2 family hash functions as a Random Oracle.

SHA-512 maps an input domain of 256^(2^(63)-1) possible inputs to a mere 512 bits. The input isn't guaranteed to be uniformly random (as they are passwords), but the output is.

Base64-encoding the output of a hash function does not "[enable] breach correlation / shucking more easily than raw bcrypt". At the layer of base64, we have a uniformly random distribution of 512 bits, which gets encoded as slightly more characters from the base64 alphabet. There is no entropy loss at the base64 layer.

The base64 encoding does not, at all, "[lose] entropy from ALL input passwords."

The "entropy" of the password chosen is preserved as long as SHA-512 collisions are computationally infeasible.

One objection to my previous statement might be, "But what about the 88 -> 72 truncation?"

This is still secure. 72 characters from the base64 alphabet is still a keyspace representing about 432 bits (or 54 bytes without base64 encoding). This is only 10 bytes short of an untruncated SHA-512, and 6 bytes longer than SHA-384.

The best attack against 432 bits of uncertainty is a birthday collision attack. You'd need about 2^216 samples for a 50% chance of a collision (or 2^144 samples for a 1/2^144 chance of a collision).

2^-100 isn't going to happen, stop trying to make 2^-100 happen .

Any concerns about the reduction of entropy from the strategies we've been discussing are either rooted in a poor understanding of password-based cryptography, or are bikeshedding over differences that might as well be zero to any WordPress user.

Meanwhile, that Okta breach is a real thing that happened because of bcrypt's truncation. Maybe we should learn from real world incidents?

(EDIT: Updated script, increased iteration count to be closer to bcrypt.)

Nov 25 '24 22:11 soatok

Unrelated, I was notified because of this GitHub Actions workflow.

The following contributors have not linked their GitHub and WordPress.org accounts: @soatok.

I'm not going to even have a WordPress.org account until this abomination goes away:

Stupid WPEngine gatekeeping bullshit

I recommend disabling this until WP.org is Free again.

Nov 25 '24 23:11 soatok

Stop it @soatok. You're undermining a clear improvement on md5 with incorrect information:

Okta's breach was caused by using bcrypt outside it's design params as a cache key generator. They prefixed heavily in a way that consumed most of the 72 bytes in bcrypt's input. This PR does the right thing and uses password-only as input and without hashing, as recommended by both OWASP and NIST.

The hobbyist community's concern over collisions between two 73-char passwords passed directly to bcrypt is wrong in two ways:

Most of the examples will have an example where bcrypting 73 chars like aaa...aaa1 and aaa...aaa2 produce the same output. That's some userland mess that doesn't happen as long as you don't pass a salt to PASSWORD_BCRYPT. Because bcrypt internally salts with Blowfish randoms, two users can use the SAME password of ANY length and get different bcrypt outputs.
Anyone using 73+ char passwords is using a generator with some level of randomness and not aaa..., 123...123... or anything remotely simple.

SHA hashing a password does not increase entropy beyond the original password's. SHA may randomize the characters but there's a symbolic (1:1) correlation between those characters and a low-entropy password. Clipping the length of a base64'd string does matter. It reduces the space to store the shortened SHAs of all 10-char passwords.

Regarding cryptographers, NIST does approve of bcrypt and will for several more years. (...it's looking to be past 2029 before FIPS 140-3 validates any modules that might not include bcrypt. More like 2032-2035 if this DOGE department cuts NIST funding).

You are right that Argon2id would be preferable. However the way to make that happen is to push it to the PHP community and not the WP community. md5 however, has been deprecated well over a decade and @johnbillion's approach is the simplest, most recommended way to implement bcrypt.

Nov 26 '24 00:11 mbijon

Stop it @soatok. You're undermining a clear improvement on md5 with incorrect information:

It's generally considered rude to tell strangers to stop "it". Whatever "it" is, in this context. You're nobody's boss here.
Regardless of the choice (which is a one-way door): bcrypt with SHA+base64, or plain bcrypt both materially improve over phpass.

Even if WordPress decides against my recommendations, it's worth discussing the facts so it's an informed decision.

I had rested my case until you came along and posted falsehoods. (Accusing me of "incorrect information" is funny.)

SHA hashing a password does not increase entropy beyond the original password's.

I didn't say it did.

Using SHA2 to hash a password permutes the bits through the entire hash output, which means the reduction of the base64-encoded hash output is not a meaningful security reduction. That is my argument.

I made no claims about the entropy of the input domain, only the output.

SHA may randomize the characters but there's a symbolic (1:1) correlation between those characters and a low-entropy password.

This is a meaningless distinction, with my previous statement in mind.

Clipping the length of a base64'd string does matter. It reduces the space to store the shortened SHAs of all 10-char passwords.

I genuinely do not understand this argument.

If you take a raw SHA-512 hash and encode it with base64, there is no entropy loss due to the encoding of the raw hash, and the truncation of such a large hash still has a sufficiently enormous probability space that it doesn't help attackers.

The best possible attack against a truncated SHA-512 is a collision attack. I provided the risk for bcrypt-sha512+base64: After 2^144 passwords, there is a 2^-144 probability of a collision.

For vanilla bcrypt, it's 2^192 for 2^-192.

If you really want to argue that 2^-192 matters more than 2^-144, I invite you to write a paper for your argument and submit it to a journal for an IACR-associated event (e.g., Real World Cryptography) and then share your inevitable rejection letter publicly.

Analyzing how the algorithms transform data can tell you if there is a meaningful security reduction in the steps being taken. The short answer is: No. The truncation of an encoded SHA-512 hash is still longer (and of the same quality) as a SHA-384 hash, which is what the US government recommends in CNSA. If we were truncating to less than 256 bits, there might be cause for concern, but that's not the recommendation here.

A weak, 10-character password is just as weak whether or not you use SHA-512+base64 as described above. No meaningful weakness is introduced by this process, and it prevents one that has broken real-world systems.

Okta's breach was caused by using bcrypt outside it's design params as a cache key generator. They prefixed heavily in a way that consumed most of the 72 bytes in bcrypt's input. This PR does the right thing and uses password-only as input and without hashing, as recommended by both OWASP and NIST.

The point of writing misuse-resistant cryptography (which is what the bcrypt with SHA-512+base64 is) is that you cannot envision all of the possible ways that an API will be used.

Okta previously used bcrypt in a way that wasn't recommended by experts. Developers do this all the time. See also: JWT libraries that accept alg=none.

If someone decides to write a WordPress plugin that uses the WordPress password hashing code for application passwords, and encodes the passwords as email address || password, and users supply a very long email address:

With phpass, their passwords won't be hashed with a memory-hard password hash which sucks. That's the situation today.
With bcrypt, changing the WordPress password hashing class to vanilla bcrypt will introduce a vulnerability that didn't previously exist, due to the 72 character truncation.
With bcrypt-SHA512+base64, you get the best of both worlds.

You're arguing for the second state, I'm saying the third is the best of both worlds.

In order for it to be correct that I'm "undermining a clear improvement on md5 with incorrect information", I would have to a) be arguing for phpass to continue and b) be stating falsehoods.

Regarding cryptographers, NIST does approve of bcrypt and will for several more years.

This is incorrect. Bcrypt is not a NIST recommend algorithm, and you cannot use it in a FIPS module. You have to use PBKDF2 with a NIST-approved hash function.

NIST does vaguely recommend a "memory hard" function, but they go on to reference Balloon, not bcrypt.

(...it's looking to be past 2029 before FIPS 140-3 validates any modules that might not include bcrypt. More like 2032-2035 if this DOGE department cuts NIST funding).

Are you sure about that?

You are right that Argon2id would be preferable. However the way to make that happen is to push it to the PHP community and not the WP community.

Right. It's a trade-off. I'm not pushing for Argon2id today, I'm telling you what cryptographers prefer today. I know this because cryptography is my job.

Aside: I'm not the first person to recommend this construction. It's had nearly a decade of scrutiny, even if you only look at the PHP community.

It's comical to suggest that vanilla bcrypt is safer than the construction I recommend.

Nov 26 '24 01:11 soatok

Testing There's good test coverage for this change. Here are some manual testing steps that can be taken.

Remaining logged in after the update

We should try testing this with wp session too, both the standard version and any extended implementation if possible - the expectation is that they aren't affected (user would remain logged in), but it wouldn't harm to check.

Note that when a user logs in, WordPress already writes to the database via an UPDATE query on the usermeta table to store their updated user session information in the session_tokens meta field. This happens every time a user logs in.

@johnbillion and I also discussed the potential result of this on large datasets, and assumed it wouldn't result in much higher load, and that a bulk process to update this wouldn't be possible due to how passwords are stored/retrieved. More view on this welcome though!

Nov 26 '24 11:11 pkevan

Another example for reference is laravel/framework#49752 , where they seem to have decided to do nothing.

I made this Laravel PR to put a bcrypt length check into a validation step, rather than having it just truncate silently, but that was rejected too.

Nov 26 '24 11:11 Synchro

Meta comment: I've updated my comment earlier in the thread to clarify one of my statements. I wanted to call attention to this in case it is overlooked.

Nov 26 '24 21:11 soatok

Having read and reviewed this entire PR, I agree with @johnbillion here with retaining the original bcrypt implementation and limitations. Initially I was onboard with SHA384+bcrypt, but after reading, I agree with just bcrypt.

I would suggest that if using bcrypt for >72char is deemed inappropriate, perhaps it should fall back to the current phpass stretched-md5 hashing method, especially if no consensus can be achieved.

That would result in maximum compatibility for existing WordPress users who use the Password hashes outside of WordPress, while also not introducing yet-another-custom-hash into the web where it's not overly obviously necessary, but while still gaining the bcrypt advantages for where it's possible.

Edit: (13/Dec/2024) If the consensus from committers and others here is to pre-hash, I'm onboard with that, this was meant as a "If no consensus can be reached" approach

Nov 28 '24 02:11 dd32

I would suggest that if using bcrypt for >72char is deemed inappropriate, perhaps it should fall back to the current phpass stretched-md5 hashing method, especially if no consensus can be achieved.

I strongly advise against doing this.

While it's generally assumed that the entropy of a 72+ character password is sufficiently good that nobody is going to crack them, even if it's a weaker password hash like phpass, doing so would kill any chance of a world where everyone is on bcrypt or better.

If possible, I'd suggest trying to measure if this is a real problem before committing to it.

/**
 * Is this string longer than the target length?
 * Avoids side-channels.
 * @param string $input
 * @param int $targetLength
 * @return bool true if the length of  $input exceeds $targetLength
 */
function isLongerThanCT(string $input, int $targetLength): bool
{
    $amount = PHP_INT_SIZE === 8 ? 63 : 31;
    $len = strlen($input); // use mb_strlen($input, '8bit'); on older PHP
    $res = (($targetLength - $len) >> $amount) & 1;
    /*
     * It's worth taking a moment to explain what the hell the line above is doing.
     * 
     * ($targetLength - $len) will return a negative value if $targetLength is smaller than $len.
     *
     * By two's complement, negative values when shifted 31 or 63 places to the right will have a lower
     * bit set. We then use a bit mask of `1` and the bitwise `&` operator on the result of the bitshift.
     *
     * End result? It's the same as if we did the following, except constant-time:
     * $res = (int) ($len > $targetLength);
     */
    return (bool) ($res);
}

This function returns true if the string has a length greater than or equal to the target length. This calculation is performed without branches or inequality operators and should therefore have a constant-time execution.

You can use this to measure if the length of users' passwords exceeds some threshold.

Demo: https://3v4l.org/BZK5d

Separately, I wrote a blog post exploring the details of bcrypt with SHA2 titled, Beyond Bcrypt.

Nov 28 '24 13:11 soatok