password_compat

Generic formula to calculate $raw_salt_len

Open jcoetzee opened this issue 11 years ago • 2 comments

Added a generic formula to calculate $raw_salt_len. It always returns the least amount of raw data required for a given $required_salt_len.

jcoetzee, May 14 '13 15:05

No, it's not wrong. base64('123') = 'MTIz'. 3 characters are not enough to satisfy a salt length of 5, because the encoding has only 4 digits. But 4 characters are enough: base64('1234') = 'MTIzNA==', which has 6 digits before the padding.
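The two encodings quoted above can be checked with a short script. This is only an illustrative sketch in Python rather than the library's PHP. The helper name `digit_count` is made up for this example, and it counts Base64 digits without the `=` padding.

```python
import base64

def digit_count(encoded: str) -> int:
    """Count the Base64 digits in a string, ignoring '=' padding."""
    return len(encoded.rstrip("="))

# The two inputs used in the comment above.
three = base64.b64encode(b"123").decode("ascii")
four = base64.b64encode(b"1234").decode("ascii")

print(three, digit_count(three))  # MTIz 4
print(four, digit_count(four))    # MTIzNA== 6
```

So 3 input bytes yield 4 digits, which is short of 5, while 4 input bytes yield 6 digits.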

jcoetzee, Dec 12 '13 16:12

I see now where you're coming from: Given a number n of Base64 digits, you calculate the number of bytes so that the Base64 encoding will have at least n digits.
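That reading can be made concrete with a brute-force search. This is a sketch of the interpretation only, not the formula proposed in the issue, which is not quoted in this thread. The function names are hypothetical, and padding characters are assumed not to count as digits.

```python
import base64

def unpadded_base64_len(num_bytes: int) -> int:
    """Number of Base64 digits (without '=' padding) for num_bytes of input."""
    encoded = base64.b64encode(bytes(num_bytes))
    return len(encoded.rstrip(b"="))

def min_raw_bytes(required_digits: int) -> int:
    """Smallest byte count whose Base64 encoding has at least required_digits digits."""
    num_bytes = 0
    while unpadded_base64_len(num_bytes) < required_digits:
        num_bytes += 1
    return num_bytes

print(min_raw_bytes(2))  # 1
print(min_raw_bytes(5))  # 4
```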

You can easily see that this makes no sense when you look at the bits. No matter how you interpret the meaning of the variables, you get inconsistent results:

Let $required_salt_len = 2. That's 2 * 6 = 12 bits we can express with the Base64 digits. But your formula tells us to only generate 1 byte (8 bits) of raw salt, which is less than the 12 bits.

Now take $required_salt_len = 5. That's 5 * 6 = 30 bits. But you tell us to generate 4 bytes (32 bits) of raw salt, which is more than the 30 bits.

These can't both be right.
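The bit counts in the two cases can be tabulated as follows. This is a small sketch: the (digits, bytes) pairs are taken from the comment above, while the function name and verdict strings are invented for illustration.

```python
BITS_PER_DIGIT = 6  # one Base64 digit encodes 6 bits
BITS_PER_BYTE = 8

# (required_salt_len, raw_salt_len) pairs quoted in the comment above.
cases = [(2, 1), (5, 4)]

def compare_bits(required_salt_len: int, raw_salt_len: int):
    """Return (digit_bits, raw_bits, verdict) for one pair."""
    digit_bits = required_salt_len * BITS_PER_DIGIT
    raw_bits = raw_salt_len * BITS_PER_BYTE
    if raw_bits < digit_bits:
        verdict = "too few"
    elif raw_bits > digit_bits:
        verdict = "too many"
    else:
        verdict = "exact"
    return digit_bits, raw_bits, verdict

for required, raw in cases:
    print(required, raw, compare_bits(required, raw))
```

The first pair undershoots (8 bits against 12) and the second overshoots (32 bits against 30).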

The general problem is that you cannot simply derive the number of bits required by the algorithm if all you have is the number of Base64 digits. For example, bcrypt uses 22 Base64 digits. This gives us a theoretical salt length of 132 bits. To cover all those bits, we would need 17 bytes of raw salt. In reality, however, bcrypt expects 16 bytes of salt. The remaining 4 bits in the Base64 representation are not used. This can only be explained with the inner workings of bcrypt. There is no general formula for that.
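The bcrypt arithmetic works out as below. It is a sketch using only the figures stated in the paragraph above, and the variable names are chosen for this example.

```python
import math

BITS_PER_DIGIT = 6
BITS_PER_BYTE = 8

bcrypt_digits = 22      # Base64 digits in a bcrypt salt (from the text above)
bcrypt_salt_bytes = 16  # raw salt bytes bcrypt expects (from the text above)

theoretical_bits = bcrypt_digits * BITS_PER_DIGIT                # 132
bytes_to_cover = math.ceil(theoretical_bits / BITS_PER_BYTE)     # 17
unused_bits = theoretical_bits - bcrypt_salt_bytes * BITS_PER_BYTE  # 4

print(theoretical_bits, bytes_to_cover, unused_bits)
```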

I see two solutions:

1. We assume that all algorithms operate on full bytes of salt. So we'll never have something like 132 bits of salt, only 8, 16, 24, 32, ... bits. The value of $raw_salt_len would then be the maximum number of bytes such that $raw_salt_len * 8 does not exceed $required_salt_len * 6. This gives us the formula I posted above. Actually, it would make more sense to reverse the formula so that $required_salt_len is calculated from $raw_salt_len.

2. Or we simply hard-code the data, which is what the library currently does. In case some future algorithm doesn't use full bytes of salt, we'll have to go down to bits.
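Both options can be sketched in a few lines. The function names and the `HARD_CODED` table are hypothetical, the "reversed" formula is one plausible reading (the number of digits needed to cover all raw bits), and the bcrypt values are the ones quoted earlier in this thread, not code taken from the library.

```python
def raw_salt_len_from_digits(required_salt_len: int) -> int:
    """Option 1: largest whole number of bytes that fits into the Base64 digits."""
    return (required_salt_len * 6) // 8

def required_salt_len_from_bytes(raw_salt_len: int) -> int:
    """Option 1 reversed (assumed reading): digits needed to hold all raw bits, rounded up."""
    return (raw_salt_len * 8 + 5) // 6

# Option 2: a hard-coded table per algorithm.
HARD_CODED = {
    "bcrypt": {"raw_salt_len": 16, "required_salt_len": 22},
}

print(raw_salt_len_from_digits(22))      # 16
print(required_salt_len_from_bytes(16))  # 22
```

For bcrypt, both directions of the full-byte formula agree with the hard-coded pair.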

But your formula is definitely wrong.

Jacques1, Dec 12 '13 20:12