Consider the future of `passlib` and password hashing/upgrading

Open miketheman opened this issue 1 year ago • 8 comments

We use passlib in warehouse as part of our user account management service:

https://github.com/pypi/warehouse/blob/26a3446ada6c2db27e6e608d81508ca25018f389/warehouse/accounts/services.py#L79-L93

The TL;DR is that this allows user password hash algorithms to evolve over time: as users log in, their passwords are verified against the stored hash and re-hashed with the newer (presumably more secure) algorithm, so nobody needs to reset a password just to get the latest and greatest algorithm.
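To make the upgrade-on-login idea concrete, here is a minimal sketch using only the standard library. It is not passlib's implementation and not warehouse's configuration: the scheme names, iteration counts, and the `<scheme>$<salt>$<digest>` storage format are assumptions made up for illustration. Only the `(verified, new_hash_or_None)` return shape is borrowed from passlib's verify_and_update().

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Hypothetical scheme table (name -> (digest, iterations)); illustration only.
SCHEMES = {
    "pbkdf2-sha1": ("sha1", 1_000),       # stands in for a deprecated scheme
    "pbkdf2-sha256": ("sha256", 50_000),  # stands in for the preferred scheme
}
PREFERRED = "pbkdf2-sha256"


def hash_password(password: str, scheme: str = PREFERRED) -> str:
    """Return '<scheme>$<salt hex>$<digest hex>' for the given scheme."""
    digest, rounds = SCHEMES[scheme]
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac(digest, password.encode(), salt, rounds)
    return f"{scheme}${salt.hex()}${derived.hex()}"


def verify_and_update(password: str, stored: str) -> Tuple[bool, Optional[str]]:
    """Verify a password; return a replacement hash if the scheme is outdated."""
    scheme, salt_hex, derived_hex = stored.split("$")
    digest, rounds = SCHEMES[scheme]
    candidate = hashlib.pbkdf2_hmac(
        digest, password.encode(), bytes.fromhex(salt_hex), rounds
    )
    if not hmac.compare_digest(candidate.hex(), derived_hex):
        return False, None
    # Only a successful login can trigger a re-hash, because that is the
    # only moment the plaintext password is available.
    new_hash = hash_password(password) if scheme != PREFERRED else None
    return True, new_hash
```

On login, the caller would store `new_hash` in place of the old value whenever it is not `None`. Users who never log in keep their old hash, which is why the older algorithms have to stay verifiable.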

passlib hasher docs can be found here: https://passlib.readthedocs.io/en/stable/lib/passlib.hash.html

The most recent release of passlib was in 2020, and it raises warnings for using crypt, which will turn into breakages under Python 3.13. This is not yet a blocker, but it's something we should consider long before it becomes one.

Here's an issue for maintenance status that has yet to be resolved, either by nominating new maintainers, or some other resolution.

In the interim another contender has emerged - pwdlib (author launch blog post), which appears to have argon2 and bcrypt support.

So in theory, we could adopt pwdlib and keep the hash-upgrading behavior. However, we'd still need to account for folks who have yet to log in in modern times, and for the lack of older algo support in pwdlib. pwdlib also does not yet support disable() or is_enabled(), which we use today, but those could be replaced by a boolean flag such as User.is_active.
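As a rough sketch of the flag-based alternative to disable()/is_enabled(): the `User` dataclass, the `check_login` helper, and the `verify` callable below are all hypothetical names for illustration, not warehouse's actual model or service code. The point is only that the "disabled" state lives on the user record instead of being encoded in the password column.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class User:
    # Hypothetical stand-in for the real user model.
    username: str
    password_hash: str
    is_active: bool = True


def check_login(
    user: User, password: str, verify: Callable[[str, str], bool]
) -> bool:
    """Reject inactive users before the password hash is even consulted."""
    if not user.is_active:
        return False
    return verify(password, user.password_hash)
```

Here `verify` would be whatever the hashing library provides. The stored hash stays intact while the account is disabled, so re-enabling the user does not require a password reset.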

Alternately, since it's not urgent yet, we can continue to observe the evolving space around passlib and hope that a new maintenance team arises before it becomes a severe issue.


Some SQL counting:

warehouse=> SELECT
  CASE
    WHEN password LIKE '$argon2%' THEN 'argon2'
    WHEN password LIKE '$bcrypt-sha256%' THEN 'bcrypt_sha256'
    WHEN password LIKE '$2b$%' THEN 'bcrypt'
    WHEN password LIKE 'bcrypt$%' THEN 'django_bcrypt'
    WHEN password LIKE 'spammer' THEN 'disabled'
    WHEN password LIKE '!' THEN 'disabled'
    ELSE 'other'
  END AS hash_type,
  COUNT(*)
FROM users
GROUP BY hash_type
ORDER BY COUNT(*) DESC;
   hash_type   | count
---------------+--------
 argon2        | 671847
 bcrypt_sha256 |  76452
 disabled      |  28087
 django_bcrypt |  10930
(4 rows)
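The same prefix rules can be expressed in Python, for example to decide at login time which code path a stored value needs. This is a sketch that mirrors the CASE expression above; the function names are made up for illustration.

```python
from collections import Counter
from typing import Iterable


def classify_hash(stored: str) -> str:
    """Mirror the CASE expression above: the first matching rule wins."""
    if stored.startswith("$argon2"):
        return "argon2"
    if stored.startswith("$bcrypt-sha256"):
        return "bcrypt_sha256"
    if stored.startswith("$2b$"):
        return "bcrypt"
    if stored.startswith("bcrypt$"):
        return "django_bcrypt"
    # These two are exact matches in the SQL (LIKE without wildcards).
    if stored in ("spammer", "!"):
        return "disabled"
    return "other"


def count_hash_types(stored_hashes: Iterable[str]) -> Counter:
    """Count stored values by hash type, like the GROUP BY above."""
    return Counter(classify_hash(h) for h in stored_hashes)
```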

miketheman avatar Feb 22 '24 22:02 miketheman

Some relevant things: https://github.com/canonical/cloud-init/issues/4791

JacobCoffee avatar Jul 10 '24 19:07 JacobCoffee

Hey, pwdlib maintainer here 👋

I understand this is not a critical priority for warehouse at the moment, but if you need specific features and/or algorithms so you can replace passlib, I would be glad to discuss it 🙂

frankie567 avatar Aug 21 '24 04:08 frankie567

Hi @frankie567 ! Thanks for asking.

We currently use passlib "lightly" in warehouse. I linked above to the CryptContext in use, which lists the algos still in use. I don't think pwdlib has support yet for the algos we have in use today.

We also use the verify_and_update() function, which does most of the heavy lifting:

https://github.com/pypi/warehouse/blob/c59d8bbe3918f586a9e37959451882be742a1849/warehouse/accounts/services.py#L225-L230

Other functions we use are hash() and verify(), which I think pwdlib supports, and disable() and is_enabled(), which I don't think are supported yet.

miketheman avatar Aug 27 '24 18:08 miketheman

Thank you for those details, Mike 👍

I confirm pwdlib supports verify_and_update, hash and verify.

I'll have a look at the algorithms you use. Regarding the enable/disable feature, I'll consider it, although I currently believe this should be handled at the user level (with a flag like you suggest) rather than at the password level.

frankie567 avatar Aug 28 '24 11:08 frankie567