flask-security icon indicating copy to clipboard operation
flask-security copied to clipboard

Slow token authentication

Open KPilnacek opened this issue 7 years ago • 26 comments

Hello guys,

at my current job, I happened to be a part of a backend team, which is creating an API. The API should be then served to javascript application and needs to be quite fast (100 ms or so). However, it is not. After some profiling, we figured out that it is the token authentication in Flask-security, which is holding us back (pleas see the MWE).

MWE

import flask
from flask_security import Security, SQLAlchemyUserDatastore, UserMixin, RoleMixin, auth_required
from flask_sqlalchemy import SQLAlchemy

app = flask.Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:////tmp/database.sqlite3'
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
app.config['WTF_CSRF_ENABLED'] = False
app.config['SECURITY_TOKEN_AUTHENTICATION_HEADER'] = 'Authorization'
app.config['SECURITY_PASSWORD_HASH'] = 'pbkdf2_sha512'
app.config['SECURITY_PASSWORD_SALT'] = b'secret'
app.config['SECRET_KEY'] = "super_secret"

db = SQLAlchemy(app)

roles_users = db.Table('roles_users',
                       db.Column('user_id', db.Integer(), db.ForeignKey('user.id')),
                       db.Column('role_id', db.Integer(), db.ForeignKey('role.id')))


class Role(db.Model, RoleMixin):
    id = db.Column(db.Integer(), primary_key=True)
    name = db.Column(db.String(80), unique=True)
    description = db.Column(db.String(255))


class User(db.Model, UserMixin):
    id = db.Column(db.Integer, primary_key=True)
    email = db.Column(db.String(255), unique=True)
    password = db.Column(db.String(255))
    active = db.Column(db.Boolean())
    confirmed_at = db.Column(db.DateTime())
    roles = db.relationship('Role', secondary=roles_users,
                            backref=db.backref('users', lazy='dynamic'))


user_datastore = SQLAlchemyUserDatastore(db, User, Role)

# Setup Flask-Security
security = Security(app, user_datastore)

db.drop_all()
db.create_all()

admin_role = Role(**{'name': 'admin', 'description': 'Admin role'})
db.session.add(admin_role)
db.session.commit()

user_datastore.create_user(email='[email protected]', password='test', active=True, roles=[Role.query.first()])
db.session.commit()


@app.route('/')
@auth_required('basic', 'token')
def hello():
    return flask.jsonify({'hello': 'world'})


if __name__ == '__main__':
    app.run(debug=True)

Basic authentication timing

The timings are perfect (bellow 100 ms) but that is not the way we should do it.

time curl http://127.0.0.1:5000/ -u "[email protected]:test"
{
  "hello": "world"
}


real    0m0.076s
user   0m0.008s
sys     0m0.006s

Token authentication timing

Getting the token is OK.

curl -H "Content-Type: application/json" -X POST -d '{"email":"[email protected]","password":"test"}' http://127.0.0.1:5000/login

{
  "meta": {
    "code": 200
  }, 
  "response": {
    "user": {
      "authentication_token": "WyIxIiwiJDUkcm91bmRzPTUzNTAwMCRFRUpLRFNONlB2L1hzL2lRJDhMWFZvZlpLMmVoa1BVdWtpRlhUR1lvNEJ3T3FjS3dKMVhVWGlOczRwZDMiXQ.DOLjcQ.oBrT4gr1m49rISyxhaj9Lxu1VNk", 
      "id": "1"
    }
  }
}

But than the request is terribly slow. The timings are 20 times slower :flushed: :open_mouth:.

time curl "http://127.0.0.1:5000/?auth_token=WyIxIiwiJDUkcm91bmRzPTUzNTAwMCRFRUpLRFNONlB2L1hzL2lRJDhMWFZvZlpLMmVoa1BVdWtpRlhUR1lvNEJ3T3FjS3dKMVhVWGlOczRwZDMiXQ.DOLjcQ.oBrT4gr1m49rISyxhaj9Lxu1VNk"
{
  "hello": "world"
}

real    0m2.371s
user   0m0.005s
sys     0m0.006s

What with that???

I know that Flask-security wraps together several other flask security packages (Flask-login, Flask-WTF, ...).

  1. Do you know what could be the cause? (is it Flask-security or Flask-login or something deeper?)
  2. It seems that the hashing algorithm, which is slow, is running for every request. However, it might not be necessary to do it every time. It should be enough, to only store the token and check if the incoming token is the same as the stored one. Is there a way to do it like that (either with Flask-security or not)?
  3. Can I set the app (app.config) a different way to make it faster (still using the token auth.)?
  4. Is there a workaround (still using Flask-security)?
  5. Shall I write it myself? Is it the Flask-security holding us back?
  6. Any other ideas?

I have cross-posted this issue on StackOverflow.

KPilnacek avatar Nov 07 '17 11:11 KPilnacek

I suspect the database. Can you check the issued queries (with EXPLAIN) to check if you have all necessary indexes?

jirikuncar avatar Nov 07 '17 12:11 jirikuncar

I don't think that it is the database. It has only two tables with one record each. However, if I run profiler for the GET method with auth_token I get the following stats

Tue Nov  7 14:02:31 2017    /tmp/elis_api_profile/GET.root.002310ms.1510059751.prof
         5125 function calls (4946 primitive calls) in 2.310 seconds
   Ordered by: internal time
   List reduced from 733 to 20 due to restriction <20>
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    2.299    1.150    2.299    1.150 {built-in method _crypt.crypt}
        4    0.001    0.000    0.001    0.000 {method 'execute' of 'sqlite3.Cursor' objects}
      387    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
  212/185    0.000    0.000    0.002    0.000 {built-in method builtins.getattr}
        2    0.000    0.000    2.299    1.150 /usr/lib/python3.6/crypt.py:35(crypt)
       91    0.000    0.000    0.000    0.000 .../python3.6/site-packages/werkzeug/local.py:160(top)
  128/124    0.000    0.000    0.000    0.000 {built-in method builtins.hasattr}
       12    0.000    0.000    0.000    0.000 .../python3.6/site-packages/sqlalchemy/sql/selectable.py:2749(_froms)
        4    0.000    0.000    0.000    0.000 .../python3.6/site-packages/sqlalchemy/sql/visitors.py:210(iterate)
        1    0.000    0.000    0.000    0.000 {built-in method _sqlite3.connect}
       24    0.000    0.000    0.000    0.000 .../python3.6/site-packages/sqlalchemy/sql/compiler.py:676(visit_column)
    81/40    0.000    0.000    1.144    0.029 .../python3.6/site-packages/werkzeug/local.py:300(_get_current_object)
       95    0.000    0.000    0.000    0.000 .../python3.6/site-packages/werkzeug/local.py:68(__getattr__)
       18    0.000    0.000    0.000    0.000 .../python3.6/site-packages/sqlalchemy/sql/compiler.py:633(visit_label)
       18    0.000    0.000    0.001    0.000 .../python3.6/site-packages/sqlalchemy/sql/compiler.py:1421(_label_select_column)
        4    0.000    0.000    0.000    0.000 .../python3.6/site-packages/sqlalchemy/orm/loading.py:273(_instance_processor)
     67/4    0.000    0.000    0.001    0.000 .../python3.6/site-packages/sqlalchemy/sql/visitors.py:75(_compiler_dispatch)
        1    0.000    0.000    0.000    0.000 {method 'close' of 'sqlite3.Connection' objects}
    62/45    0.000    0.000    0.002    0.000 .../python3.6/site-packages/werkzeug/local.py:344(__getattr__)
        1    0.000    0.000    0.000    0.000 .../python3.6/site-packages/werkzeug/routing.py:1253(bind_to_environ)

The builtin method crypt eats all the time. And I don't believe it can access the database.

KPilnacek avatar Nov 07 '17 13:11 KPilnacek

White testing your example in clean virtualenv, I have found following warning.

[...]/lib/python3.6/site-packages/passlib/crypto/scrypt/__init__.py:127: PasslibSecurityWarning: Using builtin scrypt backend, which is 100x slower than is required for adequate security. Installing scrypt support (via 'pip install scrypt') is strongly recommended

Can you check if you get it while importing passlib.hash?

jirikuncar avatar Nov 07 '17 14:11 jirikuncar

I can also confirm this issue. At my work we moved away from Flask-Security for our API microservices as this was a huge bottleneck.

See #663 which is the same problem, only exhibited by our test suite. Switching to plaintext for password hashing is not a good idea outside of tests :)

mafrosis avatar Nov 07 '17 14:11 mafrosis

Can you also check the performance of the build-in function separately?

In [2]: %timeit crypt('password', 'secret')
3.15 µs ± 55.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

(_crypt.__file__ is [...]/lib/python3.6/lib-dynload/_crypt.cpython-36m-darwin.so)

jirikuncar avatar Nov 07 '17 14:11 jirikuncar

No warning and installing scrypt doesn't improve times.

Python 3.6.3 (default, Oct  3 2017, 21:45:48) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import passlib.hash
>>> quit()

KPilnacek avatar Nov 07 '17 14:11 KPilnacek

timeit(setup="import crypt", stmt="crypt.crypt('password', 'secret')", number=2)/2
2.0448000213946216e-05

And these are actual parameters for crypt...

timeit(setup="import crypt",
       stmt="crypt.crypt('$pbkdf2-sha512$25000$2hsDICSk1Pq/VyqltFZqLQ$O6TlnzzR8/U32Zpg/IGopscAvem3QbGJIXCKJCC0ZwIZpY2JcU4vgcSEsP4RVtoMpPLU5lsXEKouqdIZycbPTA', '$5$rounds=535000$CnKACJ4960/7u1F6$kWG4xrwa3E6nPaP42BhEOup3GUDhMhoHZKbsvPkeXk0')", number=2)/2
1.1937105525012157

KPilnacek avatar Nov 07 '17 14:11 KPilnacek

The hashing scheme used by default is intentionally slow - I believe the bug is that the hash is calculated on every request, which it shouldn't be.

mafrosis avatar Nov 07 '17 14:11 mafrosis

I believe the same.

KPilnacek avatar Nov 07 '17 14:11 KPilnacek

@KPilnacek did you manage to work around this in any way? I'm having the same issue.

caalle avatar Nov 20 '17 14:11 caalle

SECURITY_HASHING_SCHEMES default to sha256_crypt options are:

_allowed_password_hash_schemes = [
 -    'bcrypt',
 -    'des_crypt',
 -    'pbkdf2_sha256',
 -    'pbkdf2_sha512',
 -    'sha256_crypt',
 -    'sha512_crypt',
 -    # And always last one...
 -    'plaintext'
 -]

such as:

    SECURITY_HASHING_SCHEMES = ['pbkdf2_sha512']
    SECURITY_DEPRECATED_HASHING_SCHEMES = []

hardy4yooz avatar Nov 21 '17 09:11 hardy4yooz

Having the same problem. We also have an SLA of 100 ms, and are running into the exact same issue -- around 2.4 seconds for a request, with most of that coming from crypt.

alison-f-cooper avatar Dec 18 '17 15:12 alison-f-cooper

Just a question -- you folks using Linux, on some kind of VM? Install haveged, start it, try again.

pahrohfit avatar Jan 02 '18 22:01 pahrohfit

@caalle at first, sorry for the delay. Well, we bypassed flask-security as we did not find any way to improve the speed. This was done by my colleague, so I can only provide some insights from what I see in the code.

He added a custom loader to flask-security.login_manager

# add our custom callback for user token authorization
security.login_manager.request_loader(check_user_auth_token)

In the function check_user_auth_token, he implemented token deserialization via itsdangerous.URLSafeTimedSerializer, which, and now I'm only guessing, is faster than the implementation in flask-security but does the same job.

However, throughout our time with flask-security, we ran to so many issues (mostly related to its usage with REST API) that we decided to abandon it completely. Moreover, we are thinking about switching to Django and its Django REST Framework.

KPilnacek avatar Jan 03 '18 08:01 KPilnacek

@KPilnacek thank you for providing more info. Too bad you had to switch.

I'll give it some more testing before making a switch to something more lightweight.

caalle avatar Jan 05 '18 09:01 caalle

Came across this thread as I'm updating to Flask-Security v3.0.0 and came across this same issue. Previously, I had already accounted for skipping the slowness caused by the hashing in the login process, but now it seems intense hashing is also done when generating and checking the auth token, which happens in the majority of my tests.

It seems like the "old" functionality can be restored via the following settings:

'SECURITY_HASHING_SCHEMES' = ['hex_md5']
'SECURITY_DEPRECATED_HASHING_SCHEMES' = []

I'm sure this has security implications. But for my application, it's not worth the performance impact. I would assume the same is true for most applications. Adding a 1-2 second overhead to every authenticated endpoint is not acceptable. Is there something I/we are doing wrong here?

ibushong avatar Feb 01 '18 07:02 ibushong

See #663 @ibushong.

Problem is that the bcrypt hash is being calculated on every request- which it doesn't need to be. Bcrypt is slow by design.

mafrosis avatar Feb 01 '18 08:02 mafrosis

(I'll preface this by saying by crypto knowledge is very basic, so anyone please correct me as needed)

@mafrosis Well, to clarify, and as far as I can tell, there are 2 main places where this hashing is being done:

  1. When hashing or checking a user's password
  2. When generating or checking an authentication token

Case #2 is what was significantly slowed down in v3 (with default settings).

Case #1 is certainly important to employ a slow hashing function, to prevent brute force attacks. However, I don't see the need to slow down auth token generation/checking. Brute-forcing an authentication token seems extremely impossible, since they are such long, complicated strings.

ibushong avatar Feb 01 '18 08:02 ibushong

This is a pretty big issue that is causing more than a few people to not use or switch away from flask-security :(

mgd722 avatar Feb 08 '18 03:02 mgd722

Just making a suggestion here and it seems to only be affecting API/token logins from what I am seeing from this and other threads. If this is incorrect please advise. With that assumption could a potential solution be to use a sort of session token once authentication is done. This way it then doesn't have to utilize bcrypt or one of the more robust crypto protocols so that once a user is authenticated a simpler token can then be utilized for the API calls (similar to what was used previously)? After x amount of time without usage it would expire (could also be associated with the IP that was authenticated) and the user would then need to login again and get another session token. Something similar is utilized by a file sharing app that I use for its API calls.

biomap avatar Feb 27 '18 06:02 biomap

this issue is quite annoying, one simple call takes 600ms+, while it takes <100ms when there was no token authentication involved.

zhookz avatar Mar 25 '18 05:03 zhookz

Install and run “haveged” and try again.

From: zhookz [email protected](mailto:[email protected]) Reply: mattupstate/flask-security [email protected](mailto:[email protected]) Date: March 25, 2018 at 01:14:42 To: mattupstate/flask-security [email protected](mailto:[email protected]) CC: pahrohfit [email protected](mailto:[email protected]), Comment [email protected](mailto:[email protected]) Subject: Re: [mattupstate/flask-security] Slow token authentication (#731)

this issue is quite annoying, one simple call takes 600ms+, while it takes <100ms when there was no token authentication involved.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub(https://github.com/mattupstate/flask-security/issues/731#issuecomment-375946204), or mute the thread(https://github.com/notifications/unsubscribe-auth/ABf0cmikTsAfxrY2pi56IrxJpn1kAN-Nks5thyfCgaJpZM4QUrF3).

pahrohfit avatar Mar 25 '18 05:03 pahrohfit

I have the same issue...

waylonwang avatar Oct 20 '18 16:10 waylonwang

(I'll preface this by saying by crypto knowledge is very basic, so anyone please correct me as needed)

@mafrosis Well, to clarify, and as far as I can tell, there are 2 main places where this hashing is being done:

  1. When hashing or checking a user's password
  2. When generating or checking an authentication token

Case #2 is what was significantly slowed down in v3 (with default settings).

Case #1 is certainly important to employ a slow hashing function, to prevent brute force attacks. However, I don't see the need to slow down auth token generation/checking. Brute-forcing an authentication token seems extremely impossible, since they are such long, complicated strings.

It's working, thank you very much!

waylonwang avatar Oct 20 '18 16:10 waylonwang

@jirikuncar, @lnielsen, @mattupstate any thoughts on the suggested solution by @biomap?

If I understand right – and please correct me, If I did get it wrong – it is the hash_data(self.password) in the UserMixin’s get_auth_token() that causes the slow down: https://github.com/mattupstate/flask-security/blob/develop/flask_security/core.py#L396

So the goal seems to be to provide an auth token which is somehow dependent on the password (so that if users change their password it is not valid anymore, see https://github.com/mattupstate/flask-security/issues/771). In the same time it should not directly depend on the password, because then we would have to encrypt it (time-consumingly) – in order to not risk its recreation from the token.

So if there would be a token that gets created on user login, stored to a user, and removed on password change and logout, and this token would have nothing to do with the actual password, would that work?

This would mean some storage/db access overhead when logging in/out + changing the password, but would entirely decouple the auth token from the password, i.e. it could be handled without a time consuming encryption.

Any thoughts? We are facing this issue for quite some while now in a production app, so I’d love to get rid of it. Let me know if I can help in any way.

paulgoetze avatar Mar 11 '19 16:03 paulgoetze

I did this : https://github.com/mattupstate/flask-security/pull/839 this issue is almost the same as : https://github.com/mattupstate/flask-security/issues/771

eregnier avatar Apr 29 '19 19:04 eregnier