django icon indicating copy to clipboard operation
django copied to clipboard

Improved performance of csrf_input

Open smithdc1 opened this issue 2 years ago • 2 comments

Hi @kezabelle -- would you be interesting in reviewing this?

The relevant lines from my (messy) IPython tests follow. I see a c.8% gain here as it avoids a number of function calls. I think this is safe as it's a string that we're in control of and we know that we're going to get sane values back as the token. 🤞 the full test suite passes.


In [8]: r = HttpRequest()

In [10]: def csrf_input_new(request):
    ...:     token = get_token(request)
    ...:     return SafeString(
    ...:         f'<input type="hidden" name="csrfmiddlewaretoken" value="{token}">'
    ...:     )
    ...: 

In [13]: %timeit csrf_input_new(r)
252 µs ± 4.58 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [15]: def csrf_input(request):
    ...:     return format_html(
    ...:         '<input type="hidden" name="csrfmiddlewaretoken" value="{}">',
    ...:         get_token(request),
    ...:     )

In [16]: %timeit csrf_input(r)
272 µs ± 2.88 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

p.s. In my testing get_token is takes something in the 250µs-255µs range(!)

smithdc1 avatar Jan 17 '22 21:01 smithdc1

If there is consensus this change is acceptable then there are a few similar cases we should consider. The most obvious being CsrfTokenNode

smithdc1 avatar Jan 18 '22 13:01 smithdc1

Computers forever proving they're hard; I see an improvement, as you did, though the magnitude is different:

In [9]: %timeit csrf_input_new(r)
74.5 µs ± 516 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [10]: %timeit csrf_input(r)
80 µs ± 755 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

The CSRF stuff is complex enough and defensive enough that I can't immediately tell if there's any implications from assuming it's trusted -- it looks like in this case it is maybe trustable, because it'll either give back a masked value based on ascii_letters + digits or I think raise an exception or reject the incoming value (due to length or invalid characters by regex search) elsewhere. I'm less certain about CsrfTokenNode because that resolves the value out of the context, assuming it has come from the context processor ... but what if it didn't? I dunno :)


Edit: From a quick eyeball at the line-profiling, it looks like most of the cost is coming from secrets.choice (which is SystemRandom.choice) so there's probably not much scope for a bigger improvement; whatever can be eked out is probably the best that can be done... But you could possibly make the case for pre-computing the CSRF_ALLOWED_CHARS by index so that you don't need to do chars.index N times in _mask_cipher_secret, which might gain you a little something-something; e.g. CSRF_ALLOWED_INDEXES = {x: i for i, x in enumerate(CSRF_ALLOWED_CHARS)} and then CSRF_ALLOWED_INDEXES[x] for x in secret) or whatever; and re-calculating the len(chars) N times doesn't look necessary. Not sure how much it'd claw in terms of performance, but floating it for you if you wanted to look.

kezabelle avatar Jan 19 '22 17:01 kezabelle