
Detect blocking calls in coroutines using BlockBuster

Open cbornet opened this issue 11 months ago • 4 comments

Summary

This PR uses the blockbuster library to detect blocking calls made in the asyncio event loop during unit tests. Avoiding blocking calls is hard because they can be buried deep in the code or made in third-party libraries. BlockBuster makes them easier to detect by raising an exception when a known blocking function (e.g. time.sleep) is called.
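To illustrate the idea only, here is a minimal stdlib sketch of the detection technique: wrap a known blocking function so that it raises when called while an event loop is running in the current thread. This is not BlockBuster's actual implementation or API; `BlockingCallError`, `forbid_in_event_loop`, `run_checked`, and the two handlers are hypothetical names made up for this example.

```python
import asyncio
import functools
import time


class BlockingCallError(RuntimeError):
    """Raised when a patched blocking function runs inside the event loop."""


def forbid_in_event_loop(func):
    """Wrap func so that it raises when called from a running event loop."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No loop is running in this thread, so blocking is harmless here.
            return func(*args, **kwargs)
        raise BlockingCallError(f"blocking call to {func.__name__} in event loop")

    return wrapper


async def bad_handler():
    time.sleep(0.01)  # blocks the event loop


async def good_handler():
    await asyncio.sleep(0.01)  # yields to the event loop instead


def run_checked(coro_func):
    """Run coro_func with time.sleep patched; return the error, or None."""
    original = time.sleep
    time.sleep = forbid_in_event_loop(original)
    try:
        asyncio.run(coro_func())
        return None
    except BlockingCallError as exc:
        return exc
    finally:
        # Always restore the real function, even if the coroutine failed.
        time.sleep = original
```

In a test suite the patching would normally live in a fixture, so every test runs with the check enabled and a blocking call fails the test that triggered it.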

Checklist

  • [x] I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • [x] I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • [x] I've updated the documentation accordingly.

cbornet avatar Feb 01 '25 10:02 cbornet

@cbornet I'm curious about the choice to call out file IO as blocking. In my experience file IO is tricky:

  1. The issue with blocking calls is not whether they are theoretically blocking, but how long they block for. E.g. sorting a list is blocking work, but as long as it takes <1ms it's probably fine. Things get ugly once you get into multi-millisecond blocks, since IO protocols that expect a regular tick may start timing out, causing cascading failures.
  2. File IO can of course block for a long time (e.g. reading GBs of data in one go) but it can also be very fast (a small operation on an SSD). And it's hard to know ahead of time which one it will be.
  3. The overhead of starting up threads is non-negligible, and anyio's to_thread has a relatively low default limit on concurrent threads (40 last time I checked), which is easy to exhaust accidentally, possibly leading to deadlocks.

Because of this my general approach has been to be conservative with chucking things into threads, especially file IO since it's often not a problem in practice.
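One way to check how long something actually blocks, rather than whether it blocks in theory, is to measure event-loop lag with a heartbeat task. The sketch below uses only the standard library; `measure_max_lag`, `short_block`, and `long_block` are hypothetical names for this example, and the 50 ms sleep simply stands in for a slow file operation.

```python
import asyncio
import time


async def measure_max_lag(work, interval=0.001):
    """Run the coroutine function `work`; return the worst heartbeat delay (s)."""
    max_lag = 0.0
    done = False

    async def heartbeat():
        nonlocal max_lag
        while not done:
            start = time.perf_counter()
            await asyncio.sleep(interval)
            # Any time beyond `interval` means the loop was busy elsewhere.
            lag = time.perf_counter() - start - interval
            max_lag = max(max_lag, lag)

    task = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)  # let the heartbeat start its first tick
    await work()
    done = True
    await task  # the heartbeat records its last tick, then exits
    return max_lag


async def short_block():
    sorted(range(1000))  # blocking, but very brief
    await asyncio.sleep(0.05)


async def long_block():
    time.sleep(0.05)  # holds the event loop for about 50 ms
```

Running `asyncio.run(measure_max_lag(long_block))` should report a lag close to the 50 ms block, while `short_block` should stay far below it on an idle machine.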

adriangb avatar Feb 01 '25 15:02 adriangb

@adriangb thanks for your feedback. I agree that the impact of a blocking call depends a lot on how long it blocks. But how can you know in advance? Things can run perfectly well on your laptop or in CI with SSDs and fall apart when you deploy to AWS with slow EFS network disks (a bad recent personal experience). From what I’ve seen, file operations are generally deferred to threads (or, on Linux, you can use aiofile, which has true async support). And I see that starlette already does this in a bunch of places (e.g. UploadFile.write). Starting a thread has a cost, but I think anyio uses a thread pool? Anyway, if you don’t want to use threads, it’s possible to:

  • set exemptions in the places detected by blockbuster. This way, you still have a kind of warning that something could be better.
  • or completely disable blockbuster's detection of blocking file IO.
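For reference, deferring a file operation to a thread can be sketched with the standard library alone. This uses `asyncio.to_thread` (Python 3.9+) rather than anyio, which is an assumption made to keep the example dependency-free; `read_bytes_blocking`, `read_bytes_async`, and `demo` are hypothetical names.

```python
import asyncio
import os
import tempfile
from pathlib import Path


def read_bytes_blocking(path):
    """Plain blocking read; meant to be called from a worker thread."""
    return Path(path).read_bytes()


async def read_bytes_async(path):
    """Read a file in a worker thread so the event loop stays responsive."""
    return await asyncio.to_thread(read_bytes_blocking, path)


async def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        # Setup only: this write is itself blocking and kept inline for brevity.
        Path(path).write_bytes(b"hello")
        return await read_bytes_async(path)
```

The trade-off raised earlier in the thread still applies: each call occupies a pool thread for its duration, so this pays off mainly when the read may be slow.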

cbornet avatar Feb 01 '25 19:02 cbornet

Hi. Could you provide guidance on what I should do for this PR?

cbornet avatar Feb 24 '25 17:02 cbornet

Hi. I rebased. Is there anything I can do to make progress on this PR?

cbornet avatar Apr 06 '25 16:04 cbornet