proselint
proselint copied to clipboard
ci(pre-commit): Prevent cache time-of-use/time-of-check problems
Interally, proselint uses caching to speed up analysis, and the corresponding code is written in a way that makes it obvious that it is not intended to run multiple instances of proselint simulatenously.
For example, the memoize
function in tools.py
suffers from
time-of-use vs. time-of-check problems:
if not os.path.isdir(cache_dirname):
# ...
# While this is running, another process may create the
# `cache_dirname` directory.
# ...
os.makedirs(cache_dirname) # <-- Crashes if `cache_dirname` exists
As the code between the check and the directory creating is quite short and does not take a lot of time to execute, it's unlikely to encounter this issue on a reasonably modern desktop computer, but I ran into this issue multiple times when running proselint via pre-commit on a relatively slow low-powered ARM device in a fresh docker container.
It would be preferable to rewrite the caching code with multiprocessing in mind and by incorporating concepts like "Easier To Ask for Forgiveness, not Permission" [1], but for now, it should suffice to disable parallel execution in pre-commit.