Better caching

Open belav opened this issue 3 years ago • 2 comments

When using CSharpier.MsBuild in a solution with multiple projects, those projects will often be formatted at the same time.

The CSharpier cache reads a single file when it starts up, then replaces that file when it finishes. This leads to concurrent formats overwriting the cache file without taking into account any changes that occurred while formatting was taking place. The problem will eventually start to sort itself out, but the cache will never actually be completely up to date.
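To make the failure mode concrete, here is a minimal sketch in Python (not CSharpier's actual C# code). The helper names, the JSON layout, and the file and hash values are all hypothetical. It only assumes what is described above: each run loads the whole file at startup and replaces it when it finishes.

```python
import json
import os
import tempfile


def load_cache(path):
    """Read the whole cache file once, as a run does at startup."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_cache(path, cache):
    """Replace the whole cache file, as a run does when it finishes."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cache, f)


def simulate_overlapping_runs(path):
    # Both runs start before either has finished, so both load the same state.
    run_a = load_cache(path)
    run_b = load_cache(path)
    run_a["ProjectA/Foo.cs"] = "hash-a"  # hypothetical cache entries
    run_b["ProjectB/Bar.cs"] = "hash-b"
    save_cache(path, run_a)
    save_cache(path, run_b)  # replaces the file, so run A's entry is dropped
    return load_cache(path)


with tempfile.TemporaryDirectory() as tmp:
    lost_update_result = simulate_overlapping_runs(os.path.join(tmp, "cache.json"))
    print(lost_update_result)  # only run B's entry is left
```

The file that survives is valid, but it only knows about whichever run finished last. The other project's files get reformatted on the next run even though nothing changed.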

I can think of a few solutions to the problem.

  1. Store more than a single formatting file - maybe each directory that CSharpier targets should have its own file. But then you don't get as much of an advantage if CSharpier targets subfolders or individual files.
  2. Switch to something like SQLite, which seems like overkill.
  3. Before writing the file, reread it and merge the two results.

3 seems like the best option to me, and I believe it will be relatively fast. CSharpier could use the last modified date on the file so it won't even need to reread it if the date hasn't changed.
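A rough sketch of what option 3 might look like, written in Python for illustration rather than in CSharpier's C#. The `FormattingCache` class, its methods, and the JSON layout are assumptions made up for this example. It also assumes the file system's last modified time is precise enough to notice a concurrent write.

```python
import json
import os
import tempfile


class FormattingCache:
    """Hypothetical cache mapping a file path to a hash of its formatted content."""

    def __init__(self, path):
        self.path = path
        self.entries = {}
        self.updates = {}  # entries added or changed by this run only
        self.loaded_mtime = None
        if os.path.exists(path):
            self.loaded_mtime = os.stat(path).st_mtime_ns
            with open(path, encoding="utf-8") as f:
                self.entries = json.load(f)

    def set(self, file_path, content_hash):
        self.updates[file_path] = content_hash

    def save(self):
        merged = dict(self.entries)
        # Reread only if the file changed (or first appeared) since startup.
        if os.path.exists(self.path):
            if os.stat(self.path).st_mtime_ns != self.loaded_mtime:
                with open(self.path, encoding="utf-8") as f:
                    merged = json.load(f)
        merged.update(self.updates)  # this run's results win on conflicts
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(merged, f)
        return merged


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "cache.json")
    run_a = FormattingCache(cache_path)
    run_b = FormattingCache(cache_path)
    run_a.set("ProjectA/Foo.cs", "hash-a")
    run_b.set("ProjectB/Bar.cs", "hash-b")
    run_a.save()
    merged_result = run_b.save()  # sees run A's write and merges with it
    print(merged_result)
```

Keeping this run's changes in a separate `updates` dictionary is what makes the merge cheap: only those entries are applied on top of whatever is on disk. The reread and the write are still two separate steps, so this narrows the window for a conflict rather than closing it.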

belav, Aug 27 '22 16:08

Comments from @shocklateboy92 on the other PR

IMHO, the better non-blocking fix would be to have more than one cache file:

0 to N csharpier processes write to different files in the cache directory, but with a predictable pattern, e.g. suffixed with project name and/or process ID.

At read time, it tries to read all the ones that could be relevant:

  1. It skips ones with different project names (increasing parallelism by reducing shared resources).
  2. It deterministically picks a cache file if more than one exists for the same project (maybe last write wins?).

PS: Instead of project name, you could use the hash of the full path to the csproj file (or maybe sln? idk what scope csharpier is normally invoked in)
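The multi-file idea could be sketched roughly as follows, in Python for illustration only. The file naming scheme (a truncated SHA-256 of the csproj path plus the process ID), the function names, and the tie-break rule are assumptions. The suggestion above only asks for a predictable pattern and a deterministic pick.

```python
import hashlib
import json
import os
import tempfile


def project_key(csproj_path):
    """Stable key for a project: a hash of the full path to its csproj file."""
    return hashlib.sha256(csproj_path.encode("utf-8")).hexdigest()[:16]


def write_cache(cache_dir, csproj_path, pid, entries):
    """Each process writes its own file, so writers never share a file."""
    name = f"{project_key(csproj_path)}.{pid}.json"
    path = os.path.join(cache_dir, name)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entries, f)
    return path


def read_cache(cache_dir, csproj_path):
    """Read the newest cache file for this project, skipping other projects."""
    prefix = project_key(csproj_path) + "."
    candidates = [
        os.path.join(cache_dir, name)
        for name in os.listdir(cache_dir)
        if name.startswith(prefix) and name.endswith(".json")
    ]
    if not candidates:
        return {}
    # "Last write wins"; equal modification times are broken by file name.
    newest = max(candidates, key=lambda p: (os.stat(p).st_mtime_ns, p))
    with open(newest, encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    write_cache(tmp, "/src/A/A.csproj", 100, {"A/Foo.cs": "old"})
    write_cache(tmp, "/src/A/A.csproj", 200, {"A/Foo.cs": "new"})
    write_cache(tmp, "/src/B/B.csproj", 300, {"B/Bar.cs": "b"})
    cache_a = read_cache(tmp, "/src/A/A.csproj")
    cache_b = read_cache(tmp, "/src/B/B.csproj")
    cache_c = read_cache(tmp, "/src/C/C.csproj")
```

A real version would also need to clean up stale per-process files, which this sketch leaves out.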

belav, Sep 15 '22 13:09

  3. Before writing the file, reread it and merge the two results.

3 seems like the best option to me, and I believe it will be relatively fast. CSharpier could use the last modified date on the file so it won't even need to reread it if the date hasn't changed.

How does this one solve the race condition? :thinking:

Say there are csharpier processes A and B:

  1. A reads the file
  2. B reads the file
  3. A merges its changes with the file
  4. B merges its changes with the file
  5. They both write at once
  6. You get corrupt JSON
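One way steps 5 and 6 can play out, as a deterministic single-process sketch in Python. Two file handles stand in for the two processes, and the entries are hypothetical. Whether real concurrent writes interleave exactly like this depends on the OS and on timing, so treat it as an illustration of the risk rather than a reproduction.

```python
import json
import os
import tempfile


def simulate_simultaneous_writes(path, merged_a, merged_b):
    """Steps 5 and 6: both processes open the file, then both write."""
    file_a = open(path, "w", encoding="utf-8")
    file_b = open(path, "w", encoding="utf-8")  # second handle, also at offset 0
    try:
        file_a.write(json.dumps(merged_a))
        file_a.flush()
        file_b.write(json.dumps(merged_b))  # overwrites only the leading bytes
        file_b.flush()
    finally:
        file_a.close()
        file_b.close()
    with open(path, encoding="utf-8") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "cache.json")
    # Steps 1-4: both read the same (empty) cache and merge in their own entries.
    merged_a = {"ProjectA/Foo.cs": "hash-a", "ProjectA/Baz.cs": "hash-c"}
    merged_b = {"ProjectB/Bar.cs": "hash-b"}
    on_disk = simulate_simultaneous_writes(cache_path, merged_a, merged_b)
    try:
        json.loads(on_disk)
        is_valid_json = True
    except json.JSONDecodeError:
        is_valid_json = False
    print(is_valid_json)  # False: B's shorter payload is followed by the tail of A's
```

B's payload is shorter than A's, so the file ends up as B's complete JSON object followed by leftover bytes from A, and it no longer parses.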

shocklateboy92, Sep 16 '22 01:09