elixir-ls icon indicating copy to clipboard operation
elixir-ls copied to clipboard

[WIP] Add an :ets table to cache the list of files that can be formatted

Open benwilson512 opened this issue 5 years ago • 21 comments

Cached Input List

This PR improves FormatOnSave time by several orders of magnitude, at least when tested on my PC.

Before: 900ms After: ~2-3ms

An O(n), file system bottlenecked operation is replaced by an O(1) in memory operation.

Related

Fixes #385.

Problem

As noted in #385, #381 and reflected in my experience with large projects at CargoSense, FormatOnSave could cause significant to massive delays when trying to save files. After some hunting, the vast majority of the time is spent here:

https://github.com/elixir-lsp/elixir-ls/blob/master/apps/language_server/lib/language_server/providers/formatting.ex#L50

The job of this function is to figure out if a specific file that is going to be saved should be formatted. The key factor here is the :inputs criterion defined in .formatter.exs which takes the shape of a list of path globs eg:

inputs: [
    "*.exs",
    "config/**/*.exs",
    "apps/*/{config,lib,test}/**/*.{ex,exs}",
    "apps/*/mix.exs"
  ]

The current approach simply calls Path.wildcard on each entry, expanding it into all files based on that glob. Then, the candidate path is tested for membership in that list.

Proposed Approach

This PR does NOT change the overall game plan. There is still a list of files which is produced by calling Path.wildcard on the input list, and a candidate path is still tested for membership. However, the main insight here is that this list very rarely changes. This PR populates that list when the LanguageServer boots and caches it in an :ets table set. Then, testing membership becomes an extremely fast process, just an :ets.lookup call. These :ets calls are serialized through the GenServer to avoid various race conditions, but they are such a small percentage of the overall format time that I'm not worried about it being a bottleneck.

TODO:

  • [x] Invalidate the cache when .formatter.exs changes.
  • [x] Block formatting until the cache has been initialized so efforts to format don't crash.
  • [x] Block formatting while the cache is being rebuilt.
  • [x] Replace IO.puts with whatever the proper logging thing is.
  • [x] Handle deletion / creation of formattable files
  • [x] Handle deletion / creation of .formatter.exs
  • [ ] Fix tests.
  • [x] Fix "can't find dep" issues.
  • [ ] Validate that cding into an umbrella sub-dir works.

benwilson512 avatar Oct 28 '20 00:10 benwilson512

Any ideas about the test failures? They're getting "Process not found" errors, but the process is in the app supervision tree. Is there some spot in the test setup I should be adding it to?

benwilson512 avatar Oct 28 '20 01:10 benwilson512

Nice @benwilson512 . I've got some points

  1. How about caching formatter opts along with formattable files? I guess it will also need another TODOs:
  2. Invalidate cache when .formatter.exs files are created/deleted
  3. Invalidate cache when formattable files are created/deleted/renamed

lukaszsamson avatar Oct 28 '20 15:10 lukaszsamson

How about caching formatter opts along with formattable files?

Already done!

As for the others, great point! Need to find the right hooks in the LanguageServer to identify these cases.

benwilson512 avatar Oct 28 '20 15:10 benwilson512

Regarding the Fix "can't find dep" issues TODO item I have some local work that I believe can alleviate that pain point (it works by copying over some logic from Mix.Tasks.Formatter.formatter_opts_for_file into ElixirLS, but I hope in the future that could be upstreamed to Elixir). So maybe that should be removed as a goal for this PR and it can be implemented separately, that might be a good idea even if we don't end up going with my approach, to keep the size and feasibility of this PR down.

axelson avatar Nov 16 '20 05:11 axelson

@axelson I mean the issue is that at the moment in this PR dep importing is entirely broken so we really can't merge it in. Hopefully I'll get a chance to fix that in this branch soon.

benwilson512 avatar Nov 16 '20 15:11 benwilson512

Ah, I see. In that case it is important to handle within this PR :+1:

axelson avatar Nov 16 '20 17:11 axelson

@benwilson512 ping :)

Have you had a chance to look at this recently? It would be great to be get it in.

axelson avatar Dec 17 '20 18:12 axelson

Hey folks! Life at house Wilson has gotten a bit exciting with the introduction of my first child a week ago! Although sleep is a bit hard to find I have taken an extended period off of work and hope to have time to mess with this over the holidays.

benwilson512 avatar Dec 22 '20 02:12 benwilson512

Congratulations on your first child! I hope you can find some time to enjoy the holidays between everything :)

axelson avatar Dec 22 '20 18:12 axelson

I can confirm that this isn't really working at all in umbrella apps, so working on that.

benwilson512 avatar Dec 25 '20 18:12 benwilson512

@benwilson512 Some problems with the tests may have solved by https://github.com/elixir-lsp/elixir-ls/pull/488. Try updating to see?

billylanchantin avatar Feb 19 '21 19:02 billylanchantin

@benwilson512 You made a great addition to ElixirLS and I hope you're able to finish it. But if you don't have time, is it fine if I pick it up? I really want to have a fix of the formatting bug in umbrella applications (which is part of this PR). It will make my (and many others) life much easier :)

dsnipe avatar Apr 12 '21 11:04 dsnipe

Hey @dsnipe feel free to take a swing at it. What's the best way to do that? Would it work to make a PR into this branch? I'd definitely be responsive to that PR and its review, I just didn't have time to chase down the test failures or handle some of the various edge cases like "what if this vscode elixir instance has been opened not in a mix project?".

benwilson512 avatar Apr 12 '21 14:04 benwilson512

@benwilson512 Great, then, I'll try to allocate some time and see what I can do.

dsnipe avatar Apr 12 '21 16:04 dsnipe

Hi, all. I've start resolving test error step by step. I start open PR to resolve the compiling error here https://github.com/benwilson512/elixir-ls/pull/1 (it's PR against this PR).

wingyplus avatar May 12 '21 15:05 wingyplus

Hi @benwilson512. What is the status on this one?

martasd avatar Aug 11 '21 14:08 martasd

I wasn't able to get tests to pass for umbrellas, and haven't had time to followup. If someone wants to take over from here that'd be very appreciated.

benwilson512 avatar Aug 11 '21 16:08 benwilson512

I has been sending a pr to start tackle the test suite here https://github.com/benwilson512/elixir-ls/pull/1. I’m not sure someone take a look?

btw, it seem pr got merge conflict. I think we resolve it first before do anything.

wingyplus avatar Aug 11 '21 17:08 wingyplus

I’m would like to take over it if you’re ok. :)

wingyplus avatar Aug 11 '21 17:08 wingyplus

Sure, your best bet is to simply fork my branch into your own repository and then open an independent PR. When you've done that I'll close this one.

benwilson512 avatar Aug 11 '21 19:08 benwilson512

@benwilson512 @wingyplus is it still relevant? How's the speed after recent path glob changes?

lukaszsamson avatar Jan 23 '22 09:01 lukaszsamson

The job of this function is to figure out if a specific file that is going to be saved should be formatted. The key factor here is the :inputs criterion defined in .formatter.exs which takes the shape of a list of path globs eg:

inputs: [
   "*.exs",
   "config/**/*.exs",
   "apps/*/{config,lib,test}/**/*.{ex,exs}",
   "apps/*/mix.exs"
 ]

The current approach simply calls Path.wildcard on each entry, expanding it into all files based on that glob. Then, the candidate path is tested for membership in that list.

Forgive my drive-by comment here: couldn't it work just as well to check the candidate path against the set of globs without expanding them? If so then there would be no need to cache the list of files at all.

tomjakubowski avatar Sep 22 '22 04:09 tomjakubowski

couldn't it work just as well to check the candidate path against the set of globs without expanding them?

There is no function in the Elixir standard library to do this that I know of. If there's a well defined spec or library to compare a candidate path against a glob then we could use that.

I think realistically there's no chance at this point that I'm going to finish this, so I'm gonna close the PR.

benwilson512 avatar Sep 24 '22 20:09 benwilson512

couldn't it work just as well to check the candidate path against the set of globs without expanding them?

There is no function in the Elixir standard library to do this that I know of. If there's a well defined spec or library to compare a candidate path against a glob then we could use that.

A basic glob match function should be fairly easy to write. At first look glob seems like a simplified regex. The complex part is dealing correctly with cwd, symlinks, special paths, case insensitive filesystems and other path oddities

lukaszsamson avatar Sep 25 '22 08:09 lukaszsamson

The complex part is dealing correctly with cwd, symlinks, special paths, case insensitive filesystems and other path oddities

Indeed.

benwilson512 avatar Sep 25 '22 09:09 benwilson512

Anyway, I forgot that we are already using a simple function like that provided by PathGlob package

lukaszsamson avatar Sep 25 '22 13:09 lukaszsamson