Possible to cache the index/state for big projects?

Open amalbuquerque opened this issue 4 months ago • 2 comments

Let me start by thanking the great work you've been doing 🌟

From my tests it has been working super well for "regular-sized" projects.

At work we have a big monolith with more than ~3M lines of Elixir code, and all the language servers we tried more or less worked but:

  1. It took a lot of time to index everything, which meant developers had to wait some time before they could use the LSP tools.
    • This also used a lot of CPU and RAM, I assume due to the full build the language server needs to do
  2. The LSP tools stopped working after a while, usually after some pulls from master when a lot of changes were pulled (we have >100 developers working on the same code base).

To improve the developer experience for these big projects, would it be possible to cache the state/index of the language server, so that when pulling a given version of the project we would also download the corresponding Expert index and all the LSP tools would work immediately?

For context, on a 32-core desktop with 64 GB of RAM it took ~45 minutes for Expert to finish indexing (I monitored btop and saw all the cores in use during indexing, until they became mostly idle); folks with Mac laptops mention letting the indexing run for 3-4 hours before it was usable.

Happy to provide more detailed info/timings; is there a way to collect this kind of telemetry from Expert?

If you already have an idea on how this caching mechanism could fit in the Expert architecture I can try to take a stab at it.

Once more, super thank you for this 🙏

amalbuquerque • Aug 29 '25 11:08

The indexes are stored in the .expert/indexes folder, are you able to verify if these folder/files aren't being created? Also, can you check your logs/your editor logs to see if it's indexing that is running?

doorgan • Aug 29 '25 14:08

Sorry for the late reply.

Answering your questions:

> The indexes are stored in the .expert/indexes folder, are you able to verify if these folder/files aren't being created?

Yup they are.

> Also, can you check your logs/your editor logs to see if it's indexing that is running?

Yup it seems so.

More details

Yesterday I:

  1. Nuked the .expert folder;
  2. Made sure mix compile && MIX_ENV=test mix compile were executed beforehand;
  3. Opened neovim that is configured to use Expert and opened the mix.exs file (around 11:03pm).

I also tailed the expert.log and project.log:

Image

After Expert initializes, I start to see "sent notification server -> $/progress" messages in the expert.log file.

I made sure nothing else was happening on this machine (32 cores 9950X CPU, 64 GB RAM) afterwards.

During the "indexing", I see the $/progress logs for ~24 minutes while CPU usage remains low, until the "Compiled tiger in 1264.6 seconds" message appears in expert.log:

Image

After that, CPU usage increases across all cores and some "Could not expand alias" errors show up in project.log:

Image

The "sent notification server -> client $/progress" messages in expert.log continue to appear until 11:54pm. When they stop, most of the CPU cores go idle:

Image

This is the point at which I consider Expert to be done.
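
To find when the progress notifications stop without watching the log by hand, something like the following could help. It is only a minimal sketch: `progress_summary` is a hypothetical helper, not part of Expert. It assumes only that each notification is logged on its own line containing the literal text $/progress. Whether those lines carry a timestamp depends on the log format.

```shell
# Hypothetical helper: summarize progress notifications in a log file.
# Prints the number of lines containing the needle, then the last such line.
progress_summary() {
  local log_file="$1"
  local needle="${2:-}"
  # Default to the literal text seen in expert.log; single quotes keep "$" literal.
  [ -n "$needle" ] || needle='$/progress'
  # -F treats the needle as a fixed string, so "$" is not a regex anchor.
  grep -cF -- "$needle" "$log_file"
  grep -F -- "$needle" "$log_file" | tail -n 1
}
```

Calling `progress_summary path/to/expert.log` (the path is a placeholder) prints the count and the last matching line. If the log lines are timestamped, that last line shows when the notifications stopped.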


Given how long Expert takes to finish, I was wondering whether we could share the .expert folder for a given commit, so that folks would immediately have Expert ready without "paying" the CPU/time cost. Currently the index is 3.8G:

andre@andre-jupiter ~/projs/remote/tiger/.expert/indexes/ets/27.3.3/1.18.3/3 *
❯ l
total 3.8G
drwxrwxr-x 2 andre andre 4.0K Sep  2 23:58 .
drwxrwxr-x 3 andre andre 4.0K Sep  2 23:24 ..
-rw-rw-r-- 1 andre andre 3.8G Sep  3 00:00 07368779273278058496.checkpoint
-rw-rw-r-- 1 andre andre  43K Sep  3 00:02 updates.wal

ℹ️ A .zip of the entire .expert folder is 4.3G without compression and 554M with max compression (roughly 8x smaller).

Ask

Having a way to "execute" Expert via CLI so that it indexes a snapshot of the code and then stops would be awesome, since we could use it to build the Expert "cache" on a CI pipeline and then create some tooling to fetch it with a given commit.
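
The tooling around that could be as small as the sketch below, which packs and restores the .expert folder keyed by commit. Everything named here is an assumption for illustration: `save_index`, `restore_index`, `archive_path` and the `EXPERT_CACHE_DIR` variable are hypothetical, a local directory stands in for a CI artifact store, and gzip via tar stands in for whatever compression is preferred. It also assumes indexing has already finished before `save_index` runs. Whether Expert would accept an index produced on another machine is an open question.

```shell
# Hypothetical cache location; a real setup would likely use a CI artifact store.
cache_dir() {
  printf '%s\n' "${EXPERT_CACHE_DIR:-$HOME/.cache/expert-index}"
}

# Print the archive path for a given commit hash.
archive_path() {
  local commit="$1"
  printf '%s/expert-%s.tar.gz\n' "$(cache_dir)" "$commit"
}

# Pack <project_dir>/.expert into the cache under the given commit hash.
save_index() {
  local project_dir="$1" commit="$2"
  mkdir -p "$(cache_dir)"
  tar -czf "$(archive_path "$commit")" -C "$project_dir" .expert
}

# Restore .expert for the given commit; returns non-zero on a cache miss.
restore_index() {
  local project_dir="$1" commit="$2"
  local archive
  archive="$(archive_path "$commit")"
  [ -f "$archive" ] || return 1
  # Drop any stale index before unpacking the cached one.
  rm -rf "$project_dir/.expert"
  tar -xzf "$archive" -C "$project_dir"
}
```

On CI this would be `save_index . "$(git rev-parse HEAD)"` once indexing is done, and on a developer machine `restore_index . "$(git rev-parse HEAD)"` after a pull, falling back to normal indexing on a miss.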

Thank you!

amalbuquerque • Sep 03 '25 09:09