merlin
merlin copied to clipboard
Fix Env.check_state_consistency to check for cmi on build path
See the first commit which adds a test for a case where build path config is being reconfigured.
I found this edge case by checking if merlin refreshes diagnostics if I remove a library from dune config. Turns out it didn't. This is a fix.
Unless I missed something, your first commit doesn't expose the buggy behavior? (I think this might be due to the fact that using server in tests is unfortunately somewhat flaky right now)
That being said, I think I understand what's going on and agree with your diagnostic indeed. However, the fix you propose can have pretty terrible performance implications (find_in_path_uncap is very costly).
If that's the only fix available, I think I'd rather not fix this issue, with the rational that if merlin reports fewer errors than it should, it's not the end of the world.
But I have the intuition that we can probably do something cheaper than this (for instance, we could try to find the cmi only if the directory in which we found it initially is not in our loadpath anymore). Would you care to try something like that?
Unless I missed something, your first commit doesn't expose the buggy behavior? (I think this might be due to the fact that using server in tests is unfortunately somewhat flaky right now)
I think it does — the case is — cmi is present but not in load path anymore (due to reconfiguration).
However, the fix you propose can have pretty terrible performance implications
Yes, I can see that. Another fix proposal — what do you think if we drop all cached typing envs in case config changes? I think it's a relatively rare event so users won't be hitting it much and also looks like standling on a "more correct" side of things.
Merlin used to do something similar and that was too costly on some setup. It was changed in 2cf9fd01275a594f22afed3287dc4bebfb9b9f17.
I agree with the corrections you suggest: detecting if the configuration changed and doing the expensive stuff only in this case (either the fine-grained lookup based on find_in_path_uncap or just reloading everything).
If people want to try the "let's just reload everything when the config changes" approach first (which makes sense, it's much simpler), I'd make the request that we hold off for a couple of weeks.
I'm still in the process of setting up some architecture to retrieve reliable numbers (including replies latency) from merlin's usage at janestreet. Once all that is in place and working properly, it will be a good way to evaluate if that solution is reasonable or not.
(By hold off I mean: before merging in master, you're off course free to implement it as soon as you can :p)
That being said, I think I understand what's going on and agree with your diagnostic indeed. However, the fix you propose can have pretty terrible performance implications (find_in_path_uncap is very costly).
The way this function is used in this PR seems incredibly inefficient. Instead of file_exists for every single cmi in every single directory in the load path, the load path can simply be listed with Sys.readdir. The cmi's can be checked against such a contents array, one dir one a time.
Apologies if I'm missing something simple here.