Indexing dependencies for features like non FQ completion & automatic import insertion

Open tomc1998 opened this issue 7 years ago • 10 comments

Hello!

So, this issue is more of a general question about how I'd go about implementing a feature. I've wanted automatic import insertion in Rust for a long time, and I've been looking through racer's codebase in the hopes of implementing something along these lines.

Let's break it down - to perform automatic import / use statement insertion, we'd basically need non-fully-qualified autocompletion.

Currently, from what I can see, Racer uses the prefixes on names to help search for a completion - for example, if I type std::io::R, that will search in the io module of the std crate for types / functions beginning with R. However, if I'd only typed in R, and hadn't imported std::io or given Racer any cause to search in std::io, there is no mechanism to search through all the dependencies of a given project, so this wouldn't turn up the std::io::Result completion.

If we have non-fully-qualified completions, we can then implement fancier features like automatic use-statement insertion at the top of the file or function after a completion has been selected. This will probably be implemented on the 'front-end' (i.e., in the emacs racer plugin rather than the racer library itself).

I've opened this issue to discuss an implementation of non-fq completions in racer. I've done a couple of benchmarks, and found that fetching a whole project's dependencies from the hard drive and parsing them wouldn't take TOO long for medium sized projects. On an old laptop, I was able to fetch and parse 1286 files worth of dependencies (about 23 MB of source code) in just over 8 seconds. This is a fairly small-to-medium sized web project.

After this initial fetch & parse stage, we could pull out all the public modules, functions, and types, then, using a crate like app_dirs2, store an indexed list of these (mapped to their respective modules) on the hard drive. This list could be stored alphabetically, and should be pretty easy & fast to binary search in real time once it's in memory. We'd only need to perform the initial indexing once per project on the filesystem.
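
As a rough sketch of the lookup side of this idea (not anything that exists in racer; `IndexEntry`, `build_index` and `complete_prefix` are made-up names for illustration), a sorted in-memory list can answer prefix queries with two binary searches:

```rust
/// One indexed public item: its short name and the module path exporting it.
/// (Hypothetical type, not part of racer.)
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct IndexEntry {
    pub name: String,
    pub module: String,
}

/// Sorts entries by name (then module) and drops exact duplicates.
pub fn build_index(mut entries: Vec<IndexEntry>) -> Vec<IndexEntry> {
    entries.sort();
    entries.dedup();
    entries
}

/// Returns every entry whose name starts with `prefix`.
/// Relies on `index` being sorted by name, as produced by `build_index`.
pub fn complete_prefix<'a>(index: &'a [IndexEntry], prefix: &str) -> &'a [IndexEntry] {
    // First entry whose name is not lexicographically smaller than the prefix.
    let start = index.partition_point(|e| e.name.as_str() < prefix);
    // From there on, names starting with the prefix form one contiguous run.
    let len = index[start..].partition_point(|e| e.name.starts_with(prefix));
    &index[start..start + len]
}
```

With entries for `Rc`, `Read`, `Result` and `HashMap`, calling `complete_prefix(&index, "Re")` would yield the `Read` and `Result` entries together with their module paths. Loading and saving the list on disk is left out here.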

I'm going to have a play with this idea, but I wanted to ask whether any mechanisms for this were already in place / whether this is a really dumb idea! I don't want to sink hours into looking in the wrong area, or implementing something that would never get merged because the performance is too poor.

Any comments / suggestions regarding this?

tomc1998 avatar Jun 05 '18 14:06 tomc1998

I'd suggest talking to @nrc or others on the RLS team before going too far down this path in Racer; they may have ideas about where best to do this, and if your efforts can help them then that'd be awesome.

TedDriggs avatar Jun 05 '18 15:06 TedDriggs

Ahhhh yeah I was wondering if this'd be better in the RLS, is that @ tag gonna send him/her a notification or should I open a similar issue in the RLS repo?

tomc1998 avatar Jun 05 '18 15:06 tomc1998

> Currently, from what I can see, Racer uses the prefixes on names to help search for a completion - for example, if I type std::io::R, that will search in the io module of the std crate for types / functions beginning with R. However, if I'd only typed in R, and hadn't imported std::io or given Racer any cause to search in std::io, there is no mechanism to search through all the dependencies of a given project, so this wouldn't turn up the std::io::Result completion.

Hmm... Though I'm planning to implement a stateful API for racer, I don't think that kind of feature is useful. Maybe it returns ::std::rc::Rc, ::std::io::Read, ::std::io::Result, ::std::cell::RefCell, ...

> After this initial fetch & parse stage, we could strip out all the public modules, functions, and types, then using a crate like app_dirs2 store an indexed list of these (mapped to their respective modules) on the hard drive.

Yeah, we need something like import caching, also for #844 (the cause of my headache :fearful:). Your PR would be very welcome.

But I have some concerns:

  • Racer has some problems with visibility resolution
  • What kind of API should we have for completing ::Result for use std::io?

And some suggestions:

> This list could be stored alphabetically, and should be pretty easy & fast to binary search in real time once it's in memory. We'd only need to perform the initial indexing once per project on the filesystem.

  • I think just a HashMap is enough.
  • We're getting use items here

kngwyu avatar Jun 05 '18 15:06 kngwyu

> Though I'm planning to implement a stateful API for racer, I don't think that kind of feature is useful. Maybe it returns ::std::rc::Rc, ::std::io::Read, ::std::io::Result, ::std::cell::RefCell, ...

Obviously the idea would be that you'd have more 'full' completions, so if you're using a web-application library and you want to create a request handler, the signature would probably look something like

fn handle (req: Request) -> Response

It's a MASSIVE pain to have to dig into the docs to remember which modules the Request and Response types are exported in, then add them to the top of the file. Ideally what I'd want is something where I typed in 'Reques', and this is autocompleted to 'Request' with the import inserted at the top.

> Racer has some problems with visibility resolution

You mean, checking whether something is exported publicly? I hadn't thought about this, I just passed a load of code through the syntax crate parser. I assumed that marks stuff as visible?

> What kind of API should we have for completing ::Result for use std::io?

You mean, how would we insert the imports into the file?

I think we can simplify the API significantly by only offering 1 additional 'endpoint', OR just modifying the existing 'complete' endpoint. Currently you can run something like racer complete std::io::R and this will return all the completions on the command line. If instead, you could just use racer complete R, and that returned the completions, then if the completion was the fully qualified name (i.e. std::io::Result) the client libraries (something like emacs-racer) could decide what to do with it.
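
To make the client side concrete, here's a minimal sketch of what a front-end could do with a fully qualified completion. `split_completion` is a made-up helper, not part of racer or emacs-racer, and it assumes the completion comes back as a plain `a::b::Name` path:

```rust
/// Splits a fully qualified completion such as `std::io::Result` into the
/// `use` line a client could insert and the short name to leave at the cursor.
/// Returns `None` when the path has no module part, so there is nothing to import.
/// (Hypothetical helper for illustration only.)
pub fn split_completion(fq_path: &str) -> Option<(String, &str)> {
    // Accept both `std::io::Result` and `::std::io::Result`.
    let path = fq_path.trim_start_matches("::");
    // Everything before the last `::` is the module, the rest is the item name.
    let (module, name) = path.rsplit_once("::")?;
    if module.is_empty() || name.is_empty() {
        return None;
    }
    Some((format!("use {}::{};", module, name), name))
}
```

A real client would also have to check whether the import is already present and whether the short name clashes with something else in scope, which this sketch ignores.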

I don't think the API needs to change at all, actually - the only issue is supporting auto-import insertion in ALL the platforms that interface with racer.

> I think just a HashMap is enough.

So, I can't quite remember what past me was thinking, but if you're looking for partial completions you can't just have a hash map (because 'R' would have a different hash than 'Re'). Unless you're thinking of loads of entries, one for each character in the module / type.
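
For illustration, the "loads of entries" variant would look roughly like this (a sketch only; `build_prefix_table` is a made-up name). A map keyed on full names can't answer a lookup for `Re`, so every prefix of every name needs its own key:

```rust
use std::collections::HashMap;

/// Maps every non-empty prefix of every name to the names starting with it.
/// (Hypothetical helper, shown only to illustrate the cost of this layout.)
pub fn build_prefix_table(names: &[&str]) -> HashMap<String, Vec<String>> {
    let mut table: HashMap<String, Vec<String>> = HashMap::new();
    for &name in names {
        // `char_indices` gives the byte offset of each char; the prefix that
        // ends with that char stops at `idx + ch.len_utf8()`.
        for (idx, ch) in name.char_indices() {
            let prefix = &name[..idx + ch.len_utf8()];
            table
                .entry(prefix.to_string())
                .or_default()
                .push(name.to_string());
        }
    }
    table
}
```

Just `Read`, `Result` and `Rc` already produce 9 keys, and each name is stored once per prefix, so the table grows with the total length of all names rather than with their count.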

Storing in a list would also allow us to do slow (but probably workable) fuzzy completion, so Rslt could complete to Result. Obviously you could also do this by getting the keys of a hash map, but the cache would take a pretty big hit I expect, depending on how it's all arranged in memory.
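
A minimal sketch of that slow-but-workable fuzzy matching, assuming "fuzzy" just means a case-insensitive subsequence match (`is_subsequence` and `fuzzy_complete` are made-up names, and real fuzzy matchers usually rank results as well):

```rust
/// Returns true when every character of `pattern` appears in `candidate`
/// in the same order (not necessarily adjacent), ignoring ASCII case.
pub fn is_subsequence(pattern: &str, candidate: &str) -> bool {
    let mut chars = candidate.chars();
    // `any` keeps consuming `chars`, so each pattern character must be found
    // after the position where the previous one matched.
    pattern
        .chars()
        .all(|p| chars.any(|c| c.eq_ignore_ascii_case(&p)))
}

/// Linear scan over all names; O(total length of names) per query.
pub fn fuzzy_complete<'a>(names: &[&'a str], pattern: &str) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter(|name| is_subsequence(pattern, name))
        .collect()
}
```

Here `fuzzy_complete(&["Result", "Read", "RefCell", "Rc"], "Rslt")` keeps only `Result`.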

tomc1998 avatar Jun 05 '18 16:06 tomc1998

I'm sorry my comment was ambiguous. To clarify my opinion, I'm for implementing caching for use items, but as for the grep-like completion you're proposing here, I'm not sure we should provide it. I'm a big fan of ripgrep so I find it interesting, but I think some users would find it confusing.

> You mean, checking whether something is exported publicly?

It's a bit difficult to explain, but put simply, racer decides whether an item is visible without taking into account where the user's cursor is. See #851 for example.

> I don't think the API needs to change at all, actually

But if we change the way completion works in the current API, it's a big API change, and I think some users won't like it...

> I think just a HashMap is enough.

That was entirely my misunderstanding, so please ignore it.

kngwyu avatar Jun 05 '18 16:06 kngwyu

> To clarify my opinion, I'm for implementing caching for use items, but as for the grep-like completion you're proposing here, I'm not sure we should provide it. I'm a big fan of ripgrep so I find it interesting, but I think some users would find it confusing.

You mean, you're against non-fully-qualified completion? Have you used modern Java IDEs? It seems to work really well with those, just a shame it uses 100 GB to run (some fool obviously decided to write IDEs in Java!)

> It's a bit difficult to explain, but put simply, racer decides whether an item is visible without taking into account where the user's cursor is. See #851 for example.

Ohhhh I see, I don't think this is a huge issue, we can save this for another PR.

> But if we change the way completion works in the current API, it's a big API change, and I think some users won't like it...

Yeah I'm totally for just putting this onto another endpoint, like racer complete-nonfq R

tomc1998 avatar Jun 05 '18 17:06 tomc1998

> you're against non-fully-qualified completion?

I'm not against it, but honestly I don't need it immediately. Racer currently lacks important features like completion based on trait bounds, so I want to implement those first. But, of course, if you open a PR I'd really welcome it.

> Have you used modern Java IDEs?

Yeah, I have some (not much) experience with IntelliJ IDEA, but I didn't know it supports partial matching (sorry).

kngwyu avatar Jun 05 '18 17:06 kngwyu

The way the RLS does this is that rather than do eager import insertion, it offers a lightbulb to suggest inserting an import based on the compiler's suggestion for fixing an error. This relies on the compiler giving a good suggestion, but it usually does. This is not as smooth as auto-importing in some Java IDEs, but it is low cost and well-supported by the LSP (auto-imports would be harder to implement and don't fit too well with the LSP).

nrc avatar Jun 05 '18 22:06 nrc

@nrc Is this already in the RLS? Is there no point implementing this in Racer?

tomc1998 avatar Jun 06 '18 04:06 tomc1998

@tomc1998 applying a suggestion to add an import is implemented, yes. There might be something more that Racer (or the RLS) can do

nrc avatar Jun 06 '18 22:06 nrc