CodeEditSourceEditor ✨ Language Server Protocol Integration

The Language Server Protocol defines a cross-editor protocol to obtain rich language features from an external source. Adding support for this to CodeEdit would make it far simpler to port language extensions from other IDEs (especially VSCode), making everyone's life easier :p

Mar 15 '22 17:03 heftymouse

I work on two libraries related to Language Server Protocol that could be helpful.

LanguageServerProtocol is low-level protocol support. LSP has a very large spec with a huge number of types, not all of which map nicely to Swift. This lib wraps up all that.

LanguageClient is a much higher-level abstraction that supports client-server interactions, as well as transparent server restarting.

Now, if you are thinking of some kind of VS Code extension interoperability, I'm not sure how much these will help, as all the VSC stuff is implemented in JavaScript.

Mar 20 '22 12:03 mattmassicotte

Relevant links

https://microsoft.github.io/language-server-protocol/
https://docs.microsoft.com/en-us/visualstudio/extensibility/language-server-protocol?view=vs-2022
https://en.wikipedia.org/wiki/Language_Server_Protocol

Mar 25 '22 07:03 austincondiff

Where are we on this issue? What do we have left to call it done? It may be good to define this issue's scope.

@lukepistrol does this tie into what you are working on in the new editor?

Jun 01 '22 22:06 austincondiff

@austincondiff I have yet to read into the LSP stuff. At some point I'm sure this would get implemented into CodeEditTextView though.

Jun 01 '22 22:06 lukepistrol

This issue is currently blocking #29.

Jun 28 '22 03:06 austincondiff

Language Server Protocol Research and Development

Abstract

The Language Server Protocol defines how a language server can provide features such as autocomplete, go to definition, find all references, and more to a client. The protocol uses JSON-RPC to communicate between a client (the code editor) and a server. It is transport-agnostic: it can run over stdio, sockets, named pipes, etc. An example of an interaction is as follows:

The user executes "Go to Definition" on a symbol in the editor: The tool sends a 'textDocument/definition' request with two parameters: (1) the document URI and (2) the text position from where the Go to Definition request was initiated to the server. The server responds with the document URI and the position of the symbol's definition inside the document.

Source: "How the LSP works", Microsoft Docs
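
As a rough illustration of the "Go to Definition" exchange quoted above, here is a minimal sketch of what the client might put on the wire. The method name and the `textDocument`/`position` parameters come from the LSP spec, as does the `Content-Length` header framing; the Swift type names, the helper `frameMessage`, and the example URI are assumptions made up for this sketch and are not taken from any existing library.

```swift
import Foundation

// Hypothetical Codable types mirroring the JSON shape of a
// textDocument/definition request. Line and character are zero-based.
struct Position: Codable, Equatable {
    var line: Int
    var character: Int
}

struct TextDocumentIdentifier: Codable {
    var uri: String
}

struct DefinitionParams: Codable {
    var textDocument: TextDocumentIdentifier
    var position: Position
}

struct DefinitionRequest: Codable {
    var jsonrpc = "2.0"
    var id: Int
    var method = "textDocument/definition"
    var params: DefinitionParams
}

// Hypothetical helper: wraps a JSON body in the base-protocol framing,
// i.e. a Content-Length header, an empty line, then the UTF-8 body.
func frameMessage(_ body: Data) -> Data {
    var framed = Data("Content-Length: \(body.count)\r\n\r\n".utf8)
    framed.append(body)
    return framed
}

// Example values only: the URI and position are placeholders.
let definitionRequest = DefinitionRequest(
    id: 1,
    params: DefinitionParams(
        textDocument: TextDocumentIdentifier(uri: "file:///tmp/example.swift"),
        position: Position(line: 4, character: 10)
    )
)
let framedRequest = frameMessage(try JSONEncoder().encode(definitionRequest))
print(String(decoding: framedRequest, as: UTF8.self))
```

The same framed bytes could be written to a pipe, a socket, or stdio, which is what makes the transport interchangeable.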

Requirements

[ ] Base protocol (swift protocol) that defines basic requirements for all language servers (methods like starting/stopping the server, sending and receiving messages)
[ ] Language layer that conforms to the base protocol and implements language-specific features
[ ] Configurations per language (associated file types, launch path, server command arguments, etc.)
[ ] LSP features (ex. textDocument/completion, textDocument/definition, textDocument/references)
[ ] Be able to specify any communication protocol, like sockets or pipes. This is to allow remote communication to a language server (for example using CodeEdit on an iPad or a browser)
[ ] Async communication between multiple language servers, for when the user is editing files of 2...n different languages at the same time. Also handles starting / stopping the language server and catching errors / retrying.
[ ] User interface integration
[ ] User settings and customizations for language servers
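
To make the first few requirements more concrete, here is one possible sketch of a base Swift protocol and a per-language configuration. Every name here (`LanguageServer`, `LanguageServerConfiguration`, `RecordingLanguageServer`) is hypothetical, and the launch path is a placeholder. The conforming class only records messages rather than launching a real process, so treat this as a shape to discuss, not a proposed implementation.

```swift
import Foundation

// Hypothetical per-language configuration: associated file types,
// launch path, and server command arguments.
struct LanguageServerConfiguration: Equatable {
    var languageID: String
    var fileExtensions: Set<String>
    var launchPath: String
    var arguments: [String]
}

// Hypothetical base protocol: lifecycle plus sending raw messages.
protocol LanguageServer: AnyObject {
    var configuration: LanguageServerConfiguration { get }
    var isRunning: Bool { get }
    func start() throws
    func stop()
    func send(_ message: Data) throws
}

enum LanguageServerError: Error {
    case notRunning
}

// Stand-in conformer for illustration: it records outgoing messages
// instead of talking to a real server process.
final class RecordingLanguageServer: LanguageServer {
    let configuration: LanguageServerConfiguration
    private(set) var isRunning = false
    private(set) var sentMessages: [Data] = []

    init(configuration: LanguageServerConfiguration) {
        self.configuration = configuration
    }

    func start() throws { isRunning = true }

    func stop() { isRunning = false }

    func send(_ message: Data) throws {
        guard isRunning else { throw LanguageServerError.notRunning }
        sentMessages.append(message)
    }
}

// Placeholder values: the launch path is an assumption, not a known location.
let exampleServer = RecordingLanguageServer(
    configuration: LanguageServerConfiguration(
        languageID: "swift",
        fileExtensions: ["swift"],
        launchPath: "/path/to/language-server",
        arguments: []
    )
)
try exampleServer.start()
try exampleServer.send(Data("{}".utf8))
exampleServer.stop()
```

A language-specific layer would then be another conformer that adds its own features on top of this lifecycle.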

Proposed Architecture Design

+----------------+      +-------------------+        +-------------------+
|    UI Layer    | <--> |    LSP Manager    |        | LSP Base Protocol |
+----------------+      +-------------------+        +-------------------+
                               |      ^                     ^      |
                               v      |                     |      |
                        +-------------------+               |      |
                        |    Concurrency    |               |      |
                        |      Control      |               |      |
                        +-------------------+               |      |
                               |      ^                     |      |
                               v      |                     |      v
                        +-------------------+        +-------------------+
                        |  LSP Extensions   |  <-->  | Language Specific |
                        |  (Python, JS,     |        |  LSP Classes      |
                        |     C++, etc.)    |        | (PythonLSP, JSLSP,|
                        +-------------------+        |  CPlusPlusLSP...) |
                               |     ^               +-------------------+
                               v     |
                       +---------------------+
                       | Communication Layer |
                       +---------------------+
                               ^     |
                               |     v
                        +-------------------+
                        |   Configuration   |
                        |     Management    |
                        +-------------------+

UI Layer: Components like context menus or pop-ups for features like auto-complete, etc.
LSP Manager: Manages the language servers, starts and stops them, routes requests to the correct language server, and handles server responses
Concurrency Control: This manages simultaneous communication with multiple language servers, when the user is editing files of different languages at the same time.
LSP Extensions: Language specific extensions that conform to the LSP Base Protocol and implement language specific features
Communication Layer: This layer handles the transmission of messages between the editor and the language servers. Supports multiple types of communication, like stdio, sockets, or named pipes, and should be able to be configured per language server.
Configuration Management: This component is responsible for managing configurations for each language server, like the associated file types, launch path, server command arguments, and user settings.
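
One small piece of the LSP Manager role described above is deciding which language server a given document belongs to. A minimal sketch, assuming the decision is made purely from the file extension; the type `ServerRegistry` and its methods are hypothetical names for illustration only.

```swift
import Foundation

// Hypothetical registry mapping file extensions to language identifiers,
// so a manager can route a document's requests to the right server.
struct ServerRegistry {
    private var languageByExtension: [String: String] = [:]

    // Associates each extension (case-insensitively) with a language ID.
    mutating func register(languageID: String, fileExtensions: [String]) {
        for fileExtension in fileExtensions {
            languageByExtension[fileExtension.lowercased()] = languageID
        }
    }

    // Returns the language ID for a file, or nil if nothing is registered.
    func languageID(for fileURL: URL) -> String? {
        languageByExtension[fileURL.pathExtension.lowercased()]
    }
}

var registry = ServerRegistry()
registry.register(languageID: "python", fileExtensions: ["py"])
registry.register(languageID: "cpp", fileExtensions: ["cpp", "hpp"])

let pythonMatch = registry.languageID(for: URL(fileURLWithPath: "/tmp/script.PY"))
let unknownMatch = registry.languageID(for: URL(fileURLWithPath: "/tmp/notes.txt"))
```

A real editor would likely need more than the extension (for example a user override per file), which is where the configuration component would come in.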

Please let me know your thoughts and if I'm missing any requirements.

Jul 19 '23 07:07 FastestMolasses

This is really great. Thank you for taking the time to think this through! One additional thing that we need to think through is how this fits within our extensions architecture.

@Wouter01 started the work on extensions and might be able to provide a little more clarity as to how he imagined LSP integration fitting into the work he has done, but essentially we need to provide extension developers with the ability to release language support extensions which includes LSP integration. Some of this may need to be done in CodeEditKit depending on how we decide to do this.

@CodeEditApp/maintainers I'd love to get everyone's thoughts on @FastestMolasses's comments above and how this might fit into our extensions architecture!

Jul 19 '23 15:07 austincondiff

@FastestMolasses Thank you for doing this research. I should mention a few things that aren't apparent from the scope of this issue.

CodeEdit is sandboxed, which makes it harder to run a language server which may need to access things in the /bin folder or anywhere else on the user's machine. Read more here. We've decided to use extensions to host language servers, as our extensions can run in a non-sandbox context.
I'd point you to an open-source library that will make large parts of this faster to implement (Chime has made a lot of progress here, and we're hoping to contribute back to these packages as this is implemented): LanguageClient for hosting, connecting and using the language server protocol with Swift.
Hosting the language server from individual extensions makes your 3rd point easier, as all that's required is communicating with multiple extensions, and gives us an easier framework for your 6th point because we can have settings screens for extensions.
Extensions communicate back to the app via XPC, so we can send raw data or types that can be sent over XPC.

A consideration for CodeEditTextView (CETV) is handling syntax highlighting from multiple sources. I've designed the system to be able to hot-swap one highlight source right now, but we'll need to make it able to use multiple sources to integrate with the LSP. Then, on CodeEdit's side, CodeEdit implements a HighlightProviding object that communicates with the extension for syntax highlights and feeds them back to CETV. The HighlightProviding methods are also already async so they can handle the relatively long wait it'll take to request and receive that information.

I think a method similar to that should be used for your 1st point. An object that conforms to something like DefinitionProvider (bad name but you get the gist) could communicate asynchronously to the extension or any other source like tree-sitter for the definition location and jump to it.

This all makes your diagram much simpler:

+----------------+      +--------------------+      +---------------+
|    UI Layer    |  ->  |  Extension Manager |  ->  |   Extension   | <- Also handles language-specific LSP extensions
+----------------+      +--------------------+      +---------------+
                                   |                         |
                        +----------------------+    +---------------------+
                        |  Concurrency Control |    | Communication Layer |
                        +----------------------+    +---------------------+
                                                            |
                                                    +--------------------+
                                                    | Hosted LSP Process |
                                                    +--------------------+
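
The DefinitionProvider idea mentioned above could look roughly like the following sketch. The protocol name comes from the comment itself, but the method signature, `SourceLocation`, and the table-backed stand-in are all assumptions for illustration; a real conformer would forward the request to an extension or to tree-sitter instead of a dictionary.

```swift
import Foundation

// Hypothetical result type describing where a symbol is defined.
struct SourceLocation: Equatable {
    var fileURL: URL
    var line: Int
    var column: Int
}

// Hypothetical protocol: any source (an extension hosting a language
// server, tree-sitter, ...) that can answer "where is this defined?".
// The method is async so a slow round trip does not block the editor.
protocol DefinitionProvider {
    func definition(in fileURL: URL, line: Int, column: Int) async throws -> SourceLocation?
}

// Stand-in conformer backed by a lookup table keyed by "path:line:column".
struct TableDefinitionProvider: DefinitionProvider {
    var knownDefinitions: [String: SourceLocation]

    func definition(in fileURL: URL, line: Int, column: Int) async throws -> SourceLocation? {
        knownDefinitions["\(fileURL.path):\(line):\(column)"]
    }
}
```

The editor would await `definition(in:line:column:)` and jump to the returned location if there is one, without needing to know which kind of source answered.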

Jul 19 '23 15:07 thecoolwinter

Nice to see you picking this up! I managed to get a basic connection with a language server earlier this year (request+response to some command), but didn't go further than that due to time restrictions. I'd like to point to https://github.com/ChimeHQ/ChimeKit, which is a prime example of using language servers with ExtensionKit. If you have any questions about the work I've done on CodeEditKit, let me know.

Jul 19 '23 15:07 Wouter01

CodeEdit is sandboxed, which makes it harder to run a language server which may need to access things in the /bin folder or anywhere else on the user's machine. Read more here. We've decided to use extensions to host language servers, as our extensions can run in a non-sandbox context.

Small but important thing: it's actually the opposite. CodeEdit is not sandboxed but extensions are sandboxed. This makes it harder to run language servers, but it's possible (Matt from Chime figured all of this out and his work is all nicely split up in Swift packages).

Jul 19 '23 15:07 Wouter01

@mattmassicotte Would you mind shedding some light here?

Jul 19 '23 16:07 austincondiff

Maybe it's too early to say this, but it might be useful to have some way for CodeEdit to allow extensions in multiple languages. For example, if TypeScript was supported, it might make it easier for VSCode extension developers to add an extension to CodeEdit.

Jul 19 '23 16:07 avasilic

Whew, how long do you have? 😅

I had to rewrite most of Chime's internals to incorporate LSP. Largely because I didn't understand how deeply LSP and the document/project model have to be connected. But, I did a bad job with it. So then I rewrote it again, but I did it before Swift concurrency was a thing. Then I introduced extensions + Concurrency, only I did that wrong. So, I'm now very close to fixing that, but it involved changing a huge amount of stuff.

I can share a few take-aways.

Be very careful with designing up-front. You cannot design a good system if you do not deeply understand the requirements. It took me ~ four tries to land on something reasonable. I have never been able to get a design right without first making something bad that I had to throw away.

The thing @Wouter01 mentioned, called ProcessService, works, but will only pass App Review for certain applications. Apple needs to give you permission to have an unsandboxed XPC service, and you should plan on not getting that approval. LSP is fundamentally incompatible with sandboxing. I've considered proposing changes to the spec to support it. However, it would require quite a lot of work and changes from all servers and I just don't think that's feasible. I've been changing ChimeKit recently to use another approach. It may be possible to extract this into a library and I think that could be useful to many apps.

Designing an extension interface is difficult, and it scales (non-linearly I bet) with the number of features. The relationship between your LSP support and your extension interface will probably be very close. So, pretty hard to do one without the other.

I know zero about it, but I suspect you'll run into problems with your text view. Coordinating mutations both incoming and outgoing, as well as line number tracking, is required for LSP at an absolute minimum. I would encourage you to think about text as a system, with the view focused on only the stuff views must do. I factored out Neon only after a lot of pain. And that's just one small part. I'm open to collaborating on line number tracking as a separate project, but I've made a few attempts at extracting my implementation and it is so intertwined in so much of the system I wasn't able to.

Oh yes, and of course, learn from my mistakes. If you use concurrency, you must turn on complete checking.
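
For reference, one way to turn on complete concurrency checking in a Swift package is sketched below. It assumes a SwiftPM manifest with tools version 5.8 or later; the package and target names are placeholders. Outside SwiftPM, the corresponding compiler flag is `-strict-concurrency=complete`.

```swift
// swift-tools-version:5.8
import PackageDescription

let package = Package(
    name: "ExampleLSPClient", // placeholder package name
    targets: [
        .target(
            name: "ExampleLSPClient", // placeholder target name
            swiftSettings: [
                // Request complete Sendable and actor-isolation checking.
                .enableExperimentalFeature("StrictConcurrency")
            ]
        )
    ]
)
```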

Jul 19 '23 18:07 mattmassicotte

@net-tech That is unrelated to this subject. We have a separate discussion, multiple even, for that.

@FastestMolasses As @thecoolwinter said, we should make use of proper libraries that can help speed things along, for example the ones that Matt has made.

Like @mattmassicotte said, designing an extension API is not easy to do. I've done plenty in VM based languages, like Java and C#, but doing it on macOS without a VM is more difficult to do safely. That is my experience at least. The use of ExtensionKit and ExtensionFoundation should help a bit with that tho.

Personally, I think that while LSP is important, the focus should be on an extension design first. That is difficult enough and LSP brings in an extra layer of complexities. I would spearhead it myself if I had the time for it, but sadly I don't...

Jul 19 '23 18:07 matthijseikelenboom

@matthijseikelenboom can you elaborate on the security concerns you have?

Jul 19 '23 19:07 mattmassicotte

@matthijseikelenboom I don't think any one person should be responsible for spearheading this. I don't think anyone has time to tackle this huge task that is before us alone. Even if they did, it is not a good idea to work independently on this without collaboration with the rest of the community. It needs to be a group effort.

After carefully agreeing together on an approach, it might be helpful to discuss each piece of work required to get this working and document what is agreed upon as a community. We should divide this into small pieces so that no single person feels like they need to tackle big chunks of work alone.

Jul 19 '23 19:07 austincondiff

Lots of great links and libraries. Will be going through all of these. I agree, fleshing out the extensions first is necessary. I'll take notes from the VSCode implementation and then study what we currently have and what we need.

Jul 19 '23 21:07 FastestMolasses

@mattmassicotte It's not so much security, but more the problems that could occur if you just load a bundle in Swift on macOS. It's the whole dealing with XPC connections and sandboxing. When using a VM, you don't have that problem. It's handled by the VM.

Jul 21 '23 15:07 matthijseikelenboom

@matthijseikelenboom oh interesting so when you load code into a Java/C# process that code does not have access to the full VM environment that loaded it? I didn't realize it provided that kind of isolation!

Jul 21 '23 15:07 mattmassicotte

@mattmassicotte Okay so, I went and looked into it further, in Java's case. I asked some of my senior (for lack of a better word) colleagues, and apparently I'm wrong 😅. It seems it's not default behavior, but it's achievable with OSGi. (In my defense, I thought it would be default behavior to load it in isolation, because it was explained in OCP. Assuming that is best practice)

Jul 21 '23 19:07 matthijseikelenboom

Ok super interesting, and thanks for following up @matthijseikelenboom! I know nothing about Java.

Jul 22 '23 11:07 mattmassicotte
