theia icon indicating copy to clipboard operation
theia copied to clipboard

Use binary message RPC protocol for plugin API

Open tortmayr opened this issue 2 years ago • 13 comments

What it does

Refactors the plugin RPC protocol to make use of the new message-rpc introduced with #11011/#11228.

  • Refactor plugin-ext RpcProtocol API to reuse the new message-rpc protocol

    • Remove custom RPC message encoding and handling reuse message-rpc
    • Implement QueuingChannelMultiplexer that queues messages and sends them accumulated on the next process.tick (replaces the old Multiplexer implementation)
    • Refactors proxy handlers and remote target handlers
    • Use Channel instead of MessageConnection for creating new instances of RPCProtocol
    • Refactor RpcMessageEncoder/RpcMessageDecoder to enable overwritting of already registered value encoders/decoders.
    • Add mode property to base RpcProtocol to enable switching from a bidirectional RPC protocol to a client-only or server-only variant.
  • Implement special message encoders and decoders for the plugin communication. (Replacement for the old ObjectTransferrer JSON replacers/revivers)

  • Adapt HostedPluginServer and HostedPluginClient API to send/receive messages in binary format instead of strings. This enables direct writethrough of the binary messages received from the hosted plugin process.

  • Adapt hosted-plugin-process and plugin-host to directly send binary messages via IpcChannel/BinaryMessagePipe

  • Remove incorrect (and unused) notification proxy identifiers and instantiation

    • NotificationExt was instantiated in the main context
    • There were unused notification proxy identifiers for main and ext in the wrong contexts

Part of #10684 Fixes #9514

Co-authored-by: Lucas Koehler [email protected] Contributed on behalf of STMicroelectronics

How to test

For regression testing, any plugin loaded in the example applications can be used. Everything (i.e. interacting with and using the plugin should work as before.

For performance testing see comment below

Review checklist

Reminder for reviewers

tortmayr avatar Jun 06 '22 16:06 tortmayr

Performance Testing

This PR greatly improves the performance of FileSystem calls from within a plugin/VS Code extension. To test this with have used a simply VS Code extension that offers to commands to write/read file of different sample sizes to the workspace and logs the execution time in the console. The source code is available here: https://github.com/eclipsesource/theia/tree/json-rpc-performance-test-extension

Benchmarks

We used the extension to measure performance in browser and electron context on:

  • The last commit to master with the old RPC system (53a272e113890174c8b3049b8d1a74999dee0cd4)
  • The current master state (96a395099ec6c38432ba40022b6eb3575992d640)
  • The PR state
  • Local VS Code to establish a base line
Plugin File Write (Browser) 100 KB 1000 KB 10000 KB
With old RPC (#53a272) 60.23 ms 545.11 ms 4903 ms
With current master 50.83 ms 210.46 ms 2301 ms
With PR 42.23 ms 42.63 ms 158.46 ms
Plugin File Read (Browser) 100 KB 1000 KB 10000 KB
With old RPC (#53a272) 55.86 ms 546.54 ms 4970 ms
With current master 34.64 ms 287.36 ms 2945 ms
With PR 13.56 ms 20.03 ms 151.54ms
Plugin File Write (Electron) 100 KB 1000 KB 10000 KB
With old RPC (#53a272) 44.69 ms 430.86 ms 4274 ms
With current master 40.34 ms 211.51 ms 2024 ms
With PR 14.19 ms 41.91 ms 171.02 ms
Plugin File Read (Electron) 100 KB 1000 KB 10000 KB
With old RPC (#53a272) 46.98 ms 463.21 ms 4600 ms
With current master 40.12 ms 283.28 ms 2479 ms
With PR 16.62 ms 23.665 ms 153.41 ms
Local VS Code (Baseline) 100 KB 1000 KB 10000 KB
File Write 15.81 ms 30.52 ms 87.27 ms
File Read 8.1 ms 27.56 ms 80.2 ms

tortmayr avatar Jun 06 '22 16:06 tortmayr

I've done a little testing, and I noticed that I get a ton of cancellation error stack traces in the console

Uncaught Error: There is no document for breakpointinput:vs.editor.ICodeEditor%3A4

Error: There is no document for breakpointinput:vs.editor.ICodeEditor%3A4 at LinkProviderAdapter.provideLinks (C:\Users\thomas\code\chedev\theia\packages\plugin-ext\lib\plugin\languages\link-provider.js:31:35) at C:\Users\thomas\code\chedev\theia\packages\plugin-ext\lib\plugin\languages.js:287:97 at LanguagesExtImpl.withAdapter (C:\Users\thomas\code\chedev\theia\packages\plugin-ext\lib\plugin\languages.js:118:20) at LanguagesExtImpl.$provideDocumentLinks (C:\Users\thomas\code\chedev\theia\packages\plugin-ext\lib\plugin\languages.js:287:21) at C:\Users\thomas\code\chedev\theia\packages\plugin-ext\lib\common\proxy-handler.js:105:71 at processTicksAndRejections (node:internal/process/task_queues:96:5) at async RpcProtocol.handleRequest (C:\Users\thomas\code\chedev\theia\packages\core\lib\common\message-rpc\rpc-protocol.js:176:28) unexpectedErrorHandler errors.ts:26 setTimeout handler*ErrorHandler/this.unexpectedErrorHandler errors.ts:24 onUnexpectedExternalError errors.ts:67 onUnexpectedExternalError errors.ts:88 errors.ts:26:11 unexpectedErrorHandler errors.ts:26 (Async: setTimeout handler) unexpectedErrorHandler errors.ts:24 onUnexpectedExternalError errors.ts:67 onUnexpectedExternalError errors.ts:88

Monaco silently accepts cancellation if it detects that it is a cancellation. I think the proper layering would be to catch our own cancellation error and to translate it into a VS Code Cancellation Error at some appropriate place. see https://github.com/microsoft/vscode/blob/f52952332d14de7517cb22b002ef1352ecdee889/src/vs/base/common/errors.ts#L149

tsmaeder avatar Jun 08 '22 13:06 tsmaeder

Every once in a while, completion just seems to either fail or take a long time. I can't repeat nor do I have a handle on what's wrong yet.

tsmaeder avatar Jun 08 '22 13:06 tsmaeder

Turns out, the completion doesn't really fail, but it kind of looks like it's waiting for something: when I just let the cursor sit there, the completion does not complete. As soon as I move the cursor around, I get the completion.

tsmaeder avatar Jun 08 '22 15:06 tsmaeder

Thanks for testing this change @tsmaeder.

I've done a little testing, and I noticed that I get a ton of cancellation error stack traces in the console

Details Monaco silently accepts cancellation if it detects that it is a cancellation. I think the proper layering would be to catch our own cancellation error and to translate it into a VS Code Cancellation Error at some appropriate place. see https://github.com/microsoft/vscode/blob/f52952332d14de7517cb22b002ef1352ecdee889/src/vs/base/common/errors.ts#L149

Good catch! I double checked with the "old" plugin rpc protocol implementation and it indeed throws a VS Code Cancellation Error whereas the new protocol didn't: https://github.com/eclipse-theia/theia/blob/53a272e113890174c8b3049b8d1a74999dee0cd4/packages/plugin-ext/src/common/rpc-protocol.ts#L166

I have adapted the new protocol accordingly and we now should throw the expected Cancellation Error. In addition, I reverted another "over-optimiatization" that prematurely rejected the response of a canceled request.

I did a little testing and did not encounter cancellation error stack traces anymore.

Every once in a while, completion just seems to either fail or take a long time. I can't repeat nor do I have a handle on what's wrong yet.

Turns out, the completion doesn't really fail, but it kind of looks like it's waiting for something: when I just let the cursor sit there, the completion does not complete. As soon as I move the cursor around, I get the completion.

I could not observe this behavior anymore. Maybe it got fixed with the cancellation error fixes? Could you please retest this? Thanks!

tortmayr avatar Jun 09 '22 07:06 tortmayr

There is a problem with the completion in the PR: I've made two videos, first the behavior in master:

https://youtu.be/53T4Hvcw1rM

and now in the PR

https://youtu.be/sj8GVgzBC2k

Both videos are mad in master, on the source code of Theia and with the browser example, using Firefox. Completion seems slower and erratic in timing: sometimes leaving the cursor where you clicked before completion seems to block the completion somehow, but eventually the completion appears. To me it feels as if we were "crossing the streams". Is it possible that cancellations and multiple requests might be blocking or interfering with each other? If I look at what's changed with respect to master, what I'm seeing is that we're using a different multiplexing strategy for routing to multiple services and message batching has changed, as well. I think it would be good to have some configurable tracing in the code to find out what's going on. We just need to make sure we're not sending it over the wire with the remote log infrastructure again ;-)

tsmaeder avatar Jun 09 '22 09:06 tsmaeder

Some of this issues might be related to the behavior we have encountered in https://github.com/eclipse-theia/theia/issues/11249. Also the potential improvements discussed in https://github.com/eclipse-theia/theia/issues/11159. I will work on these two issues and then continue with cleaning up this PR.

tortmayr avatar Jun 14 '22 12:06 tortmayr

I wanted to try out this PR to run benchmarks, but plugins don't seem to work anymore. I am running Windows 10.

paul-marechal avatar Jun 22 '22 18:06 paul-marechal

@paul-marechal how did you resolve the conflicts? @tortmayr could you, please? ;-)

tsmaeder avatar Jun 23 '22 08:06 tsmaeder

Sure I can resolve the conflicts and retest it under Windows.

tortmayr avatar Jun 23 '22 14:06 tortmayr

@tsmaeder I just pulled the branch as is and built it, I didn't think I had to rebase myself?

paul-marechal avatar Jun 23 '22 14:06 paul-marechal

~~See https://github.com/eclipsesource/theia/pull/48 for a potential fix for Windows.~~

edit: Obsolete, see new comment.

paul-marechal avatar Jul 28 '22 15:07 paul-marechal

@paul-marechal Because @tortmayr is on vacation until the end of this week, I committed your suggestion. Thanks for that :)

lucas-koehler avatar Aug 16 '22 08:08 lucas-koehler

@tsmaeder I have rebased the PR and adapted to the latest changes (msgpack).

There is a problem with the completion in the PR: I've made two videos, first the behavior in master:

https://youtu.be/53T4Hvcw1rM

and now in the PR

https://youtu.be/sj8GVgzBC2k

Both videos are mad in master, on the source code of Theia and with the browser example, using Firefox. Completion seems slower and erratic in timing: sometimes leaving the cursor where you clicked before completion seems to block the completion somehow, but eventually the completion appears. To me it feels as if we were "crossing the streams". Is it possible that cancellations and multiple requests might be blocking or interfering with each other? If I look at what's changed with respect to master, what I'm seeing is that we're using a different multiplexing strategy for routing to multiple services and message batching has changed, as well. I think it would be good to have some configurable tracing in the code to find out what's going on. We just need to make sure we're not sending it over the wire with the remote log infrastructure again ;-)

The recent change to msgpack and other bugfixes seem to have resolved this issue. I can no longer reproduce it

@paul-marechal This PR now also includes your fix to ensure that the BinaryMessagePipe is working when using Windows.

tortmayr avatar Nov 02 '22 08:11 tortmayr

In testing I get the following error message in the log (with electron)

2
node.cjs:963 Uncaught (in promise) Error: unknown document
    at currentExtensions.<computed> (node.cjs:963)
    at read (node.cjs:403)
    at Array.readObject [as read] (node.cjs:496)
    at recordDefinition (node.cjs:956)
    at read (node.cjs:392)
    at checkedRead (node.cjs:193)
    at Packr.unpack (node.cjs:103)
    at Packr.decode (node.cjs:175)
    at MsgPackMessageDecoder.decode (rpc-message-encoder.ts:162)
    at MsgPackMessageDecoder.parse (rpc-message-encoder.ts:166)

tsmaeder avatar Nov 04 '22 12:11 tsmaeder

Just came across the same error in the command line log (probably as I was shutting down the electron version:

2022-11-04T13:04:24.534Z root ERROR [hosted-plugin: 9652] With stack trace: Error: Command with id 'extension.js-debug.setAutoAttachVariables' is not registered.
    at currentExtensions.<computed> (C:\Users\thomas\code\EclipseSource\theia\node_modules\msgpackr\dist\node.cjs:963:39)
    at read (C:\Users\thomas\code\EclipseSource\theia\node_modules\msgpackr\dist\node.cjs:403:15)
    at Array.readObject [as read] (C:\Users\thomas\code\EclipseSource\theia\node_modules\msgpackr\dist\node.cjs:496:18)
    at recordDefinition (C:\Users\thomas\code\EclipseSource\theia\node_modules\msgpackr\dist\node.cjs:956:19)
    at read (C:\Users\thomas\code\EclipseSource\theia\node_modules\msgpackr\dist\node.cjs:392:13)
    at checkedRead (C:\Users\thomas\code\EclipseSource\theia\node_modules\msgpackr\dist\node.cjs:193:13)
    at Packr.unpack (C:\Users\thomas\code\EclipseSource\theia\node_modules\msgpackr\dist\node.cjs:103:12)
    at Packr.decode (C:\Users\thomas\code\EclipseSource\theia\node_modules\msgpackr\dist\node.cjs:175:15)
    at MsgPackMessageDecoder.decode (C:\Users\thomas\code\EclipseSource\theia\packages\core\lib\common\message-rpc\rpc-message-encoder.js:84:29)
    at MsgPackMessageDecoder.parse (C:\Users\thomas\code\EclipseSource\theia\packages\core\lib\common\message-rpc\rpc-message-encoder.js:87:21)

If this is just the output of an error transferred through msgpackr, it's confusing.

tsmaeder avatar Nov 04 '22 13:11 tsmaeder

@tsmaeder Thanks for your fast review. I have addressed most of your comments with https://github.com/eclipse-theia/theia/commit/2b4016e8da50466f271bce2b2f808bf0ed371a10.

I'm now starting to investigate how to fix the confusing mgspackR error output

tortmayr avatar Nov 08 '22 17:11 tortmayr

@tortmayr I didn't see anything about the comments in https://github.com/eclipse-theia/theia/pull/11261#issuecomment-1303491276. What's the story there?

Otherwise, the code looks fine to me now.

tsmaeder avatar Nov 21 '22 12:11 tsmaeder

@tortmayr I didn't see anything about the comments in #11261 (comment). What's the story there?

Ah sorry, I forgot to add a comment regarding this. We need msgpackR extension for error encoding to keep the original stacktrace information (i.e. the pendant to https://github.com/eclipse-theia/theia/blob/32deea205d784831d385825e7e68f02f43cd2b7a/packages/plugin-ext/src/common/rpc-protocol.ts#L519 for the new RPC protocol) I'm going to add this once the discussion regarding the msgpackR extension mechanism is resolved (https://github.com/eclipse-theia/theia/pull/11261#discussion_r1023214572)

tortmayr avatar Nov 28 '22 14:11 tortmayr

@tsmaeder I pushed an update that improves the error encoding via msgpack and preserves the stack trace. Error output from plugins should now look as expected again.

e.g. trying to execute a non-existing command in a plugin would previously result in this message:

2022-11-29T08:18:18.440Z root ERROR [hosted-plugin: 449972] Promise rejection not handled in one second: Error: Command with id 'non.existing.command' is not registered. , reason: Error: Command with id 'non.existing.command' is not registered.
2022-11-29T08:18:18.444Z root ERROR [hosted-plugin: 449972] With stack trace: Error: Command with id 'non.existing.command' is not registered.
    at /home/tobias/Git/OpenSource/theia/theia/node_modules/msgpackr/dist/node.cjs:971:39
    at read (/home/tobias/Git/OpenSource/theia/theia/node_modules/msgpackr/dist/node.cjs:411:15)
    at Array.readObject [as read] (/home/tobias/Git/OpenSource/theia/theia/node_modules/msgpackr/dist/node.cjs:504:18)
    at recordDefinition (/home/tobias/Git/OpenSource/theia/theia/node_modules/msgpackr/dist/node.cjs:964:19)
    at read (/home/tobias/Git/OpenSource/theia/theia/node_modules/msgpackr/dist/node.cjs:400:13)
    at checkedRead (/home/tobias/Git/OpenSource/theia/theia/node_modules/msgpackr/dist/node.cjs:195:13)
    at Packr.unpack (/home/tobias/Git/OpenSource/theia/theia/node_modules/msgpackr/dist/node.cjs:105:12)
    at Packr.decode (/home/tobias/Git/OpenSource/theia/theia/node_modules/msgpackr/dist/node.cjs:177:15)
    at MsgPackMessageDecoder.decode (/home/tobias/Git/OpenSource/theia/theia/packages/core/lib/common/message-rpc/rpc-message-encoder.js:84:29)
    at MsgPackMessageDecoder.parse (/home/tobias/Git/OpenSource/theia/theia/packages/core/lib/common/message-rpc/rpc-message-encoder.js:87:21)

and now correctly preserves the stacktrace of the original error:

2022-11-29T08:23:58.196Z root ERROR [hosted-plugin: 455858] Promise rejection not handled in one second: Error: Command with id 'non.existing.command' is not registered. , reason: Error: Command with id 'non.existing.command' is not registered.
2022-11-29T08:23:58.197Z root ERROR [hosted-plugin: 455858] With stack trace: Error: Command with id 'non.existing.command' is not registered.
    at CommandRegistryMainImpl.$executeCommand (http://localhost:3000/packages_plugin-ext_lib_hosted_browser_hosted-plugin_js.js:1436:19)
    at http://localhost:3000/packages_plugin-ext_lib_common_plugin-api-rpc_js.js:955:71
    at async RpcProtocol.handleRequest (http://localhost:3000/bundle.js:142581:28)

tortmayr avatar Nov 29 '22 08:11 tortmayr

@tortmayr just a hint: I consider it good practice to keep updates to PR's in separate commits until ready to merge: it makes it possible just review the delta.

tsmaeder avatar Nov 29 '22 08:11 tsmaeder

@tortmayr just a hint: I consider it good practice to keep updates to PR's in separate commits until ready to merge: it makes it possible just review the delta.

Noted, thanks. Apparently I have interacted with too many gerrit based systems recently and developed the bad habit of rebasing/force pushing all changes into a single commit 😉

tortmayr avatar Nov 29 '22 09:11 tortmayr

Apparently I have interacted with too many gerrit based systems recently

It's a bit early for a flamewar ;-)

tsmaeder avatar Nov 29 '22 09:11 tsmaeder

@paul-marechal it looks to me your like your comments have been addressed and we're ready to merge this one?

tsmaeder avatar Nov 30 '22 09:11 tsmaeder

For anyone interested I did some benchmarking for plugins and frontend plugins to compare the file reading performance starting from the 1.26 Release:

Benchmarks

Test scenario: Read a file of size x from the workspace using the FileSystem plugin API. Median of 50 test runs.

  • Theia 1.26.0 -> Old json-rpc based protocol
  • Theia 1.32.0 -> Current release
  • PR -> latest state of this PR branch

Browser

Test context 100K 1000K 5000K 10000K
Plugin (1.26.0) 49 ms 347 ms 1964 ms 4090 ms
Plugin (1.32.0) 27 ms 165 ms 940 ms 2021 ms
Plugin (PR) 11 ms 18 ms 68.5 ms 120 ms
Frontend Plugin (1.32.0) 20 ms 140 ms 741 ms 1642.5 ms
Frontend Plugin(PR) 6.5 ms 10 ms 41 ms 90 ms

Electron

Test context 100K 1000K 5000K 10000K
Plugin (1.26.0) 49 ms 351 ms 2042 ms 4189 ms
Plugin (1.32.0) 40 ms 175.5 ms 969 ms 2018 ms
Plugin (PR) 13.5 ms 20 ms 77 ms 132 ms

VS Code

Test context 100K 1000K 5000K 10000K
Extension (1.73.1 local) 12 ms 15 ms 40 ms 87 ms
Web Extension (1.73.1 local) 6 ms 11 ms 34 ms 53.5 ms

tortmayr avatar Nov 30 '22 14:11 tortmayr

@tsmaeder @tortmayr I had trouble formulating what was bothering me but I think I got it:

I don't understand the whole rationale behind the msgpackr extension API:

  1. It stinks that the mspackr API for extensions is global to the whole process, because it means we cannot define sets of extensions to use per connection either. Already it makes me not want to rely on extensions, but I understand that it's the way it currently works.
  2. I don't understand what's special with ResponseError that requires a global extension to be installed. If the rationale is that it's a commonly used type, then what about other recurring types like URI? What's the difference between those types, where one requires an extension and not the other? I have the feeling we could do without msgpackr extensions completely.

Obligatory "it works right now" so if you are in a hurry and want to merge feel free to do so, but this is my only outstanding concern.

paul-marechal avatar Nov 30 '22 16:11 paul-marechal

Lastly I also am not a fan of the class + static member way of creating JS singletons as it's more of a Java way. Why not just exposing a global function like msgpackr does?

paul-marechal avatar Nov 30 '22 16:11 paul-marechal

  1. It stinks that the mspackr API for extensions is global to the whole process, because it means we cannot define sets of extensions to use per connection either. Already it makes me not want to rely on extensions, but I understand that it's the way it currently works.

Totally agreed. This global extension mechanism is less than optimal. However, it's currently the only way to customize the encoding of certain types. We can try and raise this issue in the msgpackR repo. From what I could see up until now maintainers are quite open for feedback and suggestions so we might be able to get a fix for this in the upstream repo. We should tackle this with a follow up issue.

2. I don't understand what's special with ResponseError that requires a global extension to be installed. If the rationale is that it's a commonly used type, then what about other recurring types like URI? What's the difference between those types, where one requires an extension and not the other? I have the feeling we could do without msgpackr extensions completel

The ResponseError is currently used in the proxy-factory to serialize and derserialize Application Errors. Some services have a special handling for these Application Errors (debug service, remote file system provider, task protocol ...). So we have to make sure the the type information for ResponseErrors doesn't get lost when sending the over the wire. Otherwise all services that rely on ApplicationErrors no longer work as expected.

I guess this dependency on ResponseErrors is a remainder from the original vscode-jsonrpc protocol and we should probably refactor the code to remove it. IMO this is also something that can easily be handled in a follow-up task. Nevertheless, I will take a look on it. If it can be removed rather easily without any side effects I will do it right away.

If the rationale is that it's a commonly used type, then what about other recurring types like URI? What's the difference between those types, where one requires an extension and not the other?

They do require a custom extension. We currently register custom extension at two places: https://github.com/eclipsesource/theia/blob/7927546f39d411dd2e01a9af99be7add2d9a6859/packages/core/src/common/message-rpc/rpc-message-encoder.ts#L173 and https://github.com/eclipsesource/theia/blob/7927546f39d411dd2e01a9af99be7add2d9a6859/packages/plugin-ext/src/common/rpc-protocol.ts#L224

In a nutshell, we have three topics that require custom encoding/decoding which can currently only be done (efficiently) with msgpackr extensions:

  • Preserving error stack traces when sending them over the wire (see also https://github.com/eclipse-theia/theia/pull/11261#issuecomment-1330265268)
  • Preserve type information for RepsonseErrors to ensure that the error handling derived from the old json-rpc protocol doesn't break
  • The old plugin-rpc implementation had JSON replacers/revivers in place to handle the serialization of the common types you just mentioned (uri,vscode uri, range, BinaryBuffer). This is now also handled via msgpackR extensions.

Lastly I also am not a fan of the class + static member way of creating JS singletons as it's more of a Java way. Why not just exposing a global function like msgpackr does?

I agree, a global function is probably the lesser evil here.

tortmayr avatar Nov 30 '22 17:11 tortmayr

IMO, we're ready to merge this one:

  1. msgpackr only supports global extensions. That sucks, but what can we do except fix msgpackr? https://github.com/kriszyp/msgpackr/issues/93

  2. The rest of the objections seem purely style issues, and those should not hold up merging of a PR. It's up to @tortmayr, though. (https://github.com/eclipse-theia/theia/blob/master/doc/pull-requests.md#requesting-changes)

tsmaeder avatar Dec 01 '22 09:12 tsmaeder

I would be in favor of merging this asap. This will give us the chance to fix potential bugs before the next release. The rest of the objections can be easily addressed with a follow-up PR.

tortmayr avatar Dec 01 '22 10:12 tortmayr