
Performance of calling socket worse than lens

Open guaraqe opened this issue 1 year ago • 9 comments

When doing similar operations (such as getting the ship's code or the output of vats), either by talking to lens over HTTP or by running a thread via the socket, the socket is consistently slower, taking around 2x as long as lens. I checked, and with the socket method most of the time is spent waiting for the output of recv, so this seems to be something internal to Urbit.
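For anyone who wants to check the recv observation independently, here is a minimal sketch of timing each phase of a Unix domain socket round trip. The function name `time_socket_round_trip` is hypothetical, the payload is treated as opaque bytes (no Urbit framing is assumed), and the socket path is whatever your ship exposes; none of this comes from click itself.

```python
import socket
import time


def time_socket_round_trip(sock_path: str, payload: bytes, bufsize: int = 4096):
    """Send payload to a Unix domain socket and time each phase.

    Returns (connect_s, send_s, recv_s, reply), where recv_s is the time
    spent blocked until the first chunk of the reply arrives.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        t0 = time.perf_counter()
        sock.connect(sock_path)
        t1 = time.perf_counter()
        sock.sendall(payload)
        t2 = time.perf_counter()
        reply = sock.recv(bufsize)  # blocks until the peer writes something
        t3 = time.perf_counter()
    return t1 - t0, t2 - t1, t3 - t2, reply
```

If recv_s dominates the other two numbers, the time is being spent on the far side of the socket rather than in the client.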

Is this performance difference expected?

guaraqe avatar Jul 10 '23 14:07 guaraqe

How did you go about benchmarking this? It would be great to reproduce this on my own machine as well.

matthew-levan avatar Jul 12 '23 19:07 matthew-levan

Click boots up an instance of vere to do the jamming and cueing, so you're booting 2 ivory pills each time. Not sure if this is the cause.

mopfel-winrux avatar Jul 17 '23 23:07 mopfel-winrux

I ran a benchmark in this repo (https://github.com/guaraqe/urbit-benchmark). These are the results:

[nix-shell:~/code/urbit/test-urbit/benchmark]$ hyperfine ./code-lens './code-click ../salsyp-samzod'
Benchmark 1: ./code-lens
  Time (mean ± σ):      32.2 ms ±   1.9 ms    [User: 2.2 ms, System: 2.6 ms]
  Range (min … max):    30.2 ms …  43.4 ms    66 runs
 
Benchmark 2: ./code-click ../salsyp-samzod
  Time (mean ± σ):     320.3 ms ±   5.6 ms    [User: 147.6 ms, System: 48.6 ms]
  Range (min … max):   315.5 ms … 333.4 ms    10 runs
 
Summary
  './code-lens' ran
    9.95 ± 0.62 times faster than './code-click ../salsyp-samzod'

The repo contains click and two scripts: one using click, the other using lens. The first argument to code-click is the pier of the ship. The lens port is hardcoded in the corresponding script.

@mopfel-winrux I measured locally, and the calls to urbit take almost no time compared to the time interacting with the socket.

guaraqe avatar Jul 18 '23 02:07 guaraqe

@guaraqe Do you have the numbers on hand for the time just to boot up the transient instances of Vere to jam/cue the noun?

If not, it should be easily testable by writing a script to jam an atom using Vere, and directly pipe the result to another transient instance of Vere to cue it. We might also want to test with a non-trivial noun (e.g. an entire inline thread).

The above isn't directed at anyone in particular; just wanted to note down the idea for benchmarking time with transient Vere instances independent of the socket.
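A rough sketch of such a benchmark harness, independent of the socket. The two commands are passed in as argument lists, so the exact Vere invocations for jam and cue are left to the caller; `time_pipeline` is a hypothetical helper, not part of click or Vere. It runs the two processes one after the other rather than through a concurrent shell pipe, which still captures the cost of booting two transient instances.

```python
import statistics
import subprocess
import time


def time_pipeline(first_cmd, second_cmd, stdin_data: bytes, runs: int = 10):
    """Feed stdin_data to first_cmd, feed its stdout to second_cmd, and time it.

    Returns (mean_s, stdev_s, last_output) over `runs` repetitions.
    """
    timings = []
    output = b""
    for _ in range(runs):
        start = time.perf_counter()
        first = subprocess.run(
            first_cmd, input=stdin_data, capture_output=True, check=True
        )
        second = subprocess.run(
            second_cmd, input=first.stdout, capture_output=True, check=True
        )
        timings.append(time.perf_counter() - start)
        output = second.stdout
    # stdev needs at least two samples
    stdev = statistics.stdev(timings) if runs > 1 else 0.0
    return statistics.mean(timings), stdev, output
```

Passing a small atom and then a large noun (for example an entire inline thread) as `stdin_data` would show how much of the cost is process startup and how much scales with input size.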

ashelkovnykov avatar Jul 18 '23 02:07 ashelkovnykov

I added a case that just calls jam, called code-nothing:

[nix-shell:~/code/urbit/test-urbit/benchmark]$ hyperfine ./code-lens './code-click ../salsyp-samzod' './code-nothing ../salsyp-samzod'
Benchmark 1: ./code-lens
  Time (mean ± σ):      33.1 ms ±   1.6 ms    [User: 2.5 ms, System: 2.5 ms]
  Range (min … max):    31.3 ms …  40.5 ms    71 runs
 
Benchmark 2: ./code-click ../salsyp-samzod
  Time (mean ± σ):     330.6 ms ±   4.9 ms    [User: 156.3 ms, System: 49.8 ms]
  Range (min … max):   325.8 ms … 341.5 ms    10 runs
 
Benchmark 3: ./code-nothing ../salsyp-samzod
  Time (mean ± σ):     101.8 ms ±   2.2 ms    [User: 78.4 ms, System: 25.8 ms]
  Range (min … max):    99.3 ms … 109.8 ms    28 runs
 
Summary
  './code-lens' ran
    3.07 ± 0.16 times faster than './code-nothing ../salsyp-samzod'
    9.97 ± 0.50 times faster than './code-click ../salsyp-samzod'

If we subtract that, we get around 130 ms for the socket, which is still about 4x slower than lens. This difference is more visible for +vats, where the cost of calling the executable is smaller.
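The subtraction can be spelled out with the mean times from the hyperfine run above. This sketch assumes the code-nothing cost is removed twice, once for the jam and once for the cue, as confirmed later in the thread; the function name `socket_overhead_ms` is made up for illustration.

```python
def socket_overhead_ms(click_ms: float, transient_ms: float, transient_calls: int = 2) -> float:
    """Estimate the time attributable to the socket by removing transient Vere calls."""
    return click_ms - transient_calls * transient_ms


lens_ms = 33.1       # mean of ./code-lens
click_ms = 330.6     # mean of ./code-click
nothing_ms = 101.8   # mean of ./code-nothing (one transient Vere call)

socket_ms = socket_overhead_ms(click_ms, nothing_ms)
ratio = socket_ms / lens_ms

print(f"socket: {socket_ms:.1f} ms")   # socket: 127.0 ms
print(f"vs lens: {ratio:.1f}x")        # vs lens: 3.8x
```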

@ashelkovnykov I marked you in a private issue with more details.

guaraqe avatar Jul 18 '23 02:07 guaraqe

There's also a cue that gets called, which would take just as long as jam.

mopfel-winrux avatar Jul 18 '23 12:07 mopfel-winrux

> There's also a cue that gets called, which would take just as long as jam.

Yeah, I subtracted it twice from the total to get to the 130 ms.

guaraqe avatar Jul 18 '23 13:07 guaraqe

Isn't this just on account of the extra time required to compile the threads?

matthew-levan avatar Jul 18 '23 13:07 matthew-levan

That sounds right. There's already a -code thread in %base, can you just invoke that as opposed to passing/eval'ing the source?

joemfb avatar Jul 18 '23 15:07 joemfb