Noticeably slower startup compared to wezterm, foot
On Wayland/Debian trixie/Home-Manager-nixpkgs-unstable the startup is noticeably slower (i.e. from starting rio to seeing a fully rendered terminal).
A naive timing test shows a difference of about 400ms against foot and 320ms against wezterm:
❯ time rio -e sleep 1
________________________________________________________
Executed in 1.46 secs fish external
usr time 117.53 millis 2.28 millis 115.25 millis
sys time 368.49 millis 0.39 millis 368.10 millis
❯ time wezterm start --always-new-process sleep 1
________________________________________________________
Executed in 1.12 secs fish external
usr time 131.66 millis 2.65 millis 129.00 millis
sys time 106.06 millis 0.47 millis 105.59 millis
❯ time foot sleep 1
________________________________________________________
Executed in 1.08 secs fish external
usr time 138.36 millis 2.61 millis 135.75 millis
sys time 78.97 millis 0.00 millis 78.97 millis
The numbers are more or less consistent between multiple runs.
❯ rio --version
rioterm 0.2.30
Hi @jficz thanks for the issue, yes it's related to #1339 . I need to take a better approach for it
@jficz could u test with main branch?
I asked the #1339 to test 0.2.33 and looks is gone, closing this for now but lemme know anything
tested with rio v2.33 (built from your flake) and I don't really see any difference:
❯ time /nix/store/fap6nr9n94rp6v7l2hzksg6xjd99v09n-rio/bin/rio -e sleep 1
________________________________________________________
Executed in 1.47 secs fish external
usr time 96.06 millis 1.20 millis 94.86 millis
sys time 385.26 millis 0.61 millis 384.65 millis
❯ time /nix/store/fap6nr9n94rp6v7l2hzksg6xjd99v09n-rio/bin/rio --version
rioterm 0.2.33
If anything, it "feels" slightly worse and more random - in some cases I got as much as 5 seconds once without any obvious reason, 2-3 seconds were also not uncommon with a couple of consecutive tests.
Again, my testing is pretty naive - if you know about a more objective test, I'm happy to oblige.
Ah then is my bad I assumed it was same problem as the font taking forever to load #1339
I can try to profile the start but I'm not sure how to do that - strace seems like an overkill and I'm reasonably sure there is a better way to do this in Rust. Perhaps a special build with some profiling patches, a crate that includes this or something? My understanding of Rust is pretty much nonexistent in this area but I would really like to help :(
so I dug a bit deeper (in a layman-ish way) and found out a bit more.
Ran strace -Ttt doing rio -e true and its wezterm and foot equivalents
and a perf flamegraph with cargo-flamegraph
The flamegraph suggest most of the time spent on is filesystem operations. Specifically, slightly more than 50 % of the time spent is on read operations.
strace shows these facts:
-
rio does a lot more stuff than either wezterm or foot:
❯ wc -l *log 2887 foot.log 3745 wezterm.log 9318 rio.log -
rio tries to open a lot more files than either "competitor"
❯ grep openat *log | cut -d\( -f1 | sort | uniq -c 796 foot.log:openat 1101 wezterm.log:openat 1784 rio.log:openat -
of which most are nonexistent (for all terms)
❯ grep openat *log | grep ENOENT | cut -d\( -f1 | sort | uniq -c 547 foot.log:openat 757 wezterm.log:openat 1177 rio.log:openat -
the existing files are mostly font-related
❯ grep openat *log | grep -v ENOENT | grep font | cut -d\( -f1 | sort | uniq -c 217 foot.log:openat 242 wezterm.log:openat 376 rio.log:openat -
rio tries to open many files multiple times, as opposed to foot, which opens each file only once. These gists show all files opened >=3 times by both wezterm and rio. Rio is much worse:
- wezterm: https://gist.github.com/jficz/22199f4be460ddfdec7792bff7a232a2
- rio: https://gist.github.com/jficz/2e2398b98eb636a963fcbc1f12d052ab
-
rio does a lot more reads than others:
❯ grep '^read(' *log|cut -d\( -f1|sort | uniq -c 288 foot.log:read 370 wezterm.log:read 1025 rio.log:read -
some reads take as much as 15-25ms. The read time distribution is roughly:
- 85 % <1ms (amounting to ~40ms in total)
- 14.5 % 1-7ms (amounting to ~422ms in total)
- 0.5 % 15-25ms (amounting to ~75ms in total) these numbers are worse than in reality, strace slows things down a bit, but it gives an idea in relative terms
I don't know what the takeaways are here tho, except that Rio is obviously doing a lot more stuff than
the other two terms and that reads take 50 % of the startup time (including the exec of true and cleanup)
off, thank you so much for investigating this. will try to look on my linux