This is fun and i like it!

Open clort81 opened this issue 2 years ago • 46 comments

What's the fastest sixel viewer?

clort81 avatar Aug 27 '21 00:08 clort81

Good question! I'd love to have an answer to that on the lsix page.

I mostly use XTerm, the second slowest sixel interpreter I've ever seen, which is good enough for lsix. I can't recommend XTerm for general image viewing or animations -- although that's what I often use it for. mlterm and foot seem much faster.

For the Fastest Sixel Shooter in the West, I've seen some impressive sixel demos from @dankamongmen, so you may want to check out notcurses.

@j4james: any thoughts on a good sixel benchmark? It would be nice to have a good one for whatever test suite comes out of the vt340test.

I'm not sure if this will be helpful, but I wrote a gif viewer that measured frames per second by sending an inquiry to the terminal and presuming the image had finished rendering once it got a response.
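A minimal sketch of that inquiry-based timing, not sixvid's actual code: after each frame it sends the standard cursor position query (ESC [ 6 n) and blocks until the terminal's report (ESC [ row ; col R) comes back. The function names and the file-descriptor parameters are hypothetical, and a real run would need the tty in cbreak or raw mode (for example via Python's tty.setcbreak) so the reply can be read without waiting for Enter.

```python
import os
import re
import time

CPR_QUERY = b"\x1b[6n"                           # DSR 6: "report cursor position"
CPR_REPLY = re.compile(rb"\x1b\[(\d+);(\d+)R")   # reply: ESC [ row ; col R


def wait_for_cpr(in_fd):
    """Block until a cursor position report arrives on in_fd; return (row, col)."""
    buf = b""
    while True:
        chunk = os.read(in_fd, 64)
        if not chunk:
            raise EOFError("terminal closed before replying")
        buf += chunk
        match = CPR_REPLY.search(buf)
        if match:
            return int(match.group(1)), int(match.group(2))


def timed_frames(frames, out_fd, in_fd):
    """Send each pre-encoded frame followed by a query; return frames per second.

    Assumption: the terminal only answers the query after it has consumed
    the sixel data that preceded it, so the reply marks the end of the frame.
    """
    start = time.monotonic()
    for frame in frames:
        os.write(out_fd, frame + CPR_QUERY)
        wait_for_cpr(in_fd)
    elapsed = time.monotonic() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

Only one query is outstanding at a time, so a fast sender cannot run ahead of the terminal by filling buffers; that is the point of the handshake.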

hackerb9 avatar Aug 27 '21 16:08 hackerb9

While I appreciate the kind words from @hackerb9 , Notcurses's sixel implementation as of 2.3.x leaves some room for improvement. I'd say the current state-of-the-art is libsixel, which is (on many images) faster and more accurate than the homegrown Notcurses solution (with that said, Notcurses can handle some images that cause libsixel to fail). I don't expect this to remain the case for very long, but it's true at the moment. With regards to the Kitty graphics protocol, Notcurses is probably the most advanced.

I have some timing data here for ncplayer vs timg: https://github.com/dankamongmen/notcurses/discussions/1857

Note that there are some design tradeoffs that can affect raw benchmarks. With Kitty, you can either sideload a file or transmit its contents directly; the former is obviously faster, but only works on the local machine. You can use or not use compression; it slows down your image encoding, but can speed up use over a network (though probably not if you already use compression with OpenSSH). With Sixel, you can share palettes among multiple images, which cuts overhead but is obviously only a win when there are multiple images available. There's then the issue of startup time -- a program with high startup time can see it amortized over multiple loads, but will look bad if invoked once per image. Finally, how much parallelism is used? High parallelism might look great when run by itself, but not look so good when the machine is otherwise loaded. Etc, etc, etc.
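To make the Kitty tradeoff concrete, here is a rough sketch of the two transmission styles: pixel data sent in-band (optionally zlib-compressed) versus a file path the terminal reads itself. This is not Notcurses code. The key names (a, f, s, v, t, o, m) and the 4096-byte chunk limit follow my reading of the kitty graphics protocol and should be checked against the spec; the function names are hypothetical.

```python
import base64
import zlib

ESC_G = "\x1b_G"   # APC introducer used by the kitty graphics protocol
ST = "\x1b\\"      # string terminator


def kitty_direct(data, width, height, compress=False, chunk=4096):
    """Yield escape sequences that send raw 24-bit RGB pixels in-band (t=d)."""
    keys = f"a=T,f=24,s={width},v={height},t=d"
    if compress:
        data = zlib.compress(data)      # costs encoder time, saves bytes on the wire
        keys += ",o=z"
    payload = base64.standard_b64encode(data).decode("ascii")
    pieces = [payload[i:i + chunk] for i in range(0, len(payload), chunk)] or [""]
    for n, piece in enumerate(pieces):
        more = 1 if n < len(pieces) - 1 else 0          # m=1: more chunks follow
        control = f"{keys},m={more}" if n == 0 else f"m={more}"
        yield f"{ESC_G}{control};{piece}{ST}"


def kitty_file(path):
    """Return one sequence asking the terminal to load a PNG from disk (t=f)."""
    payload = base64.standard_b64encode(path.encode()).decode("ascii")
    return f"{ESC_G}a=T,f=100,t=f;{payload}{ST}"
```

A 100x100 RGB frame becomes ten in-band chunks uncompressed, while the file variant is a single short sequence regardless of image size, which is why it only helps when terminal and program share a filesystem.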

With smaller images, it ought to be entirely possible to drive hundreds of frames per second.

dankamongmen avatar Aug 27 '21 16:08 dankamongmen

it is interesting, for instance, to run ncls on a directory and also run lsix.

dankamongmen avatar Aug 27 '21 16:08 dankamongmen

oh if you were asking about fastest display as opposed to encoding, Sixel is fastest on WezTerm and foot in my experience. Kitty is fast on both Kitty and WezTerm, the only two implementations of which i am aware.

dankamongmen avatar Aug 27 '21 17:08 dankamongmen

@j4james: any thoughts on a good sixel benchmark? It would be nice to have a good one for whatever test suite comes out of the vt340test.

in addition to time performance, it would be awesome to compare quantized image with original. i've got a bug on this here: https://github.com/dankamongmen/notcurses/issues/1724

dankamongmen avatar Aug 27 '21 17:08 dankamongmen

I haven't really looked, but the only sixel benchmark I've come across was https://github.com/jerch/sixel-bench, which just measures the throughput of a video sequence that's been pre-encoded as sixel.

It's not really a good way to compare implementations though, because it doesn't account for the rendering frame rate. For example, if your renderer is the bottleneck, you can improve your throughput dramatically just by dropping 90% of the frames, but that's not what most people would consider an improvement.
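A toy model of that effect, with made-up per-frame costs chosen only for illustration: every frame still has to be parsed, but only the frames that are not dropped pay the rendering cost.

```python
def stream_time(frames, parse_s, render_s, drop_fraction=0.0):
    """Seconds to consume `frames` frames when every frame is parsed
    but only the non-dropped ones are rendered."""
    rendered = frames * (1.0 - drop_fraction)
    return frames * parse_s + rendered * render_s


# Hypothetical costs: 1 ms to parse a frame, 9 ms to render it.
honest = stream_time(1000, parse_s=0.001, render_s=0.009)
lossy = stream_time(1000, parse_s=0.001, render_s=0.009, drop_fraction=0.9)
speedup = honest / lossy   # measured throughput gain from dropping frames
```

With these numbers the terminal that drops 90% of frames finishes roughly five times sooner while showing a tenth of the picture, so a throughput-only benchmark would rank it well ahead.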

That said, I do think it could still be useful for a terminal developer evaluating improvements in their own implementation, as long as they understand what they're actually measuring.

j4james avatar Aug 28 '21 00:08 j4james

Note: To avoid false high throughput artefacts by aggressive prebuffering, the script waits for a cursor position report sent from the terminal after the Sixel data.

@j4james nice touch!

dnkl avatar Aug 28 '21 10:08 dnkl

So, I just tried my ancient sixvid gif viewer on the sixel capable terminals that are packaged with the current Debian GNU/Linux (11, bullseye) and got some results that were not what I expected.

Terminal Frames per Second
foot 1.6.4 169 FPS
mlterm 3.9.0 240 FPS
XTerm(366) 223 FPS

hackerb9 avatar Aug 28 '21 22:08 hackerb9

Here's my sixvid gif viewer if you'd like to see relative speeds of sixel implementations on your own computers: https://github.com/hackerb9/sixvid

Usage: sixvid nyantocat.gif

(Hit the b key to toggle benchmarking mode).

hackerb9 avatar Aug 29 '21 01:08 hackerb9

So, I just tried my ancient sixvid gif viewer on the sixel capable terminals that are packaged with the current Debian GNU/Linux (11, bullseye) and got some results that were not what I expected.

Terminal Frames per Second
foot 1.6.4 169 FPS
mlterm 3.9.0 240 FPS
XTerm(366) 223 FPS

interesting. here's ncplayer -bpixel -d0 ../data/notcursesIII.mkv -t0 -q:

term version pgeom cgeom times
mlterm 3.9.0 880x1406 80x61 1m5.142s 1m3.731s 1m3.631s
xterm 368 880x1403 88x74 55.655s 55.433s 55.841s
alacritty 0.13.1 880x1400 88x70 55.716s 55.902s 55.709s

notcurses 2.3.17. were you running foot in a standalone Wayland, or xwayland? i'm not sure the latter would be a particularly fair comparison.

dankamongmen avatar Aug 29 '21 02:08 dankamongmen

if we look beyond sixel:

term version pgeom cgeom times
kitty 0.23.1 880x1440 88x70 33.179s 33.477s 33.385s
kitty 0.19.3 880x1440 88x70 54.722s 54.473s 54.867s

dankamongmen avatar Aug 29 '21 02:08 dankamongmen

i put these stats up at https://nick-black.com/dankwiki/index.php?title=Notcurses#Pixel_blitters. @kovidgoyal, i know you like this kind of thing =]

dankamongmen avatar Aug 29 '21 02:08 dankamongmen

Yeah thanks, and am happy to see no surprises there, and this is without using side band transmission even, I think?

kovidgoyal avatar Aug 29 '21 02:08 kovidgoyal

I couldn't get sixvid to work - the screen just went black. I don't know whether it's dependent on something I don't have installed. I didn't spend much time trying to figure it out, but will give it another go tomorrow. In the meantime, though, I have run sixel-bench on some of the terminals in my collection.

As I mentioned above, some will just drop frames (sometimes all frames), so when that's obviously the case I've just discounted them from the running (that includes Rxvt, St, and Yaft). I've also not included any Windows terminals, because I wanted to limit the competition to the same VM.

All of the ones below at least gave the appearance of rendering all the frames, so I think there's a reasonable chance they're competing fairly, but I wouldn't read too much into these results. Ordered from fastest to slowest:

Rank Terminal Time
1 VTE 4s
2 MLTerm 10s
3 Alacritty 21s
4 XTerm 23s
5 Contour 27s
6 WezTerm 159s
7 DomTerm 525s

I don't understand why WezTerm was so slow given the earlier praise it got from @dankamongmen - I did download the latest nightly to see if that made any difference, but no such luck. I don't know whether perhaps my use of a VM might affect some terminals more than others.

At any rate, VTE looks to me to be the winner by a long way. It's possible it gets that speed from not rendering all the frames, but it wasn't obviously doing anything like that. It just looked fast and smooth.

j4james avatar Aug 29 '21 02:08 j4james

were you running foot in a standalone Wayland, or xwayland? i'm not sure the latter would be a particularly fair comparison.

A good question and one I am gratified to be able to answer promptly: I do not know. It runs at the same speed when I unset the DISPLAY, if that means anything.

hackerb9 avatar Aug 29 '21 02:08 hackerb9

I couldn't get sixvid to work - the screen just went black.

Huh. I wonder if ImageMagick is calling ffmpeg even for GIF decoding.

hackerb9 avatar Aug 29 '21 02:08 hackerb9

j4james: my praise for WezTerm is for its kitty implementation. I recall it being unremarkable with sixel, though not as poor as your results would suggest. if you built from source, did you use "cargo build --release"? unoptimized rust is very very slow.

that VTE number shocks me. I didn't even think VTE implemented sixel?

dankamongmen avatar Aug 29 '21 03:08 dankamongmen

were you running foot in a standalone Wayland, or xwayland? i'm not sure the latter would be a particularly fair comparison.

A good question and one I am gratified to be able to answer promptly: I do not know. It runs at the same speed when I unset the DISPLAY, if that means anything.

so foot, AFAIK, is a wayland-only terminal. the others are mostly X, though a few can do both. I'm not sure how you would run foot in Xorg without running Xwayland, interesting.

dankamongmen avatar Aug 29 '21 03:08 dankamongmen

Yeah thanks, and am happy to see no surprises there, and this is without using side band transmission even, I think?

correct, Notcurses eschews zee sideband

dankamongmen avatar Aug 29 '21 03:08 dankamongmen

with that said, my sixel vs kitty numbers oughtn't be taken out of the Notcurses context. my sixel quantization algorithm, as I've mentioned, is not where I'd like it to be. with that said, the kitty 0.19.3 numbers are very comparable to the best sixel numbers, but the kitty 0.23.1 runs crush them. this is due to improvements in kitty's internals, improvements in the kitty protocol, and Notcurses taking advantage of those improvements.

with that said, I'm a big believer in the superiority of the kitty protocol in just about every sense, even (perhaps counter-intuitively) implementation complexity (no need to deal with heights that aren't multiples of 6, no need to avoid the bottom row, no need to quantize).
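For a sense of what the sixel side involves, here is a bare-bones encoder sketch. It is not the Notcurses or libsixel implementation, and it assumes the image has already been quantized to a palette, which is the hard part it skips. It shows the six-row banding: the last band of an image whose height is not a multiple of 6 simply leaves its missing rows unset. The function name and the input layout are hypothetical.

```python
def encode_sixel(pixels, palette):
    """Encode a palette-indexed image as a sixel string.

    pixels:  list of rows, each row a list of palette indices
    palette: list of (r, g, b) tuples with components in 0-255
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    header = ["\x1bPq", f'"1;1;{width};{height}']   # DCS q + raster attributes
    for i, (r, g, b) in enumerate(palette):
        # Sixel colour components are percentages (0-100), not 0-255.
        header.append(f"#{i};2;{r * 100 // 255};{g * 100 // 255};{b * 100 // 255}")
    bands = []
    for top in range(0, height, 6):
        band = pixels[top:top + 6]        # the last band may hold fewer than 6 rows
        planes = []
        for colour in sorted({idx for row in band for idx in row}):
            chars = []
            for x in range(width):
                bits = 0
                for dy, row in enumerate(band):
                    if row[x] == colour:
                        bits |= 1 << dy   # bit 0 is the top pixel of the band
                chars.append(chr(63 + bits))
            planes.append(f"#{colour}" + "".join(chars))
        bands.append("$".join(planes))    # '$' rewinds to overprint the next colour
    return "".join(header) + "-".join(bands) + "\x1b\\"   # '-' starts the next band
```

Each colour present in a band needs its own pass over the full width, and there is no run-length compression here, so real encoders do considerably more work than this.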

dankamongmen avatar Aug 29 '21 03:08 dankamongmen

if you built from source, did you use "cargo build --release"?

Nope, I just downloaded the nightly deb package from github and installed that. Many of the others were built from source, though, so that's definitely worth bearing in mind. A different compiler could easily make a difference to the performance. As I said, don't read too much into my results.

I didn't even think VTE implemented sixel?

Like XTerm, you've got to build it yourself with the appropriate option enabled. I don't think you'll find it in a released package anywhere.

no need to avoid the bottom row

Technically you shouldn't need to avoid the bottom row with sixel either - that's just a bug in most implementations.

j4james avatar Aug 29 '21 03:08 j4james

@j4james I've put in some sanity checks so hopefully you'll get an error message now instead of a blank screen.

hackerb9 avatar Aug 29 '21 03:08 hackerb9

Technically you shouldn't need to avoid the bottom row with sixel either - that's just a bug in most implementations.

@j4james just schooled me on that one this week. 😳 I had been complaining that my VT340 doesn't go to the next line at the end of sixels. Turns out that's a feature.

hackerb9 avatar Aug 29 '21 03:08 hackerb9

I didn't even think VTE implemented sixel?

Like XTerm, you've got to build it yourself with the appropriate option enabled. I don't think you'll find it in a released package anywhere.

huh. that number seems very suspect, as i've found the total time to generally be dominated by transfer, not by actual rendering. i'm likewise surprised that mlterm could possibly score so high, as it's consistently been the slowest in my testing. very strange.

dankamongmen avatar Aug 29 '21 03:08 dankamongmen

@j4james just schooled me on that one this week. I had been complaining that my VT340 doesn't go to the next line at the end of sixels. Turns out that's a feature.

oh absolutely, i'd love for that to be the case. i have a good hundred lines of code devoted to dealing with this annoyance. kitty meanwhile has c=1 which means DON'T SCROLL DAMNIT and is a fine thing.

dankamongmen avatar Aug 29 '21 03:08 dankamongmen

oh absolutely, i'd love for that to be the case. i have a good hundred lines of code devoted to dealing with this annoyance. kitty meanwhile has c=1 which means DON'T SCROLL DAMNIT and is a fine thing.

meanwhile the linux framebuffer console keeps drawing kinda-but-not-completely distinct from text, so i have to do all my image scrolling manually there =]

dankamongmen avatar Aug 29 '21 03:08 dankamongmen

@dankamongmen wrote:

I've found the total time to generally be dominated by transfer, not by actual rendering.

Interesting. I added a --shm option to sixvid to create the temporary sixel files in /dev/shm/$USER (if the machine has that filesystem mounted). Unsurprisingly, it doesn't give much of a speed boost since the sixel files are cached in memory after the first play through.

hackerb9 avatar Aug 29 '21 04:08 hackerb9

Yeah thanks, and am happy to see no surprises there, and this is without using side band transmission even, I think?

correct, Notcurses eschews zee sideband

Since, IIRC, times are dominated by encoding/transmission/decoding using the sideband should yield substantial improvements. Most well designed terminal emulators should not have rendering as a bottleneck.

kovidgoyal avatar Aug 29 '21 04:08 kovidgoyal

@dankamongmen writes:

I'm a big believer in the superiority of the kitty protocol in just about every sense, even (perhaps counter-intuitively) implementation complexity (no need to deal with heights that aren't multiples of 6, no need to avoid the bottom row, no need to quantize).

You'll find no argument. In fact, I think they may not even be comparable. It's not just that sixel is a protocol from 30 years ago. The kitty graphics protocol became a completely different type of thing once it added the notion of tagged images. Sixel splats bitmaps on the screen and forgets about them. A terminal that supports the kitty protocol must treat images as first class citizens, on the same level as text. That's a huge paradigm shift.

While I have some questions - how do sixel and kitty graphics interact? why can't text erase graphics? how do graphics interact with VT220 scrolling windows? do the images disappear as one would expect when switching text pages? does switching back make them reappear? why does loop=2 mean loop once, and loop=1 mean ∞? - I think kitty is the most probable future. @kovidgoyal has done a yeoman's job with the kitty graphics protocol and I look forward to it becoming standardized.

In the meantime, though, I'm having fun with sixels. ImageMagick understands them so I can do quick tests from the command line as I manipulate images or write tiny shell scripts like lsix. My preferred terminal, XTerm, has support builtin. I can browse the web using sixels in w3m. Even my humble DEC VT340 can display sixels, albeit at 9600 baud. Definitely not the future, but I like it.

hackerb9 avatar Aug 29 '21 05:08 hackerb9

On Sat, Aug 28, 2021 at 10:59:34PM -0700, hackerb9 wrote:

While I have some questions - how do sixel and kitty graphics interact?

They don't :)

why can't text erase graphics?

Because graphics can be over or under text and really have an existence independent of text.

how do graphics interact with VT220 scrolling windows?

You mean margins? https://sw.kovidgoyal.net/kitty/graphics-protocol/#interaction-with-other-terminal-actions

do the images disappear as one would expect when switching text pages?

If you mean the alternate screen, yes the main and alternate screen maintain their own independent image lists.

does switching back make them reappear?

to the main screen yes.

why does loop=2 mean loop once, and loop=1 mean ∞?

Because we are only allowed positive numbers and 0 means not specified. So 1 means infinite and numbers higher than that mean a finite number (n - 1).
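That rule can be written down as a tiny decoder. This is only a sketch of the mapping as described in the reply above; the function name is hypothetical and it is not taken from kitty's source.

```python
import math


def loop_count(value):
    """Translate the loop value into a number of loops.

    Returns None when the key is unspecified (0), math.inf for endless
    looping (1), and value - 1 for anything higher.
    """
    if value < 0:
        raise ValueError("only non-negative values are allowed")
    if value == 0:
        return None          # not specified
    if value == 1:
        return math.inf      # loop forever
    return value - 1         # finite number of loops
```

So loop=2 decodes to a single loop and loop=5 to four, matching the "n - 1" rule.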

kovidgoyal avatar Aug 29 '21 07:08 kovidgoyal

So, I just tried my ancient sixvid gif viewer on the sixel capable terminals that are packaged with the current Debian GNU/Linux (11, bullseye) and got some results that were not what I expected.

Terminal Frames per Second
foot 1.6.4 169 FPS
mlterm 3.9.0 240 FPS
XTerm(366) 223 FPS

This time I tried my sixvid script on a different video source (live action instead of an animated GIF) and the speeds reversed, with foot being the fastest.

hackerb9 avatar Aug 29 '21 08:08 hackerb9

If you mean the alternate screen, yes the main and alternate screen maintain their own independent image lists.

Similar, but not quite. I meant Page Memory, which is how sixel does double-buffering. You can choose which page to write on and which page to display and they don't have to be the same.

hackerb9 avatar Aug 29 '21 08:08 hackerb9

There's been some performance improvements to the sixel decoder in foot since 1.6.4. Still, sixvid with nyancat is much slower than I'd expect. This is something I'm going to want to look into.

(all benchmarks run on lousy laptop)

sixel-bench
foot 3.28s (36.13 MB/s)
mlterm 4.0s (29.62 MB/s)
xterm 61.14s (1.94 MB/s)

sixvid (nyancat)
foot 61 FPS
XTerm 17 FPS
MLTerm 118 FPS
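As a sanity check on the sixel-bench figures above: time multiplied by rate should give about the same payload size for every terminal, assuming all three were fed the same pre-encoded stream. The numbers are copied from the post; the variable names are just for illustration.

```python
# (seconds, MB/s) as reported for each terminal in the sixel-bench run.
results = {
    "foot": (3.28, 36.13),
    "mlterm": (4.0, 29.62),
    "xterm": (61.14, 1.94),
}

# Implied size of the data stream, in MB, for each terminal.
implied_mb = {term: secs * rate for term, (secs, rate) in results.items()}
```

All three come out at roughly 118.5 MB, so the reported times and rates are consistent with one another.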

dnkl avatar Aug 29 '21 08:08 dnkl

@dnkl wrote:

sixvid (nyancat)
foot 61 FPS
XTerm 17 FPS
MLTerm 118 FPS

Uh. What happened to your XTerm? 17 frames per second on nyantocat? I just pulled out an old 32-bit Pentium M laptop (circa 2005) to test and I get better FPS than that. (Not by much, mind you. But, still...) Surely your laptop isn't more than fifteen years old, right?

hackerb9 avatar Aug 29 '21 10:08 hackerb9

Uh. What happened to your XTerm?

Good question... turned out to be this:

xterm*maxGraphicSize: 10000x10000

With the default maxGraphicSize I get 110 FPS.

dnkl avatar Aug 29 '21 10:08 dnkl

And with 1920x1080 (to match my monitor), I get just above 100 FPS.

dnkl avatar Aug 29 '21 10:08 dnkl

I took a quick look at the nyancat issue, but I'm not really sure what's going on; none of the processes are using anywhere near 100% CPU.

My best guess atm is that sixvid is stalling on a full PTY pipe, since foot does not consume any PTY data while rendering. However, I would expect much higher CPU usage from foot if that was the case, as well as long rendering times. But most frames are rendered in less than 1ms.

I'll have to sit down one day and dig deeper into this. @hackerb9 thanks for what looks like a very interesting benchmark :D

dnkl avatar Aug 29 '21 11:08 dnkl

On Sun, Aug 29, 2021 at 01:48:35AM -0700, hackerb9 wrote:

If you mean the alternate screen, yes the main and alternate screen maintain their own independent image lists.

Similar, but not quite. I meant Page Memory, which is how sixel does double-buffering. You can choose which page to write on and which page to display and they don't have to be the same.

kitty doesn't implement paged memory; it has a continuous scrollback buffer instead. Images are preserved in the scrollback, up to some limit which the spec leaves up to implementations but recommends be at least several screenfuls of images. kitty stores image data on disk, encrypted with a one-time key, so its limits are fairly generous; I don't recall what they are off the top of my head.

The protocol doesn't talk about paged memory at all. If some terminal emulator wants to implement it and also the graphics protocol, I am happy to add to it.

The protocol includes ways to query the terminal emulator to check whether an image is resident. And when the terminal emulator's memory limits are being reached, it is supposed to evict the images that are the "oldest", i.e. furthest back in the scrollback.
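The eviction rule could look something like the sketch below. It is a hypothetical in-memory model, not kitty's implementation: insertion order stands in for "furthest back in the scrollback", and the class and method names are invented for illustration.

```python
from collections import OrderedDict


class ImageStore:
    """Keep images up to a byte budget, evicting the oldest first."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.used = 0
        self.images = OrderedDict()   # image id -> bytes, oldest first

    def add(self, image_id, data):
        """Store an image and return the ids evicted to make room."""
        if image_id in self.images:
            self.used -= len(self.images.pop(image_id))
        self.images[image_id] = data
        self.used += len(data)
        evicted = []
        # Never evict the image that was just added, even if it alone
        # exceeds the budget.
        while self.used > self.limit and len(self.images) > 1:
            old_id, old = self.images.popitem(last=False)
            self.used -= len(old)
            evicted.append(old_id)
        return evicted

    def is_resident(self, image_id):
        """Answer the 'is this image still loaded?' query."""
        return image_id in self.images
```

A client that cares about an old image would ask is_resident before placing it again and retransmit if it has been evicted.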

kovidgoyal avatar Aug 29 '21 12:08 kovidgoyal

@j4james I've put in some sanity checks so hopefully you'll get an error message now instead of a blank screen.

Thanks - that helped a lot. It turns out I just didn't have ffmpeg installed on my test VM.

I'm not going to list specific numbers this time, because there was quite a lot of fluctuation in the frame rates, but the results were in the same ballpark as the sixel-bench test. VTE is still at the top, and WezTerm and DomTerm are still clearly at the bottom. The midfield were fairly close, but if I had to order them, I'd consider Alacritty and Contour the leaders of that group now, with Xterm and MLTerm bringing up the rear (which seems more in line with the results @dankamongmen was seeing).

That said, something that came up in the sixvid test that wasn't apparent with sixel-bench, is that Alacritty eats through memory like there's no tomorrow. After running sixvid for a minute or so it had chewed up several gigs of memory and died.

Sixvid also enabled me to get what I think was a fairer test of Rxvt, St, and Yaft, since they were now actually displaying all the frames (or at least appeared to be). The first two didn't do very well though - around the same speed as WezTerm - and Rxvt had the same memory-eating issue as Alacritty. However, Yaft's performance was fantastic - it seems about 50% faster than VTE even. It is a framebuffer terminal, so maybe that's a factor, and it's possible it's just not showing all the frames, but from a user point of view it looked great.

j4james avatar Aug 29 '21 15:08 j4james

there was quite a lot of fluctuation in the frame rates

Yeah, I saw that from some of the terminal emulators. I had expected that to only happen from videos with differing levels of compressibility, but not nyantocat. Anyhow, I've added a final FPS reading when you quit the program which will give you an overall frames per second, starting from the point where the decoding and sixelizing finished.

That said, something that came up in the sixvid test that wasn't apparent with sixel-bench, is that Alacritty eats through memory like there's no tomorrow. After running sixvid for a minute or so it had chewed up several gigs of memory and died.

Yipes!

Yaft's performance was fantastic - it seems about 50% faster than VTE even. It is a framebuffer terminal, so maybe that's a factor, and it's possible it's just not showing all the frames, but from a user point of view it looked great.

Wow. I had benchmarked yaft as plenty fast, but middling on my machine. Are you using any sort of special framebuffer in the VM that might affect the performance measurement? Yaft lacks a scroll back buffer, right?

hackerb9 avatar Aug 31 '21 07:08 hackerb9

Anyhow, I've added a final FPS reading when you quit the program which will give you an overall frames per second, starting from the point where the decoding and sixelizing finished.

That's brilliant. Thanks.

Are you using any sort of special framebuffer in the VM that might affect the performance measurement?

Not that I'm aware of. But maybe it's just that the GUI-based terminals are at a disadvantage running in the VM because they're not getting the video acceleration they would usually get. The results for WezTerm in particular seem hard to believe.

Also I know I can't run foot at all, because it depends on Wayland, which doesn't work in my VM. And perhaps if I did have Wayland that would give a performance boost to some terminals that are otherwise limited by X11.

Yaft lacks a scroll back buffer, right?

Yeah I think so. There's no concept of a scrollbar, and mouse wheel scrolling doesn't seem to do anything.

j4james avatar Aug 31 '21 12:08 j4james

Anyhow, I've added a final FPS reading when you quit the program which will give you an overall frames per second, starting from the point where the decoding and sixelizing finished.

That's brilliant. Thanks.

I've fixed the FPS calculation to take much less time to converge. Previously, you had to wait quite a while before it gave the correct answer.

Also I know I can't run foot at all, because it depends on Wayland, which doesn't work in my VM. And perhaps if I did have Wayland that would give a performance boost to some terminals that are otherwise limited by X11.

What is your VM set up? Perhaps I or someone else can replicate your results.

hackerb9 avatar Sep 05 '21 02:09 hackerb9

What is your VM set up? Perhaps I or someone else can replicate your results.

The VM is Hyper-V, running on Windows 10.0.18363.1500. It's got 2 virtual processors, and it's allocated 3GB of memory (but with the dynamic memory option enabled, so I think it can use more than that). The guest OS is Ubuntu 20.04, but it wasn't installed as that initially (it's been upgraded a couple of times). I probably should try doing a clean install from the VM "quick create" setup, but I'm not convinced that'll make any difference.

j4james avatar Sep 05 '21 10:09 j4james

The VM is Hyper-V, running on Windows 10.0.18363.1500.

I can't replicate Hyper-V as I don't have Microsoft Windows. Do you have an old junker computer or laptop you could install Ubuntu onto? I think that'd make a bigger difference than trying to reinstall within Hyper-V.

Alternately, you could use a Live USB stick to try Ubuntu on the bare hardware without affecting your Microsoft Hyper-V installation.

hackerb9 avatar Sep 05 '21 18:09 hackerb9

Do you have an old junker computer or laptop you could install Ubuntu onto? Alternately, you could use a Live USB stick to try Ubuntu on the bare hardware without affecting your Microsoft Hyper-V installation.

Neither of those options is feasible for me, and I'm not that enthusiastic about performance testing anyway. I was just curious to get a general idea of the speed of other terminals to compare with my own crappy implementation. Personally I care more about the correctness of the Sixel emulation, and if I'm close to the middle of the range in performance, I'd consider that a win.

j4james avatar Sep 05 '21 19:09 j4james

I'm not that enthusiastic about performance testing anyway.

More than fair enough. I look forward to working with you more on correctness of sixel implementations.

hackerb9 avatar Sep 05 '21 21:09 hackerb9