
player/lua/stats.lua: ellipsize filename

Open WhitePeter opened this issue 1 month ago • 13 comments

For the time being this is a POC to eventually fix/mitigate the display of long URLs in OSD stats, see https://github.com/mpv-player/mpv/issues/10975#issuecomment-3476747266.

This introduces a simple ellipsize function and an additional append attribute max_len to automagically filter the str argument. The hard-coded length of 32 chars was chosen for testing purposes. Ideally it can be dynamically changed depending on the available space but I haven't figured out yet if and how that can be determined. If it can't be done, providing a script option is easy enough.

In conjunction with #17021 this could be a fix for #10975.

P.S.: I chose to do this in stats.lua because it fixes the immediate issue and Lua comes with some batteries included that make this easier and more readable. But I am pondering whether extending the filename attribute getter would be the better place to do this, i.e. a new subproperty /short. Anyway, for now this is an easy (partial) win.

  • [x] Read this before you submit this pull request: https://github.com/mpv-player/mpv/blob/master/DOCS/contribute.md

WhitePeter, Nov 14 '25 00:11

stats already had this for terminal output and we removed it in a187110f4a. Clipping is already handled by \q2, which is accurate for proportional fonts, and ${term-clip-cc}, which considers multi-cell characters.

guidocella, Nov 14 '25 07:11

> stats already had this for terminal output and we removed it in a187110. Clipping is already handled by \q2, which is accurate for proportional fonts, and ${term-clip-cc}, which considers multi-cell characters.

But I don't like that the clipping is done at the end. I want something that inserts an ellipsis in the middle if at all possible. I feel I may be getting somewhere with Lua, though, and I want to try some things before I push. With some inspiration from keyname_cells and some digging in the Lua wiki, I think I can get something working. It won't be perfect, since that would be too expensive IMO, but I don't think it has to be for stats.lua. What's the worst that can happen? Some chars off-screen?

WhitePeter, Nov 14 '25 09:11
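
A middle ellipsis of the kind described in the comment above can be sketched as follows. This is an illustrative Python sketch, not the PR's Lua code; the name `ellipsize_middle` and its parameters are hypothetical. It counts code points only, so it ignores glyph width and grapheme clusters.

```python
def ellipsize_middle(text: str, max_len: int, ellipsis: str = "...") -> str:
    """Shorten text to at most max_len code points, keeping its head and tail.

    Hypothetical helper for illustration; not part of mpv or stats.lua.
    """
    if max_len < 0:
        raise ValueError("max_len must be non-negative")
    if len(text) <= max_len:
        return text  # already short enough, return unchanged

    keep = max_len - len(ellipsis)  # code points left for head + tail
    if keep <= 0:
        return ellipsis[:max_len]  # no room for any original text

    head = (keep + 1) // 2  # the head gets the extra code point if keep is odd
    tail = keep // 2
    # Slice the tail from an explicit index so that tail == 0 yields "".
    return text[:head] + ellipsis + text[len(text) - tail:]
```

Splitting the budget between head and tail keeps the file extension visible as long as it fits in the tail half.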

Also, isn't stats.lua what's responsible for rendering the stats pages (default keys i/I)? How does terminal output figure in that? (edit: never mind, I see my mistake now; changed title to disambiguate) This is mainly about the File: field in stats page 1.

WhitePeter, Nov 14 '25 09:11

That is not feasible with ASS. You can't know the text length before rendering short of using osd-overlay with compute_bounds everywhere. You are arbitrarily cutting a few characters completely ignoring the window width, font size and window resizing while the text is displayed. Why should users only see 32 characters on a 4k window with small font size?

It is also completely arbitrary to do this for the stats filename and not for any other OSD message. If anything, this should be implemented in libass.

This also doesn't fix the linked issue which is about the output when printing tracks to the terminal, and has no relation to stats.

guidocella, Nov 14 '25 09:11

Those 32 chars are just for testing, because my filename examples are rather short. It is meant to be configurable at least, eventually. I held off while this is still POC.

Re: text width, the underlying assumption is that the worst case is a monospace font. So if the text fits using that it sure does with proportional. Doesn't that make it easier to determine what max_len to aim for?

Anyway, if this cannot be done because of limitations with ASS rendering, I'd still like to have the second best option, some script-opt to set, say filename_max_len (opt-in: if unset, no change compared to prior versions).

Or just tell me if this is a dead end so I don't waste any more time.

WhitePeter, Nov 14 '25 10:11

> Re: text width, the underlying assumption is that the worst case is a monospace font. So if the text fits using that it sure does with proportional. Doesn't that make it easier to determine what max_len to aim for?

This is so wrong. There is no correspondence between a character's size in bytes and the width of the glyphs it generates. For example fullwidth Chinese characters usually occupy two ASCII characters worth of space but consist of more than two bytes. When dealing with codepoints, which is what you should be doing at the very least, an ASCII character and a fullwidth character each count as only one.

In the terminal this is handled by agreeing on a function for character width in cells that programs and terminal emulators use (see: wcwidth). This doesn't give perfect results and many programs/terminals don't do it properly, but it works. I think this is what guido means above when he talks about mpv handling it.

In the GUI world I like to believe that we hold ourselves to high standards and do precise width calculations after shaping text with the appropriate fonts. This gets complicated when doing things like line-breaking (or inserting ellipses) though due to technicalities of text layout. This is why for it to be perfect this has to have work done by libass.

afishhh, Nov 14 '25 10:11
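
The three measures mentioned in the comment above (bytes, codepoints, terminal cells) can be compared directly. This is a rough Python sketch that assumes `unicodedata.east_asian_width` as a stand-in for `wcwidth`; it is an approximation for illustration, not what mpv or any particular terminal uses, and `cell_width` is a hypothetical helper.

```python
import unicodedata


def cell_width(text: str) -> int:
    """Approximate terminal cell count for text.

    Wide and fullwidth characters count as 2 cells, combining marks as 0,
    everything else as 1. A simplification of what wcwidth-style functions do.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks attach to the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


# Compare UTF-8 byte length, code point count, and approximate cell width.
for sample in ("movie", "映画"):
    print(sample, len(sample.encode("utf-8")), len(sample), cell_width(sample))
```

None of these numbers says anything about the rendered width in a proportional font, which is the point made about needing text shaping in the GUI case.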

> > Re: text width, the underlying assumption is that the worst case is a monospace font. So if the text fits using that it sure does with proportional. Doesn't that make it easier to determine what max_len to aim for?

> This is so wrong.

Or just oversimplified?

> There is no correspondence between a character's size in bytes and the width of the glyphs it generates. For example fullwidth Chinese characters usually occupy two ASCII characters worth of space but consist of more than two bytes.

I know as much. But knowing that a certain cluster of bytes will result in the equivalent of two chars width in a terminal (or monospace font) and knowing the font size should be enough for calculating the space required, no? And if one does use a monospace font, the result should fit perfectly in said space. If the font is proportional, all that's lost is some vacant space.

> In the GUI world I like to believe that we hold ourselves to high standards and do precise width calculations after shaping text with the appropriate fonts.

But you don't, do you? I've just played a video file with a >200 char filename and File: in the stats page (i) just seems to render what doesn't fit off-screen. There is a half character at the rightmost edge suggesting so. And for such cases, i.e. when absolute URIs get returned verbatim by the filename property, I think it would be good to ellipsize. Doesn't have to be perfect but it will be an improvement to status quo.

> This gets complicated when doing things like line-breaking (or inserting ellipses) though due to technicalities of text layout. This is why for it to be perfect this has to have work done by libass.

I think perfection can never be achieved anyway, so why not take something that will work in the vicinity of good enough and doesn't explode in case it misses the mark by some exotic unicode symbol or two. And if one is fine with status quo, don't opt-in once the option exists and you'll be none the wiser.

WhitePeter, Nov 14 '25 11:11

> I know as much. But knowing that a certain cluster of bytes will result in the equivalent of two chars width in a terminal (or monospace font) and knowing the font size should be enough for calculating the space required, no? And if one does use a monospace font, the result should fit perfectly in said space. If the font is proportional, all that's lost is some vacant space.

But you don't know the width of ASS output without compute_bounds.

> But you don't, do you? I've just played a video file with a >200 char filename and File: in the stats page (i) just seems to render what doesn't fit off-screen. There is a half character at the rightmost edge suggesting so. And for such cases, i.e. when absolute URIs get returned verbatim by the filename property, I think it would be good to ellipsize. Doesn't have to be perfect but it will be an improvement to status quo.

It is worse because it cuts arbitrarily leaving empty space, while clipping used all available space at any window and font width, with no code required on our side.

> I think perfection can never be achieved anyway, so why not take something that will work in the vicinity of good enough and doesn't explode in case it misses the mark by some exotic unicode symbol or two. And if one is fine with status quo, don't opt-in once the option exists and you'll be none the wiser.

It is not good enough to cut a fixed number of bytes. And it is still stupid to do for a single string in all of mpv.

guidocella, Nov 14 '25 11:11

> > I know as much. But knowing that a certain cluster of bytes will result in the equivalent of two chars width in a terminal (or monospace font) and knowing the font size should be enough for calculating the space required, no? And if one does use a monospace font, the result should fit perfectly in said space. If the font is proportional, all that's lost is some vacant space.

> But you don't know the width of ASS output without compute_bounds.

So the script-opt variant then.

> > But you don't, do you? I've just played a video file with a >200 char filename and File: in the stats page (i) just seems to render what doesn't fit off-screen. There is a half character at the rightmost edge suggesting so. And for such cases, i.e. when absolute URIs get returned verbatim by the filename property, I think it would be good to ellipsize. Doesn't have to be perfect but it will be an improvement to status quo.

> It is worse because it cuts arbitrarily leaving empty space, while clipping used all available space at any window and font width, with no code required on our side.

But clipping has downsides too, as someone else has pointed out in the linked issue, IIRC. Having the start, the end and '...' in the middle leaves more context, especially the file extension.

> > I think perfection can never be achieved anyway, so why not take something that will work in the vicinity of good enough and doesn't explode in case it misses the mark by some exotic unicode symbol or two. And if one is fine with status quo, don't opt-in once the option exists and you'll be none the wiser.

> It is not good enough to cut a fixed number of bytes. And it is still stupid to do for a single string in all of mpv.

Then you can just leave said option alone (unset) and be golden. As I've said before, this is not necessarily exclusive to the file name. Any append caller can just set the attribute and be done with it. It's up to the caller to determine max_len beforehand, or the user to set a script-opt, which is not there yet.

WhitePeter, Nov 14 '25 13:11

@guidocella Also, where is clipping even happening right now? I couldn't find anything that seems to be doing it, and the File: field in stats.lua page 1 definitely doesn't do it. The last thing on that line with a >200 char (pure ASCII) filename is half a char, with no '..' in sight.

WhitePeter, Nov 14 '25 14:11

I guess it doesn't actively clip ASS output, it just doesn't wrap if there are no spaces in the filename and it keeps going beyond the window. It does clip terminal output with term-clip-cc.

guidocella, Nov 14 '25 14:11

Now I get it, the terminal thing. stats.lua works with --no-video as well, for instance. That's when term-clip-cc comes into play. But if I were to use that for OSD purposes the clipping would be done to terminal line length which is totally uncoupled from vo dimensions. So the way I see it, there is currently no way of clipping or ellipsizing in the OSD.

WhitePeter, Nov 14 '25 15:11

ASS clipping is done with \q2, without ellipsis. See https://aegisub.org/docs/latest/ass_tags/

guidocella, Nov 14 '25 15:11
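
For reference, `\q2` is the ASS wrap-style override that turns off automatic word wrapping for a line, so overlong text runs past the edge instead of breaking. A minimal Python sketch of prefixing an event's text with it follows; it is illustrative only (stats.lua itself is Lua), the helper name `no_wrap` is hypothetical, and it assumes the text has already been escaped for ASS.

```python
def no_wrap(ass_text: str) -> str:
    """Prefix ASS event text with the q2 override (wrap style 2: no word wrapping).

    Hypothetical helper; it does not escape braces or backslashes in ass_text,
    so the caller is assumed to have done that already.
    """
    return r"{\q2}" + ass_text
```

Nothing here adds an ellipsis; the override only prevents wrapping.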