screenshot: fix %f corner case
URLs can end with trailing slashes (/) which in turn results in GNU basename[1] returning the empty string. Catch that case and fallback on the raw value of mpctx->filename. Subsequent path sanitation takes care of translating invalid path component chars.
Issue reproduction steps:
mpv \
--screenshot-dir=$HOME/mpv-shots \
--screenshot-template='%f/%P' \
https://example.org/video/
This would result in %f expanding to '' and thus render screenshot-template an absolute path which, for some reason, would in turn take precedence of screenshot-dir and hence result in a non-writeable path, i.e. /timestamp.ext.
With this new approach the resulting path looks like this:
/home/user/mpv-shots/http:__example.org_video_/timestamp.ext
Not particularly pretty but less ambiguous and more consistent over a diverse range of input paths. For transparency and reference also see [2] which sadly could not be revived.
[1] https://github.com/mpv-player/mpv/pull/14635#issuecomment-2269456187 [2] https://github.com/mpv-player/mpv/pull/17015
Read this before you submit this pull request: https://github.com/mpv-player/mpv/blob/master/DOCS/contribute.md
Check!
I also changed the mp_basename to a (GNU) basename call, just for kicks. Which begs the question: is there a particular reason to re-implement it? Because from my POV, all the prerequisites for GNU base-/dirname seem to be met; source is _GNU_SOURCE and string.h is included virtually everywhere.
This just an aside, though. If you want mp_basename, I will comply. But this looks like an opportunity to trim the code base. If there is interest I might be willing to replace all the mp_(basename|dirname) calls; in a separate PR, of course.
According to DOCS/contribute.md GNU features should not be used and it may not work on Windows.
I am a little embarassed for missing that. I grepped for _GNU_SOURCE and saw it as a meson build default. Sorry!
Sorry for the last force push, had to rewrite history.
Download the artifacts for this pull request:
Windows
I've also changed mp_property_filename to use mp_basename_or_url, mainly because of consistency, but it does kill two birds with one stone in that stats.lua now shows full URLs. This may be surprising to some unsuspecting users. It also may make the issue of very long values more pressing, i.e. shortening them on display.
Why are you copying things from my PR? You are duplicating review work for maintainers. This should be closed and further comments should go in my PR.
Sorry, I did not mean to offend. I gave you credit and it is not a direct copy, because as you can see in my last commits mp_basename should not change behavior. It just needs to be used in the appropriate places, i.e. mp_splitext.
Sorry, had to squash some commits after discovering some subtle bugs in mp_splitext. It would've split the wrong portion, if there were more than one leading dots. And the check for / in the split portion did not account for Windows paths with \ separators. mp_basename usage solves that problem.
Also, the check for initial dots was in the wrong place, in case path was not already a basename.
Yeah I also realized mp_splitext has those issue, and also see https://github.com/mpv-player/mpv/pull/17022#issuecomment-3507827060. We can continue in this PR.
I suggest:
- squashing "screenshot: fix %f corner case" and "screenshot: use new wrapper for %f expansion"
- mentioning #10975 in the commit message of "command: use mp_basename_or_url for filename"
- squashing "command: use mp_basename_or_url for filename" and "DOCS/man/input.rst: Update filename property"
- splitting
mp_splitextfixes out of "path_utils: new mp_stripext shorthand" - squashing "screenshot: only strip ext from basename" and "screenshot: use mp_stripext for %F"
It's also worth noting that we also have bstr_strip_ext and bstr_get_ext but those don't check for URLs like mp_splitext so I'm not sure how we should handle all these similar functions?
I am having second thoughts on the multiple leading dots issue. I guess,
(Edit: I've just realized that it cannot happen because all initial dots get skipped; but it can happen with the old version)/path/to/...ext is technically a file named .. with ext as extension. So the root could end up as /path/to/... Used in the wrong place/context that could lead to unexpected directory traversal.
But I don't want to change anything else for the time being to not have yet another push. There is also a warning about incompatible pointer conversion I missed and want to shut up, just as a reminder to myself.
It's also worth noting that we also have
bstr_strip_extandbstr_get_extbut those don't check for URLs likemp_splitextso I'm not sure how we should handle all these similar functions?
I didn't know about them. I just saw mp_splitext used in the filename/no-ext property getter and figured to get rid of the duplicated effort. Maybe the case of handling URLs with those is covered by teaching them mp_basename?
But I am having a hard time wrapping my head around bstr usage. Can you give me or point me to a quick rundown, what purpose they server? I know from the bstring(3) manual that they are not necessarily \0-terminated. But that does not seem to explain why they might be useful in handling paths. Or is it, so one can have a long string and split on \0 like xargs --null?
Ideally there should only be one of each *{strip,split,get}ext, I guess. mp_splitext can do both in one call, get the extension and split it of. So maybe *_{get,strip}_ext can become shorthand wrappers around it?
See https://en.wikipedia.org/wiki/Null-terminated_string#Limitations and https://en.wikipedia.org/wiki/Null-terminated_string#Improvements
But using path_utils functions inside bstr.c is annoying because you have to allocate memory to convert it back to a C string, and bstr_strip_ext and bstr_get_ext calls don't benefit from treating URLs differently, so whatever I guess.
Another thing we can do is to replace mp_splitext with mp_strip_ext in video/out/vo_gpu_next.c, add mp_get_ext, and replace remaining mp_splitext calls with it, then make mp_split_ext static and only use it as the backend for the 2 new functions. (We should probably add the extra _ before ext in the names if we're changing these anyway).
While I am digesting your last digesting your last comment, @guidocella, an idea popped into my head. Could this perhaps be done in lua? I am asking because, I still don't quite like the duplication when expanding %f/F and %{filename[/no-ext]} and thought of writing a function just redirect those case to get the filename property, or maybe do a goto to the { case. But the interface to get a property seems rather involved, whereas its very ergonomic in the lua bindings. The important stuff is in place with the changed behavior of the filename property. And the other cases of % specifiers look like they are basically one get away in the lua interface. You know, do the nitty gritty with the nicer lua interface and then just issue the appropriate screenshot-to-file command.
So maybe this PR can serve as a stopgap measure to then port this stuff to lua, which I reckon to be way easier than doing this in C. It's not as if this is a hot code path.
I don't see how there is duplication. screenshot.c and retrieving properties are completely different. And properties are the API for clients running in different threads or processes, they are not meant to be used by mpv's core.
The cases %f/%F and %{filename[/no-ext]} are semantically identical but take two very different code paths which led to inconsistencies. I reckon this could all be outsourced to a Lua function that replaces the screenshot command with a wrapper around screenshot-to-file and remove the screenshot command from mpv's core. What looks clunky at times in C can be often times be expressed elegantly and more concisely in a higher level language such as Lua. And mpv's Lua interface is just great, so why not leverage it and get rid of code that does not need to be optimized to the last clock cycle.
There is nothing to outsource. Getting the filename is trivial. And core functionality should work in builds with Lua disabled.
And in said Lua version there will be one helper function that gets called in all the cases. Or maybe Lua can express cases better? Not sure, needs exploration. In Lua the %f would simply be mp.get('filename') is what I am trying to say. So all this being done in C kind of is code duplication, because someone provide a nice higher level interface to that functionality.
There is nothing to outsource. Getting the filename is trivial.
OK, but I was looking at that whole long switch case and especially %{ which reinvents mp.get.
And core functionality should work in builds with Lua disabled.
I think the core functionality can be boiled down to screenshot-to-file then? But I did not know that Lua was optional, I've figured it a hard dependency for a long time. Maybe it's time to make it official? There be benefits!
OK, but I was looking at that whole long
switchcaseand especially%{which reinventsmp.get.
No. Retrieving a single property is different from expanding property strings. And Lua functions are built on C functions, not the opposite.
I think the core functionality can be boiled down to
screenshot-to-filethen? But I did not know that Lua was optional, I've figured it a hard dependency for a long time. Maybe it's time to make it official? There be benefits!
Core functionality like playback and taking screenshots works fine without Lua, so it doesn't need to be required.
OK, but I was looking at that whole long
switchcaseand especially%{which reinventsmp.get.No. Retrieving a single property is different from expanding property strings. And Lua functions are built on C functions, not the opposite.
Yes, I know that. But why use the C functions with all that memory management overhead, i.e. dragging the talloc context along, and all that makes C difficult to deal with, when a simple mp.get will do? C is for performance critical code which this is not.
I think the core functionality can be boiled down to
screenshot-to-filethen? But I did not know that Lua was optional, I've figured it a hard dependency for a long time. Maybe it's time to make it official? There be benefits!Core functionality like playback and taking screenshots works fine without Lua, so it doesn't need to be required.
I am totally with you on playback, of course, but only half with screenshots. One can get 99% of screenshot with an external screenshooter - yes, I measured that. ;) Embedding playback context in the filename is very much what I'd call an extension. IIRC that's what the advertisement said, when mpv introduced the Lua interface: it's for extensions to the core functionality; screenshot-to-file being the glue between core functionality and higher level extensions in my (ideal) book.
But I don't want to argue. I am just trying to make a case for some code trimming to get a leaner meaner mpv core which really only provides media playback as best as can be done. Screenshots are media creation and as such not a show stopper if they did not exist at all as concept in mpv; there was, after all, a time before screenshot.
P.S.: Just think about it. I don't how big the audience of the thread is, but maybe others think differently. And if, in the end, we still disagree: no harm done.
Yes, I know that. But why use the C functions with all that memory management overhead, i.e. dragging the talloc context along, and all that makes C difficult to deal with, when a simple
mp.getwill do? C is for performance critical code which this is not.
It literally doesn't. I just explained that retrieving properties and expanding property strings are different. You are thinking of the expand-text command which calls the same mp_property_expand_string function as screenshot.c under the hood.
I am totally with you on playback, of course, but only half with screenshots. One can get 99% of
screenshotwith an external screenshooter - yes, I measured that. ;) Embedding playback context in the filename is very much what I'd call an extension. IIRC that's what the advertisement said, when mpv introduced the Lua interface: it's for extensions to the core functionality;screenshot-to-filebeing the glue between core functionality and higher level extensions in my (ideal) book.
That's fair if we were implementing screenshot for the first time but it's not worth porting existing working C code and break all custom user-configured screenshot bindings and likely introduce bugs for little advantage.
Yes, I know that. But why use the C functions with all that memory management overhead, i.e. dragging the talloc context along, and all that makes C difficult to deal with, when a simple
mp.getwill do? C is for performance critical code which this is not.It literally doesn't. I just explained that retrieving properties and expanding property strings are different. You are thinking of the
expand-textcommand which calls the samemp_property_expand_stringfunction asscreenshot.cunder the hood.
Don't get hung up on the mp.get; I oversimplified. And I was not thinking of expand-text. A different example, so we don't just talk about filename: case p can be done by f = f .. mp.get('playback-time'); and just like that the timestamp got appended to the filename. And look at all the overhead to get the same result in C. (I also smell a bug there because P is supposed to be with millisecond, yet there is no further distinction, so it's the same for p)
I am totally with you on playback, of course, but only half with screenshots. One can get 99% of
screenshotwith an external screenshooter - yes, I measured that. ;) Embedding playback context in the filename is very much what I'd call an extension. IIRC that's what the advertisement said, when mpv introduced the Lua interface: it's for extensions to the core functionality;screenshot-to-filebeing the glue between core functionality and higher level extensions in my (ideal) book.That's fair if we were implementing
screenshotfor the first time but it's not worth porting existing working C code and break all custom user-configured screenshot bindings and likely introduce bugs for little advantage.
Worse things have happened. I was surprised quite a few times by mpv dropping functionality from its core and changing options etc. Most of the time I could see why and made my peace with new reality otherwise. It's not like this has to be an explosion and I think you may be overestimating the impact. The simplest case would be to ask users to install Lua. Since options are exposed as properties, nothing else needs to change.
Just to be clear. I am not talking about getting rid of screenshot (the command). I am suggesting something like, say screenshot.lua to build the correct screenshot-to-file command, so it won't be gone, just a provider change. Most users are arguably running some distro provided version, when we're talking Linux where I reckon the most users. And since distros tend to enable everything there should be hardly any friction. Others may be building from source but the probably installed the build dependencies their distro used, e.g. apt build-dep mpv on a DEB based system. So they'd have to actively disable Lua afterwards. I've just checked, meson setup $builddir enables Lua, simply because the headers were found.
Don't get hung up on the
mp.get; I oversimplified. And I was not thinking ofexpand-text. A different example, so we don't just talk aboutfilename:case pcan be done byf = f .. mp.get('playback-time'); and just like that the timestamp got appended to the filename. And look at all the overhead to get the same result in C. (I also smell a bug there becausePis supposed to be with millisecond, yet there is no further distinction, so it's the same forp)
That's just 3 lines of typical boilerplate like everywhere in mpv's codebase, no crazy overhead for which everything must be rewritten.
Worse things have happened. I was surprised quite a few times by mpv dropping functionality from its core and changing options etc. Most of the time I could see why. It's not like this has to be an explosion and I think you may be overestimating the impact. The simplest case would be to ask users to install Lua. Since options are exposed as properties, nothing else needs to change.
Continuing to use the options for a script would be incorrect and lead to arguing, see https://github.com/mpv-player/mpv/pull/15655#discussion_r1903991092. Whereas changing them will be a breaking change, and script-bindings and script-opts are more verbose/less convenient to use than commands and options, so there's that too. I also don't think I'm overestimating the impact since users continuously complain about every change, and in fact specifically about screenshot options I remember that when we renamed --screenshot-directory to --screenshot-dir people complained about the warning and we had to change it from OPT_REPLACED to OPT_ALIAS.
2023-10-29 20:47:43 CounterPillow Warning: option --screenshot-directory was replaced with --screenshot-dir and might be removed in the future.
2023-10-29 20:47:43 CounterPillow wow thanks for pointlessly renaming options! I love the constant churn of shit being broken for no reason
2023-10-29 20:58:04 NRK the directory -> dir renames should've been aliased instead
2023-10-29 20:58:14 Dudemanguy they are
2023-10-29 20:58:22 NRK but they print warnings
2023-10-29 20:59:27 Dudemanguy NRK: well yes that's what the macro does
2023-10-29 21:00:42 Dudemanguy I guess there's OPT_CLI_ALIAS
2023-10-29 21:00:50 NRK Dudemanguy: i'm saying they should've been OPT_ALIAS instead. idk about watch-later-directory but screenshot-directory is probably very common.
Also for every added script there is the overhead of adding a manual section to document it, add boilerplate code in several files to load it, add and document an option to disable it, and have it constantly running in its own thread, though the overhead is minimal.
Just to be clear. I am not talking about getting rid of
screenshot(the command). I am suggesting something like, sayscreenshot.luato build the correctscreenshot-to-filecommand, so it won't be gone, just a provider change.
Please stop the ridiculous and off-topic ideas. screenshot can be either a command or a script binding. mpv's core can't even interact with scripts synchronously.
Message received.
One last thing I just have to say to this because it does demonstrate, that 99.9% can be achieved with plain screenshot-to-file and a key binding with Lua disabled.
Adding this to input.conf:
# interestingly enough ${screenshot-dir} does not work here
s async screenshot-to-file "~/tmp/mpv-shots/${filename}/${playback-time}.${screenshot-format}"
emulates my initial example (--screenshot-dir=~/tmp/mpv-shots --screenshot-template='%f/%P') almost perfectly, except I have not found a millisecond resolution property, but that's beside the point.
Now I shut up about the matter. I will go through the chore list and come back when I'm done. Thanks for all your insights.
${playback-time/full} has milliseconds. I guess the main advantage of --screenshot-template over that is %n.