Unicode hyphen in man page instead of ASCII hyphen-minus
If you copy a flag from the man page, it won't work, as the man page is wrongly formatted and uses Unicode hyphens ‐ (U+2010) instead of the ASCII hyphen-minus - (U+002D).
Steps to reproduce the bug:
- Open 2 terminals.
- Run "man scrot" in one.
- Type "scrot" in the other and wait before doing anything else.
- Copy any option from the manpage to the other terminal. I do this very often. I will use "‐‐select" in this example.
- scrot will save the screenshot as "‐‐select" instead of running interactively, as the "-" character is not the one on your keyboard but a Unicode one.
Aftermath:
ordici@chuwi-OpenSUSE:~/test
> scrot ‐‐select
ordici@chuwi-OpenSUSE:~/test
> ls
‐‐select
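A quick way to confirm which character ended up in a pasted option is to look for the UTF-8 bytes of U+2010 (E2 80 90). The following is only a minimal sketch; the function name has_unicode_hyphen is made up for illustration and is not part of scrot:

```shell
#!/bin/sh
# Hypothetical helper (not part of scrot): succeeds if the argument
# contains U+2010 HYPHEN, which is encoded as bytes E2 80 90 in UTF-8.
has_unicode_hyphen() {
    unicode_hyphen=$(printf '\342\200\220')
    case "$1" in
        *"$unicode_hyphen"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: pass the option copied from the man page as the argument.
# Here the pasted text is rebuilt from its bytes so the example is unambiguous.
if has_unicode_hyphen "$(printf '\342\200\220\342\200\220select')"; then
    echo "contains U+2010, not ASCII hyphen-minus"
fi
```

An option typed on the keyboard, such as "--select", makes the function return a non-zero status instead.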
The man page source itself does not contain the Unicode character. After comparing it to another man page, I saw that the other one uses the escape character "\" before each "-". Adding these escapes to the scrot man page and opening it with man -l fixed the problem.
Therefore, here is a hacky way to fix the problem by adding a substitution command to create-man.sh:
sed -i 's/-/\\-/g' scrot.1
Now, this is not as good as it could be, as it also replaces every "-" that is not part of an option, such as the one in "scrot-1.10" at the bottom of the page, or the one in
NAME
scrot - command line screen capture utility
Other man pages keep these as the Unicode hyphen, but it is still a lot better than what we have currently. Figuring out why my commands did not work took me a long time, because the imposter character is visually indistinguishable from the real one.
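One caveat with the in-place variant: running the same substitution a second time on an already processed file turns each \- into \\-. A minimal sketch that can be re-run safely is to undo existing escapes first. The function name escape_hyphens is hypothetical, and the sketch assumes the page contains no intentional \\- sequences:

```shell
#!/bin/sh
# Hypothetical filter (not in create-man.sh): escape every hyphen on stdin
# as \- without double-escaping hyphens that are already escaped.
# Assumption: the input has no intentional "\\-" sequences.
escape_hyphens() {
    sed -e 's/\\-/-/g' -e 's/-/\\-/g'
}

printf '%s\n' '--select' | escape_hyphens                    # \-\-select
printf '%s\n' '--select' | escape_hyphens | escape_hyphens   # \-\-select
```

Like the original command, this still escapes hyphens that are not part of an option.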
Apparently, groff recently stopped fixing the issue of broken man pages and reverted to original AT&T behaviour and this is the reason why this bug is only now being recognized: https://lwn.net/Articles/947941/
The scrot manpage gets generated by txt2man. Manually adding \ in front of a hyphen in man/scrot.txt does not work correctly. The resulting manpage gets severely messed up.
Applying some sed post-processing in man/create-man.sh to turn each - into \- works:
function create-man {
-txt2man -d "$T2M_DATE" -t $T2M_NAME -r $T2M_NAME-$T2M_VERSION -s $T2M_LEVEL -v "$T2M_DESC" $T2M_NAME.txt > $T2M_NAME.$T2M_LEVEL
+txt2man -d "$T2M_DATE" -t $T2M_NAME -r $T2M_NAME-$T2M_VERSION -s $T2M_LEVEL -v "$T2M_DESC" $T2M_NAME.txt | sed 's|-|\\-|g' > $T2M_NAME.$T2M_LEVEL
}
This will of course render every - as an ASCII hyphen-minus. But IMO that's fine. The Unicode hyphen doesn't serve any practical purpose in manpages anyway.
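To check that the post-processing caught everything, the generated page could be scanned for hyphens that are not preceded by a backslash. This is only a sketch: has_unescaped_hyphen is a made-up name, and it assumes the page never contains an escaped backslash directly followed by a hyphen:

```shell
#!/bin/sh
# Hypothetical check (not part of scrot): succeeds if stdin contains a "-"
# that is not directly preceded by a backslash.
has_unescaped_hyphen() {
    grep -q -E '(^|[^\\])-'
}

# Example on two roff fragments instead of a real scrot.1:
printf '%s\n' '\fB--select\fP' | has_unescaped_hyphen && echo "unescaped hyphen found"
printf '%s\n' '\fB\-\-select\fP' | has_unescaped_hyphen || echo "all hyphens escaped"
```

Feeding the generated file to the function on stdin would allow create-man.sh to fail loudly if a hyphen slipped through.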
What I'm wondering now is whether this change should be made on our end or if this is something that should be done in txt2man instead.
Ping: @eribertomota