Add prompt expression for file size before preprocessing
In both v590 and v603-beta, less's prompt replaces %B with the size of the file after input preprocessing, rather than before. This is surprising behavior, as the file name (%f) explicitly uses the original input as a source.
Rather than counting the number of bytes after preprocessing, I feel it should count them beforehand. I can't conceive of any technical barrier to doing so, if only as an additional command-line option.
My specific goal here is to replicate the default configuration of bat's header in less's prompt[^1]. The expression ?fFile\\: %f:STDIN.\nSize\\: ?B%B:-. accomplishes that in theory[^2], but the current design means that the file size includes every color code added by bat[^3].
If there is any interest in such a change, I may contribute it myself.
[^1]: I intend to move it to the header once that feature is released in a stable version.
[^2]: I can't see a way to make specific sections of the prompt bold, or change its background color, but I can understand not making the formatting options too complicated.
At that point it'd probably be easier to add a prompt expression that moves the first N lines/bytes of processed input into the prompt, thereby making it the input preprocessor's problem.
[^3]: I am using bat as an input pipe for syntax highlighting.
%B expands to the size of the file as it appears to the user. Generally, the unprocessed file size is irrelevant, as that is not what is being viewed. For example, if a zip file is expanded by the preprocessor, the expanded size is what is relevant to the user; the size of the zip file itself is of no value.
The example of added color markup counting toward the file size is a more generic issue. It doesn't seem to me to be particularly closely related to preprocessing. For example, when you view a man page with color markup, the file size includes the color codes (as it should) even though the codes are not directly visible and no preprocessor is involved. Removing color codes from the count would in general be expensive, since it would require scanning the entire file. It is true that in the particular case that a preprocessor does nothing except add color codes, the original file size would give you the size without color codes, but I think that that is a rather uncommon thing for a preprocessor to do. I'm reluctant to add a feature for such a specialized use case.
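To make the size discrepancy concrete, here is a minimal sketch for a POSIX shell. The `colorize` function is a hypothetical stand-in for a preprocessor that does nothing but add color codes (it is not bat, and it is not how less itself measures size); it wraps each line in an SGR red/reset pair so the two byte counts can be compared.

```shell
# Create a small sample file (12 bytes: "hello\nworld\n").
tmp=$(mktemp)
printf 'hello\nworld\n' > "$tmp"

# Literal ESC byte (octal 033), needed to build SGR sequences.
esc=$(printf '\033')

# Hypothetical stand-in for a colorizing preprocessor:
# prefixes each line with ESC[31m (red) and appends ESC[m (reset).
colorize() { sed "s/.*/${esc}[31m&${esc}[m/" "$1"; }

# Size of the original file versus size of the processed output.
orig_size=$(wc -c < "$tmp" | tr -d ' ')
shown_size=$(colorize "$tmp" | wc -c | tr -d ' ')

echo "original: $orig_size bytes, after preprocessing: $shown_size bytes"
rm -f "$tmp"
```

Each line gains 8 bytes of markup here, so the processed size grows with the number of lines even though the visible text is unchanged.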
Perhaps a more generally useful feature would be to add a prompt key that expands to a string from the beginning of the processed output, as you suggest, or from the environment. This is just off the top of my head; I'd need to think some more about whether it is a good solution.
Regarding prompt styling, you can include SGR escape sequences in the prompt if the -R flag is set. For example, this displays the filename in underline mode:
less -R '-Psfile is ^[[4m%f^[[m' filename
The two instances of ^[ represent a literal ESC character.
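Typing a literal ESC character into a shell command can be awkward. One way around that, sketched here for a POSIX shell, is to generate the byte with `printf`. The variable names `esc` and `prompt_opt` are illustrative only.

```shell
# Produce a literal ESC byte; \033 is the octal code for ESC (0x1B).
esc=$(printf '\033')

# Same prompt as above: "file is " followed by the filename underlined.
# ESC[4m turns underline on, ESC[m resets all attributes.
prompt_opt="-Psfile is ${esc}[4m%f${esc}[m"

# Interactive usage (not run here):
#   less -R "$prompt_opt" filename
```

Keeping the option in a variable also makes it easy to reuse, for example by appending it to the `LESS` environment variable.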
> %B expands to the size of the file as it appears to the user. Generally, the unprocessed file size is irrelevant, as that is not what is being viewed. For example, if a zip file is expanded by the preprocessor, the expanded size is what is relevant to the user; the size of the zip file itself is of no value.
I see your point, and I agree it is probably worth preserving the existing behavior. That being said, I doubt I am the only one expecting the prompt to "ignore" the processing, and the documentation indicates that the prompt is meant, at least to some extent, to reflect the original file rather than the processed one:
> You may define an "input preprocessor" for less. Before less opens a file, it first gives your input preprocessor a chance to modify the way the contents of the file are displayed. An input preprocessor is simply an executable program (or shell script), which writes the contents of the file to a different file, called the replacement file. The contents of the replacement file are then displayed in place of the contents of the original file. However, it will appear to the user as if the original file is opened; that is, less will display the original filename as the name of the current file.

> %B Replaced by the size of the current input file.

> %f Replaced by the name of the current input file.
Both the filename and size are properties of "the current input file", so it is odd that they diverge here. Even if you decide not to add or change anything in the program itself, I feel those descriptions should be tweaked to avoid referring to distinct subjects in the exact same way.
> Removing color codes from the count would in general be expensive, since it would require scanning the entire file.
I do not think such a specific feature would be a good idea: it is trivial to think of other modifications a preprocessor might make. Whitespace formatting, for instance.
My understanding is that less currently pipes a file into an input pipe, then counts the bytes that get piped back out: it should therefore be easy to simply count the bytes that get piped in.
As for your ZIP file example, I'd usually be more concerned with the disk usage of a ZIP file than with its expanded size. At any rate, if a new prompt expression were added for the unprocessed size (or retrofitted into %s, which currently has no clear reason for existing, as a breaking change), you could make the prompt show both sizes.
Clarified documentation in ff5ec01261dd9068394490d464f40a392554b591.