%-flag inconsistencies
I recently submitted a PR to fix some of the inconsistencies with %-flags. However, as soon as I got into the code, I realized that the handling of %-flags is even more inconsistent than I thought, and that it is implemented almost entirely by hand, flag by flag. In addition, much of it is not documented in the manual, e.g. %"f or the %#f that were recently discussed on the mailing list.
One issue is the inconsistency between %#f (where # is an argument number) and %f. %f will fail if there are no input arguments, i.e. if the substitution would be empty. However, for %#f, if you specify an argument number that is empty or doesn't exist, it just substitutes the empty string and keeps going. These seem like opposite behaviors, and I advocate making %#f fail as well. This change would be breaking, but the current behavior also seems little known and rarely used.
On a more technical side, tup_printf works by repeatedly calling estring_append to build the tup string, but only in a few cases does it check whether estring_append returns -1 to indicate that it couldn't reallocate the string, and in all cases where this happens tup_printf simply exits without signaling why. I don't see why this is done, other than laziness or because the error return code was added after much of tup_printf was written. My impression is that this should always be checked, but maybe I'm missing something.
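To make the "always check" suggestion concrete, here is a minimal sketch of the pattern. It does not use tup's real code: struct buf, buf_append, append_or_report and join_two are hypothetical stand-ins, and buf_append only mimics the behavior described above (return -1 when the string can't be reallocated). The error message text is also made up.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for tup's growable string; not the real estring. */
struct buf {
	char *s;
	size_t len;
	size_t cap;
};

/* Append len bytes of src; returns 0 on success, -1 if realloc fails. */
static int buf_append(struct buf *b, const char *src, size_t len)
{
	if (b->len + len + 1 > b->cap) {
		size_t ncap = b->cap ? b->cap : 16;
		char *ns;

		while (ncap < b->len + len + 1)
			ncap *= 2;
		ns = realloc(b->s, ncap);
		if (!ns)
			return -1;
		b->s = ns;
		b->cap = ncap;
	}
	memcpy(b->s + b->len, src, len);
	b->len += len;
	b->s[b->len] = '\0';
	return 0;
}

/* Checked wrapper: say why we are failing, then let the caller bail out. */
static int append_or_report(struct buf *b, const char *src, size_t len)
{
	if (buf_append(b, src, len) < 0) {
		fprintf(stderr, "error: out of memory while expanding %%-flags\n");
		return -1;
	}
	return 0;
}

/* Example caller: every append is checked and the failure is propagated. */
static int join_two(struct buf *b, const char *first, const char *second)
{
	if (append_or_report(b, first, strlen(first)) < 0)
		return -1;
	if (append_or_report(b, " ", 1) < 0)
		return -1;
	if (append_or_report(b, second, strlen(second)) < 0)
		return -1;
	return 0;
}
```

Routing every append through one checked helper means a failed reallocation can neither be silently ignored nor reported inconsistently, which is the property I'd want tup_printf to have.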
Other strange choices include the fact that capitalizing a letter has the same effect in the two places it's done, but is only available in those two. There are also several flags that only work in certain contexts (or are documented as such), but it's unclear why: e.g. why does %O only work if there is a single output file? I can see a use case for that, but it doesn't seem like it should be a requirement.
Ultimately, what I propose is changing %-flags to be more of a mini format language. As a regex, what I think this should look like is %["']?([1-9][0-9]*)?[fFbBeoOiI], plus %[%dg], which don't really fit this pattern.
- Adding ["'] as it does now will add that quote around every file, but it currently only works for %["'][of].
- Adding ([1-9][0-9]*)? like it does now will select the input, order-only input, or output at that index.
- Each letter does what it does now, the capitalized version does the same but with the extension stripped, e.g. %F would include any directories but not the extension. The only letter that can't be capitalized is %E for obvious reasons.
- All of these will throw an error if the resulting substitution would be empty.
- They will also all work in any context where the data is available, e.g. %O won't fail if there is more than one output.
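As a rough illustration of how small the parser for this grammar could be, here is a hand-written sketch for the %["']?([1-9][0-9]*)?[fFbBeoOiI] pattern. The names struct pflag and parse_pflag are hypothetical, nothing here is taken from tup's source, and the %[%dg] flags would still need their own handling.

```c
#include <stddef.h>
#include <string.h>

/* Parsed form of one proposed %-flag; all names here are hypothetical. */
struct pflag {
	char quote;   /* '"' or '\'', or 0 if there is no quote prefix */
	int index;    /* 1-based argument number, or 0 meaning "all" */
	char letter;  /* one of fFbBeoOiI */
};

/*
 * Parse one flag starting at s[0] == '%'.
 * Returns the number of bytes consumed, or -1 if s is not a valid flag.
 */
static int parse_pflag(const char *s, struct pflag *out)
{
	const char *p = s;

	if (*p++ != '%')
		return -1;

	/* Optional quote character. */
	out->quote = 0;
	if (*p == '"' || *p == '\'')
		out->quote = *p++;

	/* Optional index: no leading zero, matching [1-9][0-9]*. */
	out->index = 0;
	if (*p >= '1' && *p <= '9') {
		while (*p >= '0' && *p <= '9') {
			if (out->index > 100000)  /* reject absurd indexes, avoids overflow */
				return -1;
			out->index = out->index * 10 + (*p++ - '0');
		}
	}

	/* Mandatory flag letter; strchr would also match the terminator. */
	if (*p == '\0' || !strchr("fFbBeoOiI", *p))
		return -1;
	out->letter = *p++;

	return (int)(p - s);
}
```

With the prefix, index and letter parsed in one place, the "error if the substitution would be empty" rule could then be applied uniformly after expansion instead of per flag.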
I'm happy to work on a PR to consolidate and add this, but I wanted to see a) if other people think it makes sense, and b) to solicit feedback on what the appropriate result would look like. Some of this constitutes breaking changes, depending on how you look at it. Is changing %#-flags to throw errors when the output would be empty really breaking, given that almost no one seems to know that it's a feature? Would changing %O to work when there are multiple outputs be breaking? I don't see why anyone would be relying on that failure, but it's potentially an issue.
It would be nice to be able to add modifiers to the flags like in make, e.g. $(@D) for the directory part of the output.
%f, %b and %B are all variants of the inputs, whereas the output only has a single option, %o.
It would be much nicer to have something like %f, %fb, %fd, %fe, where the last letter is a modifier to get the base/directory/extension part. Then this would apply to %o as well: %o, %ob, %od and %oe.
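A sketch of what the modifier step could look like, to show that one helper would serve both %f and %o. The function name path_part and the exact semantics are assumptions: in particular, I'm assuming 'b' keeps the extension (like the current %b) and 'e' returns the extension without the dot.

```c
#include <stddef.h>
#include <string.h>

/*
 * Select part of path according to a hypothetical modifier letter:
 *   0    whole path         (%f, %o)
 *   'b'  file name          (%fb, %ob), extension kept (assumption)
 *   'd'  directory part     (%fd, %od), empty if path has no '/'
 *   'e'  extension, no dot  (%fe, %oe), empty if the name has no '.'
 * The result is a slice of path: *start points into it, *len is its length.
 * Returns 0 on success, -1 for an unknown modifier.
 */
static int path_part(const char *path, char mod, const char **start, size_t *len)
{
	const char *slash = strrchr(path, '/');
	const char *name = slash ? slash + 1 : path;
	const char *dot = strrchr(name, '.');

	switch (mod) {
	case 0:
		*start = path;
		*len = strlen(path);
		return 0;
	case 'b':
		*start = name;
		*len = strlen(name);
		return 0;
	case 'd':
		*start = path;
		*len = slash ? (size_t)(slash - path) : 0;
		return 0;
	case 'e':
		*start = dot ? dot + 1 : name + strlen(name);
		*len = dot ? strlen(dot + 1) : 0;
		return 0;
	default:
		return -1;
	}
}
```

Because the helper only returns a slice of the original path, the same code path works regardless of whether the path came from an input or an output.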
And while at it (similar to #456), I really miss a flag (%oc maybe) to get the output path relative to the directory where tup is executed from, so I can copy/paste that path. That makes it easier if you want to run some commands afterwards, e.g. "ls" or "ar t" followed by the pasted path. We use that a lot in our current make setup.