diff-so-fancy
diff-so-fancy copied to clipboard
colors: consistent color parsing and configuration
This replaces both of the diff-so-fancy's hand-rolled color reconstruction mechanisms with cached invocations of git config --get-color, which is fast and uses git's own parsing to construct the appropriate ANSI sequences.
Those invocations are cached on the config key (if there is one) or the color spec.
This does not introduce noticeable performance impact: timing git -c pager.log="${dsf}" log -p on this repo, between this branch and 1.4.2, actually shows a slight speed-up.
The commits in this PR progress as follows:
- Naïve work fixing the
colorsubroutine; then realizinggit_ansi_coloris used alongside it, and attempting to fix that too. - (^ Those commits are left for reference, but were made redundant).
- Realizing that they should be consolidated, and should be using
git config --get-coloranyway instead of just mimicking it; and then proceeding toward some refactors to allow those to be consolidated.
Along the way, there are a couple fixes to the default/fallback handling. The last couple commits use some of the newly-exposed features (e.g. italic/strike/dim) for some improved defaults.
Background
diff-so-fancy introduced three mutually incompatible syntaxes for parsing color specs:
git's parsing, which is used to drive the original coloredgit diffoutput, and by thecontrib/diff-highlightpager bundled with git. That syntax is likewhite brightred bold ul reverseand supports the 8 standard ANSI names (black/red/green/yellow/blue/magenta/cyan/white) +bright…variants; ANSI colors like124, and hex like#ffcccc. It also supportsbold,dim,italic,ul,blink,reverse, andstrikeattributes and their negations (noboldetc).colors()which interpreted syntax likewhite_bold_on_black; but usedinverseinstead ofreverse, usedunderlineinstead oful; did not support thedimorstrikeattributes; did not support attribute negation; and did not understandbright…names. It also arbitrarily maintained two additional non-standard color names (orange and purple).git_ansi_colors()which attempted to more-closely mimicgit's parsing, but only supportedboldandreverse. (Nodim,italic,underline,blink, orstrike; no negations).
This was frustrating, because there are overlapping config keys between diff-so-fancy, diff-highlight, and git diff, and the lowest-common-denominator configuration was a little tricky and too limited.
While trying to fix a crash related to the reverse ↔︎ inverse mismatch—which in retrospect was a red herring, since color() doesn't appear to be invoked on user config…but whatever!—it became clear that this complexity was unnecessary and likely less performant than letting git itself—which is already a dependency—do the work.
Whoa... there is a LOT going on here. Lots of files and lines changed. My first reaction is that these are some pretty hefty changes that put a decent developmental burden on developers to update/maintain.
What exact problem is this solving?
We purposely kept the color parsing simple (no more than three octets), and it has generated little or no bug reports. All color parsing is done in Perl save the one call to git --config, so it should be speedy. See QuickSilver release. We purposely did not implement every ANSI feature that Git supports. If there is one you'd like to add I would be open that, but I believe we already support the configs of 99% of users.
Also, please see the third_party/cli_bench/cli_bench.pl script I wrote to test for speed changes between next and your branch. Perhaps that can illuminate and potential speed changes.
A quick comparison of next versus this PR.
next
:perl third_party/cli_bench/cli_bench.pl /tmp/diff-so-fancy-next < /tmp/raw.diff
...................................................
23 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (35.0%)
24 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (52.5%)
25 ms: %%%%%%%%%%%%%%%%%%%% (12.5%)
Ran '/tmp/diff-so-fancy-next' 50 times with average completion time of 23 ms
rwe
:perl third_party/cli_bench/cli_bench.pl /tmp/diff-so-fancy-rwe < /tmp/raw.diff
...................................................
39 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (20.0%)
40 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (40.0%)
41 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (40.0%)
Ran '/tmp/diff-so-fancy-rwe' 50 times with average completion time of 40 ms
23 ms vs 40 ms is a slowdown of about 74%. One of our primary goals has been to keep the startup (and ultimately runtime) of this script as low as possible.
@scottchiefbaker — Thanks for the input and additional measurements.
The timing isn't close to what I'd measured, so I think something else must be going on. Note that the benchmark you ran is only consuming the diff during the first run; after that stdin will be empty and no diff is run at all.
But…you're right: running the above with quoted args ('./diff-so-fancy < raw.diff') shows a slowdown between next 1a0ad54d at 425ms to this branch (currently cb73fa04) at 476ms, so I'll dig into what can be done about it.
Re maintenance: the intention was the opposite, actually. The change removes the need to maintain that parsing logic at all, while at the same time ensuring that the capabilities track to at least what git supports. The commit stream might be a bit too fine-grained, but if you look at the diff, there are broadly two simple changes:
- The default values got moved into maps — just a simplification to maintain single-source-of truth to make caching easier.
- The color parsing is accomplished with
git config --get-color $config $color.
@rwe oh good catch. I forgot to put quotes around my commands.
next
:perl third_party/cli_bench/cli_bench.pl '/tmp/diff-so-fancy-next < /tmp/raw.diff'
...................................................
29 ms: %%%%%%%%%%%%%%%%%%%%%%%% (15.0%)
30 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (32.5%)
31 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (52.5%)
Ran '/tmp/diff-so-fancy-next < /tmp/raw.diff' 50 times with average completion time of 30 ms
rwe
:perl third_party/cli_bench/cli_bench.pl '/tmp/diff-so-fancy-rwe < /tmp/raw.diff'
...................................................
53 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (22.5%)
54 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (50.0%)
55 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (27.5%)
Ran '/tmp/diff-so-fancy-rwe < /tmp/raw.diff' 50 times with average completion time of 54 ms
So we're looking at 30 ms vs 54 ms... still about 80% slower. I'm not sure how keen I am to mimic every nuance of what Git allows for color schemes at the expense of the technical debt for this code, and the slowness incurred.
Do you know what version of Git git config --get-color was added in? Is this a newer Git thing, or has it been around
forever?
Do you know what version of Git git config --get-color was added in? Is this a newer Git thing, or has it been around forever?
Forever—some digging shows it was introduced in 2007, git 1.5.4: https://github.com/git/git/commit/9ce0352258a421b654b5db145a42d07cbaef416c
The overhead can be removed entirely by bulk-fetching with --type=color --get-regexp, but would need git version to 2.18 (3.5 years ago): https://github.com/git/git/commit/63e2a0f8e9cc3d66137a72e424a8b59f1c4dbd79.
I came here after I stumbled upon the fact that diff-so-fancy neither supports hex colors nor git's default color attributes dim or ul. This feels quite unintuitive, since the colors are configured in gitconfig and basically use the same syntax as git's standard color configuration options, so I would expect this to work in the same way every other color-related git feature is configured.
@rwe Thanks for your work here! This makes the color configuration for diff-so-fancy exactly like I would it expect to work.
@scottchiefbaker
So we're looking at 30 ms vs 54 ms... still about 80% slower. I'm not sure how keen I am to mimic every nuance of what Git allows for color schemes at the expense of the technical debt for this code, and the slowness incurred.
I understand that speed is a main goal. 80% slower sounds huge when expressed in relative numbers. In absolute numbers, however, we're talking about a few milliseconds here. I would strongly prefer being able to configure colors just like in git over saving 24ms (which I won't notice, ever).
And concerning the technical debt: the patch in this PR actually removes technical debt. It does not try to mimic nuances of Git, it simply uses its features that are already there. Just look at the patchset, it's 415 lines deleted against 71 lines added.
So apart from the speed issue (which is a theoretical but not a real-world issue imo) I think there's no decent reason to not accept this change.
Is there a chance for this to get merged? Would really appreciate it!
We just landed several other commits that address ul, bright, and dim. We cover 95% of the use cases for Git color codes. At risk of making the code much more complex I think I'm OK with that percent.
If there is something specific people want to see added please open another issue specific to that. Hex colors (I didn't even know Git supported that)?
At this time, shelling out to git to get the colors affects start up time, and that's a non-starter for me right now.