diff-so-fancy icon indicating copy to clipboard operation
diff-so-fancy copied to clipboard

colors: consistent color parsing and configuration

Open rwe opened this issue 4 years ago • 6 comments

This replaces both of the diff-so-fancy's hand-rolled color reconstruction mechanisms with cached invocations of git config --get-color, which is fast and uses git's own parsing to construct the appropriate ANSI sequences.

Those invocations are cached on the config key (if there is one) or the color spec.

This does not introduce noticeable performance impact: timing git -c pager.log="${dsf}" log -p on this repo, between this branch and 1.4.2, actually shows a slight speed-up.

The commits in this PR progress as follows:

  • Naïve work fixing the color subroutine; then realizing git_ansi_color is used alongside it, and attempting to fix that too.
  • (^ Those commits are left for reference, but were made redundant).
  • Realizing that they should be consolidated, and should be using git config --get-color anyway instead of just mimicking it; and then proceeding toward some refactors to allow those to be consolidated.

Along the way, there are a couple fixes to the default/fallback handling. The last couple commits use some of the newly-exposed features (e.g. italic/strike/dim) for some improved defaults.

Background

diff-so-fancy introduced three mutually incompatible syntaxes for parsing color specs:

  • git's parsing, which is used to drive the original colored git diff output, and by the contrib/diff-highlight pager bundled with git. That syntax is like white brightred bold ul reverse and supports the 8 standard ANSI names (black/red/green/yellow/blue/magenta/cyan/white) + bright… variants; ANSI colors like 124, and hex like #ffcccc. It also supports bold, dim, italic, ul, blink, reverse, and strike attributes and their negations (nobold etc).
  • colors() which interpreted syntax like white_bold_on_black; but used inverse instead of reverse, used underline instead of ul; did not support the dim or strike attributes; did not support attribute negation; and did not understand bright… names. It also arbitrarily maintained two additional non-standard color names (orange and purple).
  • git_ansi_colors() which attempted to more-closely mimic git's parsing, but only supported bold and reverse. (No dim, italic, underline, blink, or strike; no negations).

This was frustrating, because there are overlapping config keys between diff-so-fancy, diff-highlight, and git diff, and the lowest-common-denominator configuration was a little tricky and too limited.

While trying to fix a crash related to the reverse ↔︎ inverse mismatch—which in retrospect was a red herring, since color() doesn't appear to be invoked on user config…but whatever!—it became clear that this complexity was unnecessary and likely less performant than letting git itself—which is already a dependency—do the work.

rwe avatar Oct 22 '21 22:10 rwe

Whoa... there is a LOT going on here. Lots of files and lines changed. My first reaction is that these are some pretty hefty changes that put a decent developmental burden on developers to update/maintain.

What exact problem is this solving?

We purposely kept the color parsing simple (no more than three octets), and it has generated little or no bug reports. All color parsing is done in Perl save the one call to git --config, so it should be speedy. See QuickSilver release. We purposely did not implement every ANSI feature that Git supports. If there is one you'd like to add I would be open that, but I believe we already support the configs of 99% of users.

Also, please see the third_party/cli_bench/cli_bench.pl script I wrote to test for speed changes between next and your branch. Perhaps that can illuminate and potential speed changes.

scottchiefbaker avatar Oct 23 '21 04:10 scottchiefbaker

A quick comparison of next versus this PR.

next

:perl third_party/cli_bench/cli_bench.pl /tmp/diff-so-fancy-next < /tmp/raw.diff
...................................................

23 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (35.0%)
24 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (52.5%)
25 ms: %%%%%%%%%%%%%%%%%%%% (12.5%)

Ran '/tmp/diff-so-fancy-next' 50 times with average completion time of 23 ms

rwe

:perl third_party/cli_bench/cli_bench.pl /tmp/diff-so-fancy-rwe < /tmp/raw.diff
...................................................

39 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (20.0%)
40 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (40.0%)
41 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (40.0%)

Ran '/tmp/diff-so-fancy-rwe' 50 times with average completion time of 40 ms

23 ms vs 40 ms is a slowdown of about 74%. One of our primary goals has been to keep the startup (and ultimately runtime) of this script as low as possible.

scottchiefbaker avatar Oct 23 '21 04:10 scottchiefbaker

@scottchiefbaker — Thanks for the input and additional measurements.

The timing isn't close to what I'd measured, so I think something else must be going on. Note that the benchmark you ran is only consuming the diff during the first run; after that stdin will be empty and no diff is run at all.

But…you're right: running the above with quoted args ('./diff-so-fancy < raw.diff') shows a slowdown between next 1a0ad54d at 425ms to this branch (currently cb73fa04) at 476ms, so I'll dig into what can be done about it.

Re maintenance: the intention was the opposite, actually. The change removes the need to maintain that parsing logic at all, while at the same time ensuring that the capabilities track to at least what git supports. The commit stream might be a bit too fine-grained, but if you look at the diff, there are broadly two simple changes:

  • The default values got moved into maps — just a simplification to maintain single-source-of truth to make caching easier.
  • The color parsing is accomplished with git config --get-color $config $color.

rwe avatar Oct 25 '21 20:10 rwe

@rwe oh good catch. I forgot to put quotes around my commands.

next

:perl third_party/cli_bench/cli_bench.pl '/tmp/diff-so-fancy-next < /tmp/raw.diff'
...................................................

29 ms: %%%%%%%%%%%%%%%%%%%%%%%% (15.0%)
30 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (32.5%)
31 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (52.5%)

Ran '/tmp/diff-so-fancy-next < /tmp/raw.diff' 50 times with average completion time of 30 ms

rwe

:perl third_party/cli_bench/cli_bench.pl '/tmp/diff-so-fancy-rwe < /tmp/raw.diff'
...................................................

53 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (22.5%)
54 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (50.0%)
55 ms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% (27.5%)

Ran '/tmp/diff-so-fancy-rwe < /tmp/raw.diff' 50 times with average completion time of 54 ms

So we're looking at 30 ms vs 54 ms... still about 80% slower. I'm not sure how keen I am to mimic every nuance of what Git allows for color schemes at the expense of the technical debt for this code, and the slowness incurred.

Do you know what version of Git git config --get-color was added in? Is this a newer Git thing, or has it been around forever?

scottchiefbaker avatar Oct 25 '21 20:10 scottchiefbaker

Do you know what version of Git git config --get-color was added in? Is this a newer Git thing, or has it been around forever?

Forever—some digging shows it was introduced in 2007, git 1.5.4: https://github.com/git/git/commit/9ce0352258a421b654b5db145a42d07cbaef416c

The overhead can be removed entirely by bulk-fetching with --type=color --get-regexp, but would need git version to 2.18 (3.5 years ago): https://github.com/git/git/commit/63e2a0f8e9cc3d66137a72e424a8b59f1c4dbd79.

rwe avatar Oct 25 '21 20:10 rwe

I came here after I stumbled upon the fact that diff-so-fancy neither supports hex colors nor git's default color attributes dim or ul. This feels quite unintuitive, since the colors are configured in gitconfig and basically use the same syntax as git's standard color configuration options, so I would expect this to work in the same way every other color-related git feature is configured.

@rwe Thanks for your work here! This makes the color configuration for diff-so-fancy exactly like I would it expect to work.

@scottchiefbaker

So we're looking at 30 ms vs 54 ms... still about 80% slower. I'm not sure how keen I am to mimic every nuance of what Git allows for color schemes at the expense of the technical debt for this code, and the slowness incurred.

I understand that speed is a main goal. 80% slower sounds huge when expressed in relative numbers. In absolute numbers, however, we're talking about a few milliseconds here. I would strongly prefer being able to configure colors just like in git over saving 24ms (which I won't notice, ever).

And concerning the technical debt: the patch in this PR actually removes technical debt. It does not try to mimic nuances of Git, it simply uses its features that are already there. Just look at the patchset, it's 415 lines deleted against 71 lines added.

So apart from the speed issue (which is a theoretical but not a real-world issue imo) I think there's no decent reason to not accept this change.

Is there a chance for this to get merged? Would really appreciate it!

carlfriedrich avatar Jul 18 '22 15:07 carlfriedrich

We just landed several other commits that address ul, bright, and dim. We cover 95% of the use cases for Git color codes. At risk of making the code much more complex I think I'm OK with that percent.

If there is something specific people want to see added please open another issue specific to that. Hex colors (I didn't even know Git supported that)?

At this time, shelling out to git to get the colors affects start up time, and that's a non-starter for me right now.

scottchiefbaker avatar Sep 12 '22 20:09 scottchiefbaker