mir_eval Discussion: custom colormaps for chord plots

In prototyping some viz code recently, I got to thinking that our generic handling of categorical labels in the display module could be better adapted for chord plots using the labeled segment display. This example, pulled from a librosa issue discussion from a while back https://github.com/librosa/librosa/issues/1370#issuecomment-1189701677, shows how the current behavior looks:

This is basically fine in that distinct categories get distinct colors, but there are a few significant drawbacks:

There is no guaranteed consistency between plots that use different sets of labels. F#:maj is blue in the above, but it could be orange in a different track. This leads to some probably unnecessary context switching when viewing multiple annotations.
There is no musical logic to the color organization - it's essentially down to the order in which the labels appear in the track.
There is no notion of "color proximity". Hue and pitch class are both cyclic spaces (more or less), and this could be exploited to convey more information visually.

I started digging around the matplotlib options, and there really aren't any existing colormaps that would make much sense here. Seaborn actually provides some nice functionality here for hsl/husl colormap construction, but I don't think we want to add seaborn to the dependency stack here. Instead, I propose that we pre-generate a handful of custom colormaps for use in pitch-related plots.

Prototype colormaps

Since these are categorical colormaps, we really don't need to worry about things like equal luminosity or perceptual uniformity. What we do need to worry about is discriminability (under accessibility constraints, etc). For this reason, I'm starting with seaborn's hsl generator, using n=12 to get evenly spaced hues according to pitch class in chromatic order. I'm generating light, medium, and dark versions of each, which for now I imagine being used for other/major/minor qualities respectively. (I'm not married to that particular idea, but it seems like a reasonable enough starting point.)
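For reference, here is a minimal sketch of the palette generation described above. It uses the standard library's colorsys as a stand-in for seaborn's hsl generator, so the exact colors will not necessarily match the prototype. The helper name hls_ring and the lightness, saturation, and hue-offset values are assumptions for illustration, not the values used in the figures.

```python
import colorsys


def hls_ring(n=12, lightness=0.6, saturation=0.65, start=0.01):
    """Return n RGB tuples with evenly spaced hues at a fixed lightness/saturation.

    All default values here are illustrative assumptions.
    """
    return [
        colorsys.hls_to_rgb((start + k / n) % 1.0, lightness, saturation)
        for k in range(n)
    ]


# One ring per value level; index k is pitch class k in chromatic order (0 = C).
# The lightness levels are placeholders and would need tuning.
RINGS = {
    "light": hls_ring(lightness=0.8),
    "medium": hls_ring(lightness=0.6),
    "dark": hls_ring(lightness=0.4),
}
```

Each ring is a plain list of 12 RGB tuples, so it can be handed to anything in matplotlib that accepts a list of colors.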

We can plot these colormaps in both chromatic and circle-of-fifths order, resulting in the following (chatbot-generated/vibecoded, but LGTM):

We can also do the reverse, generating in CoF order instead of chromatic order:

After chromatic<->cof translation, these essentially act like the tab10-style categorical maps present in matplotlib, maximally dispersing similar hues, except that we actually have 12 to work with.
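The chromatic<->cof translation is just a re-indexing of the 12 colors. A minimal sketch, assuming the palette is a list indexed by pitch class (the function names are hypothetical):

```python
def fifths_order(n=12):
    """Pitch classes listed in circle-of-fifths order: C, G, D, A, ..."""
    return [(7 * k) % n for k in range(n)]


def recolor_by_fifths(chromatic_palette):
    """Re-index a chromatic-sweep palette so hue advances along the circle of fifths.

    The returned list is still indexed by pitch class: out[pc] is the color
    assigned to pitch class pc.
    """
    n = len(chromatic_palette)
    out = [None] * n
    for position, pc in enumerate(fifths_order(n)):
        out[pc] = chromatic_palette[position]
    return out
```

Because 7 * 7 = 49 is congruent to 1 mod 12, applying the re-indexing twice returns the original palette, so the same function translates in both directions.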

Intended use

Color is a bit limited for what chord annotations convey, so there is always going to be a loss of information here. At present, this information loss is arbitrary, but I think a reasonable case can be made that we can prioritize the following concepts by importance: root note (hue), major/minor (3rd) quality (value), everything else (dim/aug/sus, sixths, sevenths, extensions). This motivates my proposal above for using the center ring palette for major-like qualities (ie maj, dom7, maj6 and the like), the inner ring (dark) for minor-like qualities (min, min7, etc), and the outer ring (light) for everything else.
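To make the prioritization concrete, here is a rough sketch of how a chord label could be reduced to a (root pitch class, ring) pair. It does its own simplified parsing of root:quality(extensions)/bass labels rather than using mir_eval's chord parser, and the membership of the major-like and minor-like sets is an assumption that would need discussion.

```python
NATURALS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Assumed groupings of shorthand qualities, for illustration only.
MAJOR_LIKE = {"maj", "maj6", "maj7", "7", "9", "maj9"}
MINOR_LIKE = {"min", "min6", "min7", "min9", "minmaj7"}


def root_pitch_class(root):
    """Convert a root such as 'F#' or 'Bb' to a pitch class in 0..11."""
    pc = NATURALS[root[0]]
    for accidental in root[1:]:
        if accidental == "#":
            pc += 1
        elif accidental == "b":
            pc -= 1
        else:
            raise ValueError(f"unexpected accidental in root: {root!r}")
    return pc % 12


def color_key(label):
    """Map a chord label to (pitch_class, ring); N and X map to (None, 'gray')."""
    if label in ("N", "X"):
        return None, "gray"
    body = label.split("/")[0]  # the bass note is ignored
    root, sep, rest = body.partition(":")
    # A bare root such as "C" is treated as major; extensions in (...) are dropped.
    quality = rest.split("(")[0] if sep else "maj"
    if quality in MAJOR_LIKE:
        ring = "medium"
    elif quality in MINOR_LIKE:
        ring = "dark"
    else:
        ring = "light"
    return root_pitch_class(root), ring
```

With this reduction, color_key("F#:maj") and color_key("F#:7") both come out as (6, "medium"), which is exactly the kind of information loss being traded off here.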

The no-chord symbol (and, I guess, out-of-gamut X) would be represented as a neutral gray (center disc in the plots above).

In terms of which hsl sweep mode to use, I think either the chromatic or cof orderings can be justified, perhaps with different use cases. I rather like the cof sweep as it makes adjacent pitches (probably dissonant) look maximally distinct, but I'd like to hear thoughts from others on this. (To be clear, I think we can and should include both - this is really just a question of defaults.)

Example call signature

I envision the (default) user-facing code to look something like:

>>> mir_eval.display.chord(intervals, labels, major='medium', minor='dark', other='light', sweep='fifths')

So a user could opt to change the value shading for different qualities, or the domain of the hsl sweep, etc.

Notes

I did check these palettes with the WCAG accessibility checker under deuteranomaly and protanomaly, and they seem pretty discriminable to me. Probably the value fields could be optimized to minimize confusion between light/medium/dark rings, but overall I think it's a pretty solid start.

The one major drawback of the proposed idea is that distinct but similar chords would render as visually identical. So a region that alternates between "F#:maj" and "F#:7" would look like one solid region. Likewise, bass notes are essentially ignored, etc etc. Probably some of this could be massaged around by using fill patterns to provide more nuance than N/maj/min/other, but I do worry that the end result would end up looking like clown pants if not implemented carefully.

Jun 24 '25 02:06 bmcfee

Following up with a fleshed out prototype. Here's a comparison of I Got Rhythm, 30-60 seconds, out of the JAAH collection.

First, using the current implementation (matplotlib defaults, mir_eval.segment display):

Next, using a chromatic sweep for the palette as described above:

And finally using a circle-of-fifths sweep:

Probably the value levels for major and minor could still be tuned a bit, and we could throw in some pattern fill for 6 and 7 chords, but I think either of the new options is better than the matplotlib defaults.

Jun 24 '25 16:06 bmcfee

This is really cool! I would be interested to hear what a chordy person thinks of this too. The only major drawback seems to be

So a region that alternates between "F#:maj" and "F#:7" would look like one solid region.

I think this is not great. Some ways to deal with it (none are great ways, IMO):

Adding some kind of pattern overlay (e.g. hatches) to designate chord subtypes. I think within a given song the pattern would have to be consistent across roots (so if a C:7 got an upward-slanting hatch, then D:7 should get the same).
Additionally changing the color/shading for different subtypes. I'm forgetting how chord evaluation is done, is there ever a case where we say that two chord subtypes with one root have a different distance to a chord with a different root? If so I suppose that could be mapped to hue, somehow... probably not...

Jun 25 '25 01:06 craffel

Eh I more or less identify as chordy at this point 😆

Adding some kind of pattern overlay (e.g. hatches) to designate chord subtypes. I think within a given song the pattern would have to be consistent across roots (so if a C:7 got an upward-slanting hatch, then D:7 should get the same).

Yes, this is something I had in mind and was prototyping yesterday. As noted above, it's difficult to get something that doesn't end up looking like clown pants. But the idea is generally that root dictates hue, third (major/minor/absent) dictates value, and then 5th/6th/7th dictates hatch patterns (eg / for a natural 7, \ for a flat 7, o for dim, x for 6, and so on).
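As a sketch of the hatch idea (not the prototype code), the pattern can be looked up from the quality alone, which keeps it consistent across roots. The table below only covers the four cases named above; which shorthands count as "natural 7" or "flat 7" is my assumption here, and hatch_for is a hypothetical helper.

```python
# Quality shorthand -> hatch string. The assignment of shorthands to each
# pattern is an assumption; anything not listed gets no hatch.
HATCHES = {
    "maj7": "/",      # natural 7
    "minmaj7": "/",
    "7": "\\",        # flat 7
    "min7": "\\",
    "dim": "o",
    "maj6": "x",      # sixths
    "min6": "x",
}


def hatch_for(label, enabled=True):
    """Return a hatch string for a chord label, or None for no hatching."""
    if not enabled or label in ("N", "X"):
        return None
    body = label.split("/")[0]          # drop the bass note
    _, sep, rest = body.partition(":")
    quality = rest.split("(")[0] if sep else "maj"
    return HATCHES.get(quality)
```

The returned strings are standard matplotlib hatch patterns, so they can be passed as the hatch argument when drawing each segment's patch, and enabled=False gives the plain root+third coloring.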

That said, I think having just root+third for color coding already does quite a lot of good, and I could imagine wanting to disable hatching to get a cleaner viz.

Additionally changing the color/shading for different subtypes. I'm forgetting how chord evaluation is done, is there ever a case where we say that two chord subtypes with one root have a different distance to a chord with a different root?

Yes, though not with the most common metrics. The MIREX metric would be the exception, where the root doesn't specifically matter as long as there are enough matching pitch classes between the two chords. But in general, I think matching root to hue (either chromatically or by circle of fifths) is the most intuitive choice here.

I am leaning even more toward CoF as the default btw. Here's a revised example like the above (now using Giant Steps, because why not) with simplified figure aesthetics to highlight the differences.

The bottom example (CoF) clearly separates the large jumps (Eb→A purple→green or Bb→F pink→blue) from the smaller jumps which follow a continuous hue path. In the middle example, the chromatic sweep tends to jumble high-contrast colors together (which is maybe what you'd expect in Giant Steps, but I'd rather have high contrast more clearly identify with dissonance).

Jun 25 '25 13:06 bmcfee

Looks great (though I see there are no chord subtype clashes in this example) and I generally defer to you on this. Perhaps if the hatches look clowny then you could make it optional, for those special cases where it looks misleadingly uniform without any hatching.

Jun 25 '25 13:06 craffel

Right, the hatches aren't implemented yet - that was just an unrelated aside to document my thinking on the choice of defaults and utility of root+third viz.

Jun 25 '25 13:06 bmcfee

Here's an update of the above, 30 seconds now instead of 15, and showing the hatch patterns for 7ths:

and here's another track (Night in Tunisia), 0-60sec, to show some of the other patterns:

Probably the line thickness could be adapted a little here, but yeah... clown pants 😆

Jun 25 '25 17:06 bmcfee

Haha it is less clowny than I feared. I think making it optional (and default to not show) wouldn't be harmful.

Jun 25 '25 17:06 craffel

Yeah, I mean if we stick to 7ths, it has a nice fruit stripe vibe.

The patterns look pretty bad if used as an overlay on a spectrogram:

but they look okay with patterns disabled:

Jun 25 '25 18:06 bmcfee

Yeah, probably ok to leave that up to the user

Jun 25 '25 18:06 craffel

Hello both,

Paul Smith's RTW department like this. Also I appreciate the repertoire of jazz tunes you have chosen. Congratulations @bmcfee on getting all these examples ready for discussion.

I agree that CoF coloring should be the default. Good that you have a gray color for N (no chords).

I would think that it's basically impossible to meaningfully visualize the harmony of "Giant Steps" without some kind of diminution. Here is the way it is typically being analyzed in class:

(source: https://cdsguitarblog.wordpress.com/2017/04/19/visual-learning-and-work-in-progress-giant-steps/)

But doing this would require a dedicated symbolic MIR algorithm for identifying chord progressions and functional boundaries, etc. which I'm guessing is left to the user. My point is that there's a limit to how much we can make sense of an advanced lead sheet like "Giant Steps" by simply coloring regions.

In a Python notebook, do you think it would be possible to have the chord/key label appear when hovering over the region? Or, in a static figure, to have it (optionally) written on every region, if space allows?

Sincerely,

Vincent.

Jul 02 '25 07:07 lostanlen

Thanks @lostanlen ! I completely agree that there's a limit to what we can achieve with a static display like this. My aim here is primarily to strike a good balance on average between fidelity to the underlying annotation and simplicity of the display.

The color-coded lead sheet you posted is quite helpful. I don't think we want to go quite that far (grouping distinct roots together), but I think the cof palette ordering actually gets pretty close.

Regarding labels and hovering: the code already supports embedding a labeled text box inside each segment, but I've disabled that in these examples due to clutter.

A hover pop-up/tooltip is certainly possible to do, but I think it can be a separate feature that applies to all labeled interval displays.

Jul 02 '25 10:07 bmcfee

Quick follow-up: I was reminded of a bug I ran into while doing the recent matplotlib modernization that broke clipping of text box annotations on segment displays.

I'll try to work out a fix for that when making a PR for this feature.

Jul 02 '25 21:07 bmcfee

Looking into the clipping path stuff today - for the record, the original issue I encountered is documented here https://github.com/matplotlib/matplotlib/issues/28717 , and this is confirmed to be fixed by subsequent matplotlib releases (works on 3.10.3).

However, there is now a different but related issue. If we let matplotlib draw the full extent of the plot, it looks fine. This plot uses jams and librosa to truncate the annotation and audio to 30 seconds prior to display:

However, if we set the x limits on the plot after rendering the figure, the annotations fall off the edge of the axes:

I'm not yet sure if this is a bug on our side or mpl's side.

EDIT: stripped down to a minimal reproducible example and posted at https://github.com/matplotlib/matplotlib/issues/30276

Jul 08 '25 13:07 bmcfee

I expect the label clipping issue is not going to be resolved any time soon, so I'll just move ahead with this PR as is. (When mpl fixes it, we can update chord and segment at the same time.)

Final point to work out here: I'd like the new colormaps to be accessible and registered in matplotlib when mir_eval is imported. Anyone have good suggestions for names? We could do something like: pitch, pitch_light, pitch_dark (for chromatic) and fifths etc for circle-of-fifths. This is probably fine, but I also wouldn't mind something more whimsical if anyone has an idea.
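For concreteness, registration could look roughly like the sketch below, using matplotlib's colormap registry (matplotlib.colormaps, available in matplotlib 3.5 and later). The names are the placeholder ones suggested above, and the colors come from a colorsys-based stand-in with assumed lightness values rather than the actual prototype palettes.

```python
import colorsys

import matplotlib
from matplotlib.colors import ListedColormap


def hls_ring(n=12, lightness=0.6, saturation=0.65, start=0.01):
    """Evenly spaced hues at a fixed lightness; all values are illustrative."""
    return [
        colorsys.hls_to_rgb((start + k / n) % 1.0, lightness, saturation)
        for k in range(n)
    ]


# Placeholder names and assumed lightness levels for the chromatic family.
PROPOSED = {"pitch_light": 0.8, "pitch": 0.6, "pitch_dark": 0.4}

for name, lightness in PROPOSED.items():
    # Skip names that are already registered so a repeated import is harmless.
    if name not in matplotlib.colormaps:
        cmap = ListedColormap(hls_ring(lightness=lightness), name=name)
        matplotlib.colormaps.register(cmap)
```

Once registered, the maps can be looked up by name like any built-in, e.g. matplotlib.colormaps["pitch"]. A fifths family would be registered the same way after re-indexing the colors.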

Aug 04 '25 17:08 bmcfee