Getting legend for multilayer chart

Open ajasja opened this issue 6 years ago • 32 comments

Is it possible to make a legend for such a chart

import altair as alt
import numpy as np
import pandas as pd

x = np.arange(100)
data = pd.DataFrame({'x': x,
                     'sin(x)': np.sin(x / 5),
                     'data': np.sin(x / 5) + 0.3*np.random.rand(100)})

line = alt.Chart(data).mark_line(strokeWidth=6, color='orange').encode(
    x='x',
    y='sin(x)'
)

point = alt.Chart(data).mark_point(color='black').encode(
    x='x',
    y='data'
)
(line + point)

image

ajasja avatar Jun 28 '18 15:06 ajasja

I would like to get something like this image

ajasja avatar Jun 28 '18 15:06 ajasja

Legends are only created if the data within a layer is somehow grouped by a label. You can force this by adding columns of labels; for example:

x = np.arange(100)
data = pd.DataFrame({'x': x,
                     'sin(x)': np.sin(x / 5),
                     'data': np.sin(x / 5) + 0.3*np.random.rand(100),
                     'line_label': 100 * ['line'],
                     'points_label': 100 * ['points']})

line = alt.Chart(data).mark_line(strokeWidth=6, color='orange').encode(
    x='x',
    y='sin(x)',
    opacity='line_label'
)

point = alt.Chart(data).mark_point(color='black').encode(
    x='x',
    y='data',
    shape='points_label'
)
(line + point)

visualization 25

though this is admittedly a bit hacky. Also, as far as I know, Vega cannot display line marks within a legend as in your example above, though @kanitw or @domoritz may be able to correct me on that.

jakevdp avatar Jun 28 '18 16:06 jakevdp

I think we can use a custom SVG path as a symbol, but we haven't gotten around to making it the default for Vega-Lite lines yet.

domoritz avatar Jun 28 '18 16:06 domoritz

Thanks! This is indeed a bit hacky:) But I got even a bit closer.

x = np.arange(100)
data = pd.DataFrame({'x': x,
                     'sin(x)': np.sin(x / 5),
                     'data': np.sin(x / 5) + 0.3*np.random.rand(100),
                     'line_label': 100 * ['line'],
                     'points_label': 100 * ['points']})

line = alt.Chart(data).mark_line(strokeWidth=6, color='orange').encode(
    x='x',
    y='sin(x)',
    opacity=alt.Opacity('line_label', legend=alt.Legend(title=""))
)

point = alt.Chart(data).mark_point(color='black').encode(
    x='x',
    y='data',
    shape=alt.Shape('points_label', legend=alt.Legend(title=""))
)
(line + point)

PS: this is probably a different debate (e.g. https://github.com/altair-viz/altair/issues/947); I just found out I could do

shape=alt.Shape('points_label', title="" )

instead of

shape=alt.Shape('points_label', legend=alt.Legend(title="") )

Big kudos points! :+1: Is this documented somewhere, or is it more a matter of trying and seeing?

ajasja avatar Jun 28 '18 16:06 ajasja

The title keyword is documented in the API docs (e.g. alt.X) and used in a few examples, but I think it would be useful to have a section of the documentation dedicated to titles and labels.

Any volunteers? :smile:

jakevdp avatar Jun 28 '18 16:06 jakevdp

Looks great! How can you sort the labels in the legend?

pletka avatar Sep 18 '19 13:09 pletka

If you don't need to use different mark types for the layers, you can also use the fold transform documented at https://vega.github.io/vega-lite/docs/fold.html to convert your data to long/tidy form.

domoritz avatar Sep 18 '19 15:09 domoritz

Hi folks, I am new to altair and trying to plot weather/hydrograph data. I am able to plot the data, but I can't seem to specify the colors I want with a legend. My data and plot look like the following.

import numpy as np
import pandas as pd
import altair as alt
x = np.arange(100)
sin = np.sin(x / 5)
data = pd.DataFrame({'x': x,
                     'sin(x)': np.sin(x / 5),
                     'q_95': sin + 100*np.random.rand(100),
                     'q_75': sin + 75*np.random.rand(100),
                     'q_50': sin + 50*np.random.rand(100),
                     'q_25': sin + 25*np.random.rand(100),
                     'q_05': sin + 5*np.random.rand(100),
                    })
perc_90 = alt.Chart(data).mark_area(color='#4292c6', opacity = .5,).encode(
    x=alt.X('x',axis=alt.Axis(title='Day')),
    y=alt.Y('q_05',axis=alt.Axis(title='cfs')),
    y2 = 'q_95',
    #fill=alt.Color("p90", legend=alt.Legend(title=''))
    
).properties(
    width=800)


perc_50 = alt.Chart(data).mark_area(color='#08519c', opacity = .5).encode(
    x=alt.X('x',axis=alt.Axis(title='Day')),
    y=alt.Y('q_25',axis=alt.Axis(title='cfs')),
    y2 = 'q_75',
    #color=alt.Color("p50", legend=alt.Legend(title=''))
    
)

median = alt.Chart(data).mark_line(color = '#08306b').encode(
    x='x',
    y='q_50',
    #opacity=alt.Color("median", legend=alt.Legend(title=''))
)

perc_90 + perc_50 + median

Out of curiosity is there a reason why altair does not allow for custom legends? Thanks, I really love the work so far.

jetilton avatar Dec 30 '19 22:12 jetilton

@jetilton Vega-Lite supports custom legends (and so does Altair). You may need to modify the scale domain and range as in https://vega.github.io/vega-lite/examples/stacked_bar_weather.html. If you have a smaller example, I can give more feedback.

domoritz avatar Jan 21 '20 22:01 domoritz

@domoritz Would you mind taking a look at this small example? I am aiming to use a selector to toggle different layered time-series but can't figure out how to generate a proper legend. This example takes the stock price dataset and I added a dummy 'Price-Earnings' ratio to layer onto the plot, and then use another single axis plot to dashboard-toggle which stock to display. The legend I want to display should identify the 'Price' and 'PE' series instead of the symbols. I understand that the data probably has to be rearranged somehow, and it may not be practical/possible, in which case, is there a way to manually create/label a legend/textbox for this use case? Thanks in advance!

    # Testing to get a legend
    import altair as alt
    from vega_datasets import data
    stockdata = data.stocks()
    stockdata['pe'] = stockdata['price'] / 10

    selector = alt.selection_single(
        fields=['symbol'], 
        empty='all',
        init={'symbol': 'AAPL'}
    )

    legend = alt.Chart(stockdata).mark_square(size=150).encode(
        y=alt.Y(
            'symbol:N',
            axis=alt.Axis(domain=False, ticks=False, orient='right'), title=None
        ),
        color=alt.condition(selector, 'symbol:N', alt.value('gainsboro'), legend=None)
    ).add_selection(
        selector
    )

    price = alt.Chart(stockdata).mark_line(point=True).encode(
        x='date:T',
        y='price:Q',
        color='symbol:N',
        #size='pe:Q'
    )

    pe = alt.Chart(stockdata).mark_bar().encode(
        x='date:T',
        y='pe:Q',
        color='symbol:N'
    )

    legend | (price + pe).add_selection(
                            selector
                        ).transform_filter(
                            selector
                        )
image

footfalcon avatar Feb 27 '20 11:02 footfalcon

For one thing, it's now possible to make native legends interactive:

import altair as alt
from vega_datasets import data

stockdata = data.stocks()
stockdata['pe'] = stockdata['price'] / 10

selector = alt.selection_single(
    fields=['symbol'], 
    empty='all',
    init={'symbol': 'AAPL'},
    bind='legend'
)

price = alt.Chart(stockdata).mark_line(point=True).encode(
    x='date:T',
    y='price:Q',
    color='symbol:N',
    opacity=alt.condition(selector, alt.value(1), alt.value(0))
).add_selection(
    selector
)

pe = alt.Chart(stockdata).mark_bar().encode(
    x='date:T',
    y='pe:Q',
    color='symbol:N'
).transform_filter(
    selector
)

price + pe

visualization - 2020-02-27T054454 350

Beyond that, it's not clear to me how you want your legend to be different than what is shown. Both layers have a shared color encoding that is correctly reflected in the legend.

jakevdp avatar Feb 27 '20 13:02 jakevdp

Hi Jake - thanks for your reply. I am aware of native legend interactivity (which is great). I should probably have mentioned that I am new to Altair and exploring its possibilities. In this case, I am trying to see where I can take it as a mini-dashboard. The reason I want to try using the selector the way I have it is that, in my use-case:

  1. it is a long list of countries (which would get truncated as a native legend), and
  2. I want it to control several more separate plots (that would all be filtered by country selection), and
  3. I want the flexibility to control the layout.

Also, while your plot is effectively the same as mine, and the native legend does identify the stock correctly by color, it does not clearly show which series is the stock price and which is the stock PE. What I am hoping to do by creating the pseudo-legend is keep that stock identity, but also be able to display a legend which says the mark_line series is the price, and the mark_bar series is the PE.

This may not be possible, in which case, is it possible to create something like a text box to place manually on the chart? I will actually be using a different color for the line and bars (which will remain constant for each stock, e.g. price == red; PE == gray), so I could color code the labels in a text box to convey that information.

    stockdata = data.stocks()
    stockdata['pe'] = stockdata['price'] / 10

    selector = alt.selection_single(
        fields=['symbol'], 
        empty='all',
        init={'symbol': 'AAPL'}
    )

    legend = alt.Chart(stockdata).mark_square(size=150).encode(
        y=alt.Y(
            'symbol:N',
            axis=alt.Axis(domain=False, ticks=False, orient='right'), title=None
        ),
        color=alt.condition(selector, alt.value('firebrick'), alt.value('gainsboro'), legend=None)
    ).add_selection(
        selector
    )

    price = alt.Chart(stockdata).mark_line(point=True).encode(
        x='date:T',
        y='price:Q',
        color=alt.value('firebrick'),
        #size='pe:Q'
    )

    pe = alt.Chart(stockdata).mark_bar().encode(
        x='date:T',
        y='pe:Q',
        color=alt.value('gray')
    )

    legend | (price + pe).add_selection(
                            selector
                        ).transform_filter(
                            selector
                        )
image

Here's a very-work-in-progress snapshot of what I am trying to do...

image

footfalcon avatar Feb 27 '20 14:02 footfalcon

Unfortunately, I don't have time right now to look at anything but minimal examples that demonstrate a specific issue.

domoritz avatar Feb 27 '20 16:02 domoritz

@domoritz No problem, I will keep exploring. Am really impressed with Altair!

footfalcon avatar Feb 27 '20 16:02 footfalcon

You could do something like this:

import altair as alt
from vega_datasets import data

stockdata = data.stocks()
stockdata['pe'] = stockdata['price'] / 10

selector = alt.selection_single(
    fields=['symbol'], 
    empty='all',
    init={'symbol': 'AAPL'},
    bind='legend'
)

price = alt.Chart(stockdata).mark_line(point=True).encode(
    x='date:T',
    y='price:Q',
    color='symbol:N',
    opacity=alt.condition(selector, alt.value(1), alt.value(0))
).add_selection(
    selector
)

pe = alt.Chart(stockdata).transform_calculate(
    name='"PE Ratio"'  
).mark_bar().encode(
    x='date:T',
    y='pe:Q',
    color=alt.Color('name:N', scale=alt.Scale(scheme='greys'), legend=alt.Legend(title=None))
).transform_filter(
    selector
)

(price + pe).resolve_scale(color='independent')

visualization (63)

The grammar offers a lot of possibilities for customizing legends and scales, depending on exactly what you want to do.

jakevdp avatar Feb 27 '20 18:02 jakevdp

Thanks, I will give it a try...

footfalcon avatar Feb 28 '20 07:02 footfalcon

Hello, since it has been some time since this question was asked, I wanted to see if there are any updates: is there a way of doing this (adding a legend when there is only one group of data inside the graph) that doesn't involve adding a column to the data and adding an additional property to the graph?

Is it possible to make a legend for such a chart

import altair as alt
import numpy as np
import pandas as pd

x = np.arange(100)
data = pd.DataFrame({'x': x,
                     'sin(x)': np.sin(x / 5),
                     'data': np.sin(x / 5) + 0.3*np.random.rand(100)})

line = alt.Chart(data).mark_line(strokeWidth=6, color='orange').encode(
    x='x',
    y='sin(x)'
)

point = alt.Chart(data).mark_point(color='black').encode(
    x='x',
    y='data'
)
(line + point)

image

RobbyJS avatar Apr 11 '20 10:04 RobbyJS

No, there is still no way to add a legend without specifying an encoding that the legend will represent.

jakevdp avatar Apr 11 '20 12:04 jakevdp

Are there any plans to have legends not based on a label? While in many cases it is easy to add a column, it is not always practical, for example when you have simulation data involving many parameters and want to compare results from different simulations on the same plot.

essafik avatar Jun 19 '20 14:06 essafik

What do you mean by “legend not based on a label”? How do you imagine specifying what the legend will contain?

jakevdp avatar Jun 19 '20 15:06 jakevdp

Before switching to Altair, I was doing plots with Mathematica, where you can simply specify your legend within the plot by using the option PlotLegend ->{"line"}. But even with matplotlib you can specify the legend with the label option, as in: plot(x, y, label="line"). Is something like that planned for Altair, or even possible with Vega-Lite?

essafik avatar Jun 19 '20 19:06 essafik

Yes, in newer versions of Vega-Lite you can set encodings to a constant datum value, which will be used to populate the legend. Altair doesn't yet support this, though.

In Altair it would probably look something like this (Note that this does not work in the current release):

alt.Chart(data).mark_line().encode(
  x='x',
  y='y',
  color=alt.datum("My Line")
)

jakevdp avatar Jun 19 '20 19:06 jakevdp

When will this feature be available? To my understanding, it is crucial for a plotting library.

gustavz avatar Jul 08 '20 08:07 gustavz

when will this feature be available?

What specifically are you asking about?

jakevdp avatar Jul 08 '20 14:07 jakevdp

What specifically are you asking about?

I believe @gustavz was asking about the ability to do "color=alt.datum("My Line")". Either way, I'd like to know also!

Also, once that functionality is supported, how can the color (like "orange") be specified as well?

dsandber avatar Oct 23 '20 16:10 dsandber

You can currently specify a color like "orange" using color=alt.value("orange")

jakevdp avatar Oct 23 '20 16:10 jakevdp

@jakevdp yeah, the question is: once the functionality described by @gustavz is implemented, so that a legend item can be specified by doing "color=alt.datum("My Line")", how can the color also be specified, since the "color" was set to "My Line"?

dsandber avatar Oct 23 '20 16:10 dsandber

You can define the color encoding's scale in the normal way; i.e. scale=alt.Scale(domain=["My Line"], range=["orange"])

jakevdp avatar Oct 23 '20 17:10 jakevdp

I would like to add my solution to this issue, as I was struggling a lot to create a "custom legend" for my charts. My problem was that I had Chart(data).mark_line() and a transform_loess line created from that chart, and I wanted to show that one line contains the exact measured values and the other is smoothed. I used an approach from https://github.com/altair-viz/altair/issues/2430. My result is below:

r

NoName115 avatar Jun 02 '21 09:06 NoName115

In Altair it would probably look something like this (Note that this does not work in the current release):

@jakevdp - what still needs to happen for this to work in a released version of Altair?

cjw296 avatar Jun 15 '21 06:06 cjw296