plotly.py Sunburst and Treemap graph does not render with `branchvalues="total"`

Good evening everyone. While writing up a small article, I ran into a problem with the `branchvalues="total"` setting on both `px.sunburst` and `px.treemap`.

The bug: the plot renders as whitespace only.

Changing the `branchvalues` parameter back to `"remainder"` renders the graph as expected. But since each parent node already holds the sum of its children's views, the data is not in the format that `"remainder"` expects.

Can anyone reproduce these results? Any ideas what I can do to circumvent this issue?

Thanks in advance! Cheers

Minimum working sample

Treemap `branchvalues="total"`

[Screenshot: Treemap branchvalues="total"]

Treemap `branchvalues="remainder"`

[Screenshot: Treemap branchvalues="remainder"]

Sunburst `branchvalues="total"`

[Screenshot: Sunburst branchvalues="total"]

Sunburst `branchvalues="remainder"`

[Screenshot: Sunburst branchvalues="remainder"]

Intermediate Dataset

| path | score | views | parent |
|---|---|---|---|
| music/pop/jackson/billie_jean.mp3 | 0.8000 | 1000 | music/pop/jackson |
| music/pop/jackson/beat_it.mp3 | 0.9000 | 2000 | music/pop/jackson |
| music/pop/abba/dancing_queen.mp3 | 0.7000 | 1500 | music/pop/abba |
| music/pop/abba/voulez-vous/voulez-vous.mp3 | 0.7500 | 1500 | music/pop/abba/voulez-vous |
| music/pop/abba/voulez-vous/summer_night_city.mp3 | 0.8000 | 1500 | music/pop/abba/voulez-vous |
| music/pop/abba/waterloo.mp3 | 0.8000 | 1500 | music/pop/abba |
| music/pop/abba/chiquitita.mp3 | 0.7000 | 1500 | music/pop/abba |
| music/pop/abba/s.o.s.mp3 | 0.7000 | 1500 | music/pop/abba |
| music/rock/queen/bohemian_rhapsody.mp3 | 0.9000 | 3000 | music/rock/queen |
| music/pop/abba | 0.7000 | 6000 | music/pop |
| music/pop/abba/voulez-vous | 0.7750 | 3000 | music/pop/abba |
| music/pop/jackson | 0.8500 | 3000 | music/pop |
| music/rock/queen | 0.9000 | 3000 | music/rock |
| music/pop | 0.7750 | 9000 | music |
| music/rock | 0.9000 | 3000 | music |
| music | 0.8375 | 12000 | None |

Source code

# %%
import typing as t
import pandas as pd
import plotly.express as px

# %%
data = pd.DataFrame([
  { 'path': 'music/pop/jackson/billie_jean.mp3', 'score': 0.8, 'views': 1000 },
  { 'path': 'music/pop/jackson/beat_it.mp3', 'score': 0.9, 'views': 2000 },
  { 'path': 'music/pop/abba/dancing_queen.mp3', 'score': 0.7, 'views': 1500 },

  { 'path': 'music/pop/abba/voulez-vous/voulez-vous.mp3', 'score': 0.75, 'views': 1500 },
  { 'path': 'music/pop/abba/voulez-vous/summer_night_city.mp3', 'score': 0.8, 'views': 1500 },
  { 'path': 'music/pop/abba/waterloo.mp3', 'score': 0.8, 'views': 1500 },
  { 'path': 'music/pop/abba/chiquitita.mp3', 'score': 0.7, 'views': 1500 },
  { 'path': 'music/pop/abba/s.o.s.mp3', 'score': 0.7, 'views': 1500 },  
  { 'path': 'music/rock/queen/bohemian_rhapsody.mp3', 'score': 0.9, 'views': 3000 },
])

# %%

col_path: str = 'path'
col_parent: str = 'parent'
def path_parent_fn(path):
  path = path.split('/')
  path = '/'.join(path[:-1]) if len(path) > 0 else ''
  path = path.strip()
  return path if len(path) > 0 else None 

aggregation = { 'score': 'median', 'views': 'sum' }


# %%
PathT = t.TypeVar('PathT')
Axis = t.Union[int, str]

def create_hierarchy_data(
  data: pd.DataFrame,
  col_path: Axis,
  col_parent: Axis,
  path_parent_fn: t.Callable[[PathT], t.Union[PathT, None]],
  aggregation: t.Any
):
  data[col_parent] = data[col_path].apply(path_parent_fn)

  def parent_in_data_or_na():
    return data[col_parent].isin(data[col_path]) | data[col_parent].isna()
  
  while not parent_in_data_or_na().all(skipna=True):
    missing_parents = data[data[col_parent].isin(data[col_path]) == False]
    missing_parents = missing_parents.groupby(col_parent, as_index=False)
    missing_parents_keys = missing_parents.groups.keys()
    missing_parents = missing_parents.agg(aggregation)
    missing_parents[col_path] = missing_parents_keys
    missing_parents[col_parent] = missing_parents[col_path].apply(path_parent_fn)
    data = pd.concat([
      data,
      missing_parents
    ], ignore_index=True)
  data = data[data[col_path].isna() == False]
  return data


# %%
data = create_hierarchy_data(data, col_path, col_parent, path_parent_fn, aggregation)
data

# %%

data = create_hierarchy_data(data, col_path, col_parent, path_parent_fn, aggregation)
fig = px.treemap(data, names=col_path, parents=col_parent, values='views', color='score', color_continuous_midpoint=0.5, branchvalues='total')
fig

# %%
data = create_hierarchy_data(data, col_path, col_parent, path_parent_fn, aggregation)
fig = px.sunburst(data, names=col_path, parents=col_parent, values='views', color='score', color_continuous_midpoint=0.5, branchvalues='total')
fig

Aug 02 '23 22:08 ProphetLamb

The only way I found to render this data is to use branchvalues='remainder' and set the value to zero in all nodes except the leaves.

To apply this fix, the aggregation gets an extra column whose computed value is always zero:

aggregation = { 'score': 'median', 'views': 'sum', 'layout_value': lambda x: 0 }

The data['layout_value'] column is initialized from the current value column data['views'] and is then passed as values, instead of 'views', when rendering the figure:

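# get_data() is not shown in this thread; assumed to return the leaf-only DataFrame from the source code above
data = get_data()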
data['layout_value'] = data['views']
data = create_hierarchy_data(data, col_path, col_parent, path_parent_fn, aggregation)
fig = px.treemap(data, names=col_path, parents=col_parent, values='layout_value', color='score', color_continuous_midpoint=0.5, branchvalues='remainder')
fig

In this figure every generated node has a value of zero, but branchvalues='remainder' ensures that each generated parent node takes on the combined size of its child nodes.

[Screenshot: Zero remainder treemap]
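
To make the mechanism explicit, here is a minimal sketch of how sizes come out under 'remainder' semantics as described in the plotly documentation: a node's drawn size is its own value plus the sizes of its children. This is not plotly's implementation; the helper name `drawn_sizes` and the reduced dataset are made up for illustration.

```python
from collections import defaultdict


def drawn_sizes(values, parents):
    """Hypothetical helper: size of each node under 'remainder' semantics,
    i.e. its own value plus the sizes of all of its children."""
    children = defaultdict(list)
    for node, parent in parents.items():
        if parent is not None:
            children[parent].append(node)

    sizes = {}

    def size_of(node):
        # Memoised recursion: own value plus the sizes of the direct children.
        if node not in sizes:
            sizes[node] = values[node] + sum(size_of(c) for c in children[node])
        return sizes[node]

    for node in values:
        size_of(node)
    return sizes


# Reduced dataset: leaves keep their views, generated parents get zero.
parents = {
    "music": None,
    "music/pop": "music",
    "music/pop/jackson": "music/pop",
    "music/pop/jackson/billie_jean.mp3": "music/pop/jackson",
    "music/pop/jackson/beat_it.mp3": "music/pop/jackson",
    "music/rock": "music",
    "music/rock/queen": "music/rock",
    "music/rock/queen/bohemian_rhapsody.mp3": "music/rock/queen",
}
values = {node: 0 for node in parents}
values["music/pop/jackson/billie_jean.mp3"] = 1000
values["music/pop/jackson/beat_it.mp3"] = 2000
values["music/rock/queen/bohemian_rhapsody.mp3"] = 3000

sizes = drawn_sizes(values, parents)
print(sizes["music/pop/jackson"], sizes["music"])
```

Because every generated node contributes zero on its own, each parent ends up exactly as large as the leaves beneath it.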

Finally, adding a custom hover template that shows the real value instead of the workaround value yields a beautiful graph.

data = get_data()
data['layout_value'] = data['views']
data = create_hierarchy_data(data, col_path, col_parent, path_parent_fn, aggregation)
fig = px.treemap(data, names=col_path, parents=col_parent, values='layout_value', color='score', color_continuous_midpoint=0.5, hover_data=['views', 'score'])
fig.update_traces(hovertemplate='''
<b>%{label}</b><br>Views: %{customdata[0]}<br>Score: %{customdata[1]}
''')
fig

[Screenshot: Zero remainder treemap with custom hovertemplate]

Even though this workaround exists, it undermines plotly's "low-code" claim, so a fix would still be great!

Aug 03 '23 06:08 ProphetLamb

I also have this issue. But it seems to be dataset dependent.

My real dataset has 3 levels and highly variable values, from 0 to trillions. This doesn't render. However, if I remove the 3rd level, the first 2 levels render fine.

When I use a test dataset that also has 3 levels but less variability, with the exact same code, it works just fine with all 3 levels.
So the issue seems to be data dependent. I haven't yet managed to isolate which property of the data makes it fail.
Is there any way to get some plotly error logs to understand where the render fails?

Nov 30 '23 08:11 NicolasPA

Ok, so my issue is that the sums were not correctly adding up from the 3rd level in my real dataset.
You may want to check if it was the case for you too.
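
A minimal sketch of such a check, assuming the hierarchy is available as a list of `(path, parent, value)` tuples. The helper name `find_inconsistent_totals` is made up for illustration. It flags every node whose value is smaller than the sum of its direct children's values, which is the situation `branchvalues="total"` does not appear to tolerate. The rows are the intermediate dataset from the original post.

```python
from collections import defaultdict


def find_inconsistent_totals(rows):
    """Hypothetical helper. rows: list of (path, parent, value) tuples.
    Returns (path, own_value, children_sum) for every node whose value
    is smaller than the sum of its direct children's values."""
    own = {path: value for path, _parent, value in rows}
    children_sum = defaultdict(int)
    for _path, parent, value in rows:
        if parent is not None:
            children_sum[parent] += value
    return [
        (path, own[path], total)
        for path, total in sorted(children_sum.items())
        if path in own and own[path] < total
    ]


rows = [
    ("music/pop/jackson/billie_jean.mp3", "music/pop/jackson", 1000),
    ("music/pop/jackson/beat_it.mp3", "music/pop/jackson", 2000),
    ("music/pop/abba/dancing_queen.mp3", "music/pop/abba", 1500),
    ("music/pop/abba/voulez-vous/voulez-vous.mp3", "music/pop/abba/voulez-vous", 1500),
    ("music/pop/abba/voulez-vous/summer_night_city.mp3", "music/pop/abba/voulez-vous", 1500),
    ("music/pop/abba/waterloo.mp3", "music/pop/abba", 1500),
    ("music/pop/abba/chiquitita.mp3", "music/pop/abba", 1500),
    ("music/pop/abba/s.o.s.mp3", "music/pop/abba", 1500),
    ("music/rock/queen/bohemian_rhapsody.mp3", "music/rock/queen", 3000),
    ("music/pop/abba", "music/pop", 6000),
    ("music/pop/abba/voulez-vous", "music/pop/abba", 3000),
    ("music/pop/jackson", "music/pop", 3000),
    ("music/rock/queen", "music/rock", 3000),
    ("music/pop", "music", 9000),
    ("music/rock", "music", 3000),
    ("music", None, 12000),
]

problems = find_inconsistent_totals(rows)
for path, own_value, total in problems:
    print(f"{path}: value {own_value} < sum of children {total}")
```

On this dataset the check flags `music/pop/abba`: its value is 6000, while its four direct tracks plus the `voulez-vous` folder add up to 9000. That would be consistent with the blank render reported above.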

I wish Plotly produced an error for this kind of render-breaking mistake.
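
One way to avoid the mismatch in the first place is to derive every total directly from the leaves, so that a parent can never be smaller than its children. The following is only a sketch under that assumption. The helper name `totals_from_leaves` is made up, and the leaf values are the ones from the original post.

```python
def totals_from_leaves(leaf_views):
    """Hypothetical helper: add each leaf's views to the leaf itself and
    to every ancestor path, so each node holds the total of its subtree."""
    totals = {}
    for path, views in leaf_views.items():
        parts = path.split("/")
        for depth in range(1, len(parts) + 1):
            node = "/".join(parts[:depth])
            totals[node] = totals.get(node, 0) + views
    return totals


leaf_views = {
    "music/pop/jackson/billie_jean.mp3": 1000,
    "music/pop/jackson/beat_it.mp3": 2000,
    "music/pop/abba/dancing_queen.mp3": 1500,
    "music/pop/abba/voulez-vous/voulez-vous.mp3": 1500,
    "music/pop/abba/voulez-vous/summer_night_city.mp3": 1500,
    "music/pop/abba/waterloo.mp3": 1500,
    "music/pop/abba/chiquitita.mp3": 1500,
    "music/pop/abba/s.o.s.mp3": 1500,
    "music/rock/queen/bohemian_rhapsody.mp3": 3000,
}

totals = totals_from_leaves(leaf_views)
for node in ("music/pop/abba", "music/pop", "music"):
    print(node, totals[node])
```

Summing from the leaves counts nested folders too, so the result does not depend on the order in which parent rows are generated.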

Dec 01 '23 05:12 NicolasPA

Hi - we are tidying up stale issues and PRs in Plotly's public repositories so that we can focus on things that are still important to our community. Since this one has been sitting for a while, I'm going to close it; if it is still a concern, please add a comment letting us know what recent version of our software you've checked it with so that I can reopen it and add it to our backlog. If you'd like to submit a PR, we'd be happy to prioritize a review, and if it's a request for tech support, please post in our community forum. Thank you - @gvwilson

Jul 11 '24 22:07 gvwilson
