plotly.py
Fix for "Non-leaves rows are not permitted in the dataframe" with sunburst diagrams
It is sometimes useful to have non-leaf data in a Sunburst diagram. However, there is no way to tell Plotly Express to ignore or accept non-leaves.
Minimum viable example:
import pandas as pd
import plotly.express as px
lst = [['Alice', "Bob"], ['Alice', "Bob", "Carrie"], ["Alice", "Bob", "Chuck"]]
df = pd.DataFrame(lst)
fig = px.sunburst(df, path=df.columns)
Gives the error:
ValueError: ('Non-leaves rows are not permitted in the dataframe \n', 0    Alice
1      Bob
2
Name: 0, dtype: object, 'is not a leaf.')
This can be fixed by commenting out the final check in _check_dataframe_all_leaves in plotly/express/_core.py:
def _check_dataframe_all_leaves(df):
    df_sorted = df.sort_values(by=list(df.columns))
    null_mask = df_sorted.isnull()
    df_sorted = df_sorted.astype(str)
    null_indices = np.nonzero(null_mask.any(axis=1).values)[0]
    for null_row_index in null_indices:
        row = null_mask.iloc[null_row_index]
        i = np.nonzero(row.values)[0][0]
        if not row[i:].all():
            raise ValueError(
                "None entries cannot have not-None children",
                df_sorted.iloc[null_row_index],
            )
    df_sorted[null_mask] = ""
    row_strings = list(df_sorted.apply(lambda x: "".join(x), axis=1))
    # for i, row in enumerate(row_strings[:-1]):
    #     if row_strings[i + 1] in row and (i + 1) in null_indices:
    #         raise ValueError(
    #             "Non-leaves rows are not permitted in the dataframe \n",
    #             df_sorted.iloc[i + 1],
    #             "is not a leaf.",
    #         )
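A less invasive alternative to editing the installed package is to drop the non-leaf rows from the dataframe before plotting. The following is only a minimal sketch: drop_non_leaf_rows is a hypothetical helper, not part of Plotly, and it assumes that missing values only ever appear at the end of a row (which the first check in _check_dataframe_all_leaves requires anyway). Note that a dropped row no longer contributes to the counts of its ancestors.

```python
import pandas as pd


def drop_non_leaf_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop rows whose path is a proper prefix of another row's path.

    Assumes missing values only occur as trailing entries of a row.
    """
    # Turn each row into its path, ignoring the trailing missing values.
    paths = [
        tuple(v for v in row if pd.notna(v))
        for row in df.itertuples(index=False, name=None)
    ]
    # Collect every proper prefix of every path; these are the non-leaf paths.
    proper_prefixes = set()
    for path in paths:
        for k in range(1, len(path)):
            proper_prefixes.add(path[:k])
    # Keep only the rows whose path is not a prefix of some longer path.
    keep = [path not in proper_prefixes for path in paths]
    return df.loc[keep]


lst = [["Alice", "Bob"], ["Alice", "Bob", "Carrie"], ["Alice", "Bob", "Chuck"]]
df = pd.DataFrame(lst)
leaves = drop_non_leaf_rows(df)
print(leaves)
```

The filtered dataframe can then be passed to px.sunburst in the same way as in the example at the top, with path=leaves.columns.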
It would be great if px.sunburst could have an option to disable these checks, or to skip over any row which is not a leaf.
How can I propose this as an option?
Thanks!
For anyone else running into this issue, a workaround I used was to replace every None in the dataframe with "null" (or any other placeholder string) and then remove the corresponding placeholder nodes from the figure data before plotting:
import numpy as np
import plotly.express as px

# Replace missing values with a placeholder string so that every row becomes a leaf
df = df.fillna("null")
fig = px.icicle(
    df,
    path=df.columns,
)
figure_data = fig["data"][0]
# Keep only the nodes whose id does not contain the placeholder
mask = np.char.find(figure_data.ids.astype(str), "null") == -1
figure_data.ids = figure_data.ids[mask]
figure_data.values = figure_data.values[mask]
figure_data.labels = figure_data.labels[mask]
figure_data.parents = figure_data.parents[mask]
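One caveat with the substring test above is that it also removes any node whose id merely contains "null" (for example a label such as "nullable"). The sketch below matches whole path segments instead. It is only an illustration: drop_placeholder_nodes is a hypothetical helper, and it assumes that node ids are the path segments joined with "/" and that the labels themselves contain no "/". A SimpleNamespace stands in for the trace so that the logic can be run without building a figure.

```python
from types import SimpleNamespace

import numpy as np


def drop_placeholder_nodes(trace, placeholder="null", sep="/"):
    """Hypothetical helper: remove nodes that have the placeholder as a path segment."""
    ids = np.asarray(trace.ids).astype(str)
    # A node is kept only if none of its path segments equals the placeholder.
    keep = np.array(
        [placeholder not in node_id.split(sep) for node_id in ids], dtype=bool
    )
    # Apply the same mask to every per-node array that is set on the trace.
    for attr in ("ids", "labels", "parents", "values"):
        current = getattr(trace, attr, None)
        if current is not None:
            setattr(trace, attr, np.asarray(current)[keep])
    return trace


# Stand-in for fig["data"][0]; the ids are assumed to be "/"-joined paths.
trace = SimpleNamespace(
    ids=np.array(["Alice", "Alice/Bob", "Alice/Bob/null", "Alice/nullable"]),
    labels=np.array(["Alice", "Bob", "null", "nullable"]),
    parents=np.array(["", "Alice", "Alice/Bob", "Alice"]),
    values=np.array([4, 3, 1, 1]),
)
drop_placeholder_nodes(trace)
print(trace.ids)
```

With a real figure the call would be drop_placeholder_nodes(fig["data"][0]), assuming the trace exposes these attributes as arrays in the same way as in the workaround above.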
thanks so much!