pandas
pandas copied to clipboard
BUG: `Styler.to_excel` does not export styles and formats correctly
This is a tracker / explainer for the various issues:
- (#21221)
- (#25185)
- (#30008)
- (#34438)
Essentially these issues record 3 things:
Set_table_styles
Styler.set_table_styles
is not exported to excel. This will not be changed (at least by me). To write xlsx, excel styling needs to be attached on a per-cell basis, whereas in HTML indirect references can be created in the CSS language that the browser will parse, for example thead th
will apply to all header cells in the header section, and tbody th:nth-child(3n+0)
will apply to every third header cell in the body section starting with first. Without writing our own HTML to cell translator for the CSS language it is therefore impossible to map and account for all the complex CSS rules that can be used within set_table_styles
. This is well documented.
Todo:
- [x] We will instead propose changes that allow styling to header cells, to complement that already in place for body cells, (#41995)
Exporting formatting
The number of possible formatting constructs allowed by Python is greater than what excel offers. Excel also has specified structures that differ from Pythons structures. It is impossible to dynamically code these relationships in some ambiguous cases. Therefore this will not be implemented.
Todo:
- [ ] There is currently a pseudo CSS attribute:
number-format
which can be used to apply specific Excel based formatting. This should be much better documented with examples.
Border styles bug
Borders in CSS can be specified in many different ways, and the parsing code to translate this into excel's border structure is broken
Todo
- [x] Review the border CSS translation code and propose a solution, if only to document a single way of getting this to work. (#45312)
Hiding and Concatening
The Styler uses the base implementation of DataFrame.to_excel. It does not do any preliminary filtering and/or alteration of the ctx
object to format cells in the right place. It also doesnt react to hidden indexes and/or elements.
Todo
- [x] document some of the missing features in Styler.to_excel.
- [ ] review how this can be implemented or insert a series of small PRs gradually improving the consistency.
At one point I tried changing the backend to use existing CSS parsing libraries so that issues like border specification would be more generic, but those I considered required substantial modification, and I ran out of time to pursue this... What's out there might have changed a lot in four years.
Regarding border styles, the error in #30008 appears when a border color is defined but border style is not (or border style is none). The source of the error is in pd.io.excel._XlsxStyler
where the style mapping (("top", "style"), "top")
causes the prop dictionary to set props["top"] = {"color": "#000000"}
.
The root cause however is in io.formats.excel
under _border_style
. Whenever the function returns None
, it should instead return "none"
as a string. The reason for this is that "none"
is a valid border style. When it is set to None
, the style gets removed in remove_none
during build_xlstyle
, which leads to the TypeError.
In summary, I believe replacing each return None
with return "none"
in _border_style
will fix the error. Alternatively, set color
to None
if the style is none in build_border
.
do you know with this solution how to format your styles, e.g.:
-
border: 2px solid red;
-
border-bottom: 2px solid red
+border-top: 2px solid red;
etc.. -
border-bottom-color: red;
+border-bottom-style: solid;
+border-bottom-width: medium;
+ ...
also with xlsxwriter
and openpyxl
as different writers?
take
Any updates on this issue?
the updates have been provided where merged prs have been linked. any issues not currnetly being worked on do not have links or merged prs.
Just for anyone who still want to style their table index, here is my solution:
Note: This is a method which focuses on exporting your multi-index table as an excel with alternating colours for primal index, as long as it looks like what it is inside python, I dont care whether certain column in excel is really 'column' or 'index' inside python.
First, do the routine:
from io import BytesIO
from openpyxl import load_workbook
filename = r'<your local path>'
wb = load_workbook(filename, read_only=False, keep_vba=False)
writer = pd.ExcelWriter(filename, engine='openpyxl')
Divide your dataframe df into two parts - coloured half and blank half. I am using light blue for this example.
lightblue_primalindex=sorted(set([_[0] for _ in df.index]), key=[_[0] for _ in df.index].index)[::2]
lightblue_index=[df.index.get_loc(i) for i in lightblue_primalindex]
Here is the tricky part. Reset your dataframe as single index but preserve the structure by filling NaN to extra primal index column, this way they will not show in the styled dataframe nor in the exported excel later.
y=df.reset_index()
y.loc[y.duplicated('<primal index column name>'),'<primal index column name>']=np.nan
Then apply the colouring with style and we are golden.
lightblue_df=pd.concat([y[lightblue_index[i]] for i in range(len(lightblue_index))])
y.style.apply(lambda x: pd.Series(['background-color: lightblue'
if i in list(lightblue_df.index)
else '' for i in range(0, len(y))],index=y.index)).to_excel(writer, sheet_name='<your sheet name>', startrow=0 , startcol=0, index=False)
writer.save()
Then you will get one excel with table looking exactly the same as that in python.
References:
or just use styler.applymap_index
which is one of the PRs that addressed part of these issues
or just use
styler.applymap_index
which is one of the PRs that addressed part of these issues
didn't know we have that, tks!