pandas icon indicating copy to clipboard operation
pandas copied to clipboard

BUG: `Styler.to_excel` does not export styles and formats correctly

Open attack68 opened this issue 3 years ago • 17 comments

This is a tracker / explainer for the various issues:

  • (#21221)
  • (#25185)
  • (#30008)
  • (#34438)

Essentially these issues record 3 things:

Set_table_styles

Styler.set_table_styles is not exported to excel. This will not be changed (at least by me). To write xlsx, excel styling needs to be attached on a per-cell basis, whereas in HTML indirect references can be created in the CSS language that the browser will parse, for example thead th will apply to all header cells in the header section, and tbody th:nth-child(3n+0) will apply to every third header cell in the body section starting with first. Without writing our own HTML to cell translator for the CSS language it is therefore impossible to map and account for all the complex CSS rules that can be used within set_table_styles. This is well documented.

Todo:

  • [x] We will instead propose changes that allow styling to header cells, to complement that already in place for body cells, (#41995)

Exporting formatting

The number of possible formatting constructs allowed by Python is greater than what excel offers. Excel also has specified structures that differ from Pythons structures. It is impossible to dynamically code these relationships in some ambiguous cases. Therefore this will not be implemented.

Todo:

  • [ ] There is currently a pseudo CSS attribute: number-format which can be used to apply specific Excel based formatting. This should be much better documented with examples.

Border styles bug

Borders in CSS can be specified in many different ways, and the parsing code to translate this into excel's border structure is broken

Todo

  • [x] Review the border CSS translation code and propose a solution, if only to document a single way of getting this to work. (#45312)

Hiding and Concatening

The Styler uses the base implementation of DataFrame.to_excel. It does not do any preliminary filtering and/or alteration of the ctx object to format cells in the right place. It also doesnt react to hidden indexes and/or elements.

Todo

  • [x] document some of the missing features in Styler.to_excel.
  • [ ] review how this can be implemented or insert a series of small PRs gradually improving the consistency.

attack68 avatar Jun 28 '21 07:06 attack68

At one point I tried changing the backend to use existing CSS parsing libraries so that issues like border specification would be more generic, but those I considered required substantial modification, and I ran out of time to pursue this... What's out there might have changed a lot in four years.

jnothman avatar Sep 27 '21 04:09 jnothman

Regarding border styles, the error in #30008 appears when a border color is defined but border style is not (or border style is none). The source of the error is in pd.io.excel._XlsxStyler where the style mapping (("top", "style"), "top") causes the prop dictionary to set props["top"] = {"color": "#000000"}.

The root cause however is in io.formats.excel under _border_style. Whenever the function returns None, it should instead return "none" as a string. The reason for this is that "none" is a valid border style. When it is set to None, the style gets removed in remove_none during build_xlstyle, which leads to the TypeError.

In summary, I believe replacing each return None with return "none" in _border_style will fix the error. Alternatively, set color to None if the style is none in build_border.

tehunter avatar Jan 07 '22 19:01 tehunter

do you know with this solution how to format your styles, e.g.:

  • border: 2px solid red;
  • border-bottom: 2px solid red + border-top: 2px solid red; etc..
  • border-bottom-color: red; + border-bottom-style: solid; + border-bottom-width: medium; + ...

also with xlsxwriter and openpyxl as different writers?

attack68 avatar Jan 07 '22 21:01 attack68

take

tehunter avatar Jan 08 '22 20:01 tehunter

Any updates on this issue?

vovavili avatar Aug 09 '22 22:08 vovavili

the updates have been provided where merged prs have been linked. any issues not currnetly being worked on do not have links or merged prs.

attack68 avatar Aug 10 '22 04:08 attack68

Just for anyone who still want to style their table index, here is my solution:

Note: This is a method which focuses on exporting your multi-index table as an excel with alternating colours for primal index, as long as it looks like what it is inside python, I dont care whether certain column in excel is really 'column' or 'index' inside python.

First, do the routine:

from io import BytesIO
from openpyxl import load_workbook
filename = r'<your local path>'
wb = load_workbook(filename, read_only=False, keep_vba=False)
writer = pd.ExcelWriter(filename, engine='openpyxl')

Divide your dataframe df into two parts - coloured half and blank half. I am using light blue for this example.

lightblue_primalindex=sorted(set([_[0] for _ in df.index]), key=[_[0] for _ in df.index].index)[::2]
lightblue_index=[df.index.get_loc(i) for i in lightblue_primalindex]

Here is the tricky part. Reset your dataframe as single index but preserve the structure by filling NaN to extra primal index column, this way they will not show in the styled dataframe nor in the exported excel later.

y=df.reset_index()
y.loc[y.duplicated('<primal index column name>'),'<primal index column name>']=np.nan

Then apply the colouring with style and we are golden.

lightblue_df=pd.concat([y[lightblue_index[i]] for i in range(len(lightblue_index))])
y.style.apply(lambda x: pd.Series(['background-color: lightblue' 
                            if i in list(lightblue_df.index)
                            else '' for i in range(0, len(y))],index=y.index)).to_excel(writer, sheet_name='<your sheet name>', startrow=0 , startcol=0, index=False)
writer.save()

Then you will get one excel with table looking exactly the same as that in python.

111 222

References:

  1. Style single-index dataframe using style.applymap
  2. Style multi-index dataframe using Styler.set_table_styles

Freud16 avatar Dec 19 '22 16:12 Freud16

or just use styler.applymap_index which is one of the PRs that addressed part of these issues

attack68 avatar Dec 19 '22 16:12 attack68

or just use styler.applymap_index which is one of the PRs that addressed part of these issues

didn't know we have that, tks!

Freud16 avatar Dec 19 '22 16:12 Freud16