StyleFrame icon indicating copy to clipboard operation
StyleFrame copied to clipboard

'apply_style_by_indexes' fails if the dataframe has a column called "index"

Open GuidoBorrelli opened this issue 3 years ago • 8 comments

Hi, I am using the following versions:

  • Python: 3.8.10
  • StyleFrame: 4.0.0
  • Pandas: 1.3.1
  • Openpyxl: 3.0.7

The following code:

def _decorate_excel(df: pd.DataFrame, xl: str):
  excel_writer = StyleFrame.ExcelWriter(xl)
  sf = StyleFrame(df.reset_index())

  for color, col_name in product(TRAFFIC_LIGHTS_COLORS, TRAFFIC_LIGHTS_COLS):
      # Set the background color
      style = Styler(bg_color=color.lower(), font_color=utils.colors.black)
  
      sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                                cols_to_style=col_name,
                                styler_obj=style,
                                overwrite_default_style=False)

where color can be YELLOW, GREEN, etc. and col_name are some of the columns of my DataFrame transformed into a StyleFrame.

Though I get the following error:

  File "/home/mlpoc/projects/churn-v2/vchurn/main_predict.py", line 40, in _decorate_excel
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
  File "/home/mlpoc/miniconda3/envs/churn/lib/python3.8/site-packages/styleframe/style_frame.py", line 638, in apply_style_by_indexes
    self.iloc[self.index.get_loc(index), self.columns.get_loc(col)].style = style_to_apply
  File "/home/mlpoc/miniconda3/envs/churn/lib/python3.8/site-packages/pandas/core/generic.py", line 5478, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'get_loc'

GuidoBorrelli avatar Nov 16 '21 14:11 GuidoBorrelli

Hi. Can you please provide a better example?

If your dataframe has multi-index please be aware that styleframe does not currently support multi-indexes.

I've tried the below contrived example and it seems to be working:

from itertools import product
from styleframe import StyleFrame, Styler

TRAFFIC_LIGHT_COLORS = ('yellow', 'green', 'red')
TRAFFIC_LIGHT_COLS = ('yellow', 'green', 'red')

excel_writer = StyleFrame.ExcelWriter('output.xlsx')
sf = StyleFrame({'yellow': ['yellow', 2], 'green': [1, 'green'], 'red': [1, 'red']})

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)
    
sf.to_excel(excel_writer).save()

image

DeepSpace2 avatar Nov 16 '21 15:11 DeepSpace2

Hi. Can you please provide a better example?

If your dataframe has multi-index please be aware that styleframe does not currently support multi-indexes.

I've tried the below contrived example and it seems to be working:

from itertools import product
from styleframe import StyleFrame, Styler

TRAFFIC_LIGHT_COLORS = ('yellow', 'green', 'red')
TRAFFIC_LIGHT_COLS = ('yellow', 'green', 'red')

excel_writer = StyleFrame.ExcelWriter('output.xlsx')
sf = StyleFrame({'yellow': ['yellow', 2], 'green': [1, 'green'], 'red': [1, 'red']})

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)
    
sf.to_excel(excel_writer).save()

image

Hi

In the first message I added the signature of the method. I think the issue comes from the fact that I get the StyleFrame from a Pandas Dataframe and then the Styleframe library doesn't find something it would actually expect.

Could you run your snippet of code by using this startup?


TRAFFIC_LIGHT_COLORS = ('yellow', 'green', 'red')
TRAFFIC_LIGHT_COLS = ('status1', 'status2', 'status3')

excel_writer = StyleFrame.ExcelWriter('output.xlsx')
df = pd.DataFrame({"status1": ["yellow", "green", "yellow"], "status2": ["yellow", "red", "red"], "status3": ["green", "green", "green"]}
sf = StyleFrame(df)

GuidoBorrelli avatar Nov 16 '21 16:11 GuidoBorrelli

I still can't reproduce the issue.

Full code:

import pandas as pd

from itertools import product
from styleframe import StyleFrame, Styler

TRAFFIC_LIGHT_COLORS = ('yellow', 'green', 'red')
TRAFFIC_LIGHT_COLS = ('status1', 'status2', 'status3')

df = pd.DataFrame({"status1": ["yellow", "green", "yellow"],
                   "status2": ["yellow", "red", "red"],
                   "status3": ["green", "green", "green"]})

sf = StyleFrame(df)

excel_writer = StyleFrame.ExcelWriter('output.xlsx')

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)

sf.to_excel(excel_writer).save()

image

from styleframe import version

print(version.get_all_versions())
Python 3.8.12 (default, Oct 13 2021, 09:15:35)
[GCC 10.2.1 20210110]
pandas 1.3.1
openpyxl 3.0.7
StyleFrame 4.0.0

DeepSpace2 avatar Nov 16 '21 16:11 DeepSpace2

I could reproduce it by exporting just 2 rows of my dataset and plotting some characteristics of it.

import pandas as pd
from itertools import product
from styleframe import StyleFrame, Styler

df = pd.read_pickle("df_tst.pkl")
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 6 columns):

Column   Non-Null Count  Dtype 
 0   index    2 non-null      int64 
 1   STATUS1  2 non-null      object
 2   STATUS2  2 non-null      object
 3   STATUS3  2 non-null      object
 4   STATUS4  2 non-null      object
 5   STATUS5  2 non-null      object

dtypes: int64(1), object(5)
memory usage: 224.0+ bytes

df

index | STATUS1 | STATUS2 | STATUS3 | STATUS4 | STATUS5
| -- | -- | -- | -- | -- | --
0 | RED | GREEN | RED | RED | RED
1 | RED | RED | RED | RED | RED
TRAFFIC_LIGHT_COLORS = ('YELLOW', 'GREEN', 'RED')
TRAFFIC_LIGHT_COLS = ('STATUS1', 'STATUS2', 'STATUS3', 'STATUS4', 'STATUS5')
sf = StyleFrame(df)
excel_writer = StyleFrame.ExcelWriter('output.xlsx')

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)
AttributeError                            Traceback (most recent call last)
<ipython-input-4-be916ccb326b> in <module>
     12 for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
     13     styler = Styler(bg_color=color.lower())
---> 14     sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
     15                               cols_to_style=col_name,
     16                               styler_obj=styler,

~/miniconda3/envs/py38/lib/python3.8/site-packages/styleframe/style_frame.py in apply_style_by_indexes(self, indexes_to_style, styler_obj, cols_to_style, height, complement_style, complement_height, overwrite_default_style)
    636             index.style = style_to_apply
    637             for col in cols_to_style:
--> 638                 self.iloc[self.index.get_loc(index), self.columns.get_loc(col)].style = style_to_apply
    639 
    640         if height:

~/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5476         ):
   5477             return self[name]
-> 5478         return object.__getattribute__(self, name)
   5479 
   5480     def __setattr__(self, name: str, value) -> None:

AttributeError: 'Series' object has no attribute 'get_loc'
from styleframe import version
print(version.get_all_versions())
Python 3.8.10 (default, Jun  4 2021, 15:09:15) 
[GCC 7.5.0]
pandas 1.3.1
openpyxl 3.0.7
StyleFrame 4.0.0

Ps. how do you format so well your code and outputs? Are you pasting from a notebook?

GuidoBorrelli avatar Nov 17 '21 16:11 GuidoBorrelli

Can you please share the actual pkl file that reproduces this issue (or a smaller one that still reproduces the issue)?

I'm formatting code snippets with code blocks and the python language tag:

```python

code here

```

DeepSpace2 avatar Nov 17 '21 18:11 DeepSpace2

Thanks for your answer again. I created a CSV file out of the guilty pickle one and once reloaded as CSV it still reproduces error. Safer than sharing pickle files :)

Namely:


import pandas as pd
from itertools import product
from styleframe import StyleFrame, Styler

df_pkl = pd.read_pickle("df_tst.pkl")
df_pkl.info()

df_pkl.to_csv("df_test.csv", index=False)
df_csv = pd.read_csv("df_test.csv")

df = df_csv

df_test.csv

Then by running

TRAFFIC_LIGHT_COLORS = ('YELLOW', 'GREEN', 'RED')
TRAFFIC_LIGHT_COLS = ('STATUS1', 'STATUS2', 'STATUS3', 'STATUS4', 'STATUS5')
sf = StyleFrame(df)
excel_writer = StyleFrame.ExcelWriter('output.xlsx')

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)

I get the error mentioned in the above comments.

GuidoBorrelli avatar Nov 17 '21 18:11 GuidoBorrelli

Great, I've managed to reproduce the issue. The reason is that you have a column called 'index'. This is easily reproducible:

df = pd.DataFrame({'index': [1, 2]})
sf = StyleFrame(df)
sf.apply_style_by_indexes(sf.data_df.index, Styler())

AttributeError: 'Series' object has no attribute 'get_loc'

sf = StyleFrame({'index': [1, 2]}) produces the same error.

As a quick workaround I'd suggest that you rename the column before converting to styleframe:

df = pd.DataFrame({'index': [1, 2]})
sf = StyleFrame(df.rename(columns={'index': '_index'}))
sf.apply_style_by_indexes(sf.data_df.index, Styler())

# NO ERROR

I'll have to do some digging to permanently fix this.

DeepSpace2 avatar Nov 17 '21 20:11 DeepSpace2

Oh great, that's a quick workaround. Thank you.

Best, Guido

GuidoBorrelli avatar Nov 17 '21 20:11 GuidoBorrelli

Fixed in version 4.2

DeepSpace2 avatar Oct 06 '22 17:10 DeepSpace2