StyleFrame icon indicating copy to clipboard operation
StyleFrame copied to clipboard

pandas defaults in to_excel() not default?

Open buhtz opened this issue 3 years ago • 8 comments

I run into a problem while refactoring the first lines of styleframe.StyleFrame.to_excel():

https://github.com/DeepSpace2/StyleFrame/blob/c70243dc2add192b81d5fd0581e7566cbe135efd/styleframe/style_frame.py#L385-L390

The comment indicates that you do set the pandas default values here. Please take a look at index. Pandas default for index is True but you set it to False.

Is there a reason for this?

How can we deal with it? Do we need it to False or can we set it to pandas real default?

The variable index is used only when you call pandas.DataFrame.to_excel() and some lines later https://github.com/DeepSpace2/StyleFrame/blob/c70243dc2add192b81d5fd0581e7566cbe135efd/styleframe/style_frame.py#L450

And you "override" (shadow) that variable with a local one in https://github.com/DeepSpace2/StyleFrame/blob/c70243dc2add192b81d5fd0581e7566cbe135efd/styleframe/style_frame.py#L454 and https://github.com/DeepSpace2/StyleFrame/blob/c70243dc2add192b81d5fd0581e7566cbe135efd/styleframe/style_frame.py#L501

So I see no strict reason why the value shouldn't be set back to pandas default. Of course it will break the default behavior for some users. But IMHO this price should be payed. To be more user friendly a DeprecationWarning could be implemented in the next release.

buhtz avatar Jul 03 '22 07:07 buhtz

Is there a reason for this?

I'd guess the answer is "because that's the default I needed when started to develop this module more than 6 years ago" :)

I'm a bit reluctant to change it now, as you said it will break essentially all existing code. I feel that a deprecation process is a bit of an overkill for a change of a boolean default value.

It is possible to use functools.partialmethod (assuming Python >= 3.4) to prepare a to_excel version that as index=True as a "default" value.

from functools import partialmethod

from styleframe import StyleFrame

StyleFrame.to_excel = partialmethod(StyleFrame.to_excel, index=True)

StyleFrame({'a': ['a']}).to_excel('test.xlsx').save()

image

DeepSpace2 avatar Jul 08 '22 20:07 DeepSpace2

I don't see how partialmethod could improve the situation.

The point is that you have to make clear to the user when something is different from the pandas defaults.

But I have another idea and would provide a PR for this if you agree with that way. The current signature looks like this

def to_excel(self, excel_writer: Union[str, pd.ExcelWriter, pathlib.Path] = 'output.xlsx',
                 sheet_name: str = 'Sheet1', allow_protection: bool = False, right_to_left: bool = False,
                 columns_to_hide: Union[None, str, list, tuple, set] = None, row_to_add_filters: Optional[int] = None,
                 columns_and_rows_to_freeze: Optional[str] = None, best_fit: Union[None, str, list, tuple, set] = None,
                 **kwargs) -> pd.ExcelWriter:

The user knows that they can "modify" the default's from pandas.DataFrame.to_excel() via your **kwargs. Just "remove" index from **kwargs and add it to the signature list with your default value.

def to_excel(self, excel_writer: Union[str, pd.ExcelWriter, pathlib.Path] = 'output.xlsx',
                 sheet_name: str = 'Sheet1', allow_protection: bool = False, right_to_left: bool = False,
                 columns_to_hide: Union[None, str, list, tuple, set] = None, row_to_add_filters: Optional[int] = None,
                 columns_and_rows_to_freeze: Optional[str] = None, best_fit: Union[None, str, list, tuple, set] = None,
                 index=True,
                 **kwargs) -> pd.ExcelWriter:

This IMHO doesn't change the behavior or break something else. It just documents explicit that you use a default value different from pandas. What do you think?

buhtz avatar Jul 10 '22 13:07 buhtz

I might be missing something here, but that will change the default value from False to True. Every code that currently calls to_excel() without specifically setting index will generate a different output than it used to.

DeepSpace2 avatar Jul 10 '22 18:07 DeepSpace2

I might be missing something here, but that will change the default value from False to True. Every code that currently calls to_excel() without specifically setting index will generate a different output than it used to.

You are right. It was my fault (a typo). Of course I meant index=False which is your StyleFrame-default . I don't want to change the value of index.

buhtz avatar Jul 11 '22 13:07 buhtz

Ok, so you are suggesting the following changes:

https://github.com/DeepSpace2/StyleFrame/blob/c8a11172d84d4af6eab7c75e7563a00926267765/styleframe/style_frame.py#L338-L342

to be changed to

def to_excel(self, excel_writer: Union[str, pd.ExcelWriter, pathlib.Path] = 'output.xlsx',
                 sheet_name: str = 'Sheet1', allow_protection: bool = False, right_to_left: bool = False,
                 columns_to_hide: Union[None, str, list, tuple, set] = None, row_to_add_filters: Optional[int] = None,
                 columns_and_rows_to_freeze: Optional[str] = None, best_fit: Union[None, str, list, tuple, set] = None,
                 index=False,
                 **kwargs) -> pd.ExcelWriter:

and deleting https://github.com/DeepSpace2/StyleFrame/blob/c8a11172d84d4af6eab7c75e7563a00926267765/styleframe/style_frame.py#L387

just for self-documentation purposes, even though the documentation (and the docstring) say

to_excel also accepts all arguments that pandas.DataFrame.to_excel accepts as kwargs

I'll think about it.

DeepSpace2 avatar Jul 12 '22 17:07 DeepSpace2

There is a bit more I have done. The story is I tried to refactor the first lines of to_excel().

Can can take a look here https://github.com/Codeberg-AsGithubAlternative-buhtz/StyleFrame/tree/refactor_to_excel

See a new method for StyleFrame https://github.com/Codeberg-AsGithubAlternative-buhtz/StyleFrame/blob/f40b1d8c23513a1704b063219743fd70c04807ab/styleframe/style_frame.py#L338-L348 and first (refactored) lines of to_excel() https://github.com/Codeberg-AsGithubAlternative-buhtz/StyleFrame/blob/f40b1d8c23513a1704b063219743fd70c04807ab/styleframe/style_frame.py#L396-L402

And a new unittest to make sure this are really the pandas defaults and to detect if pandas change its defaults. https://github.com/Codeberg-AsGithubAlternative-buhtz/StyleFrame/blob/f40b1d8c23513a1704b063219743fd70c04807ab/styleframe/tests/style_frame_tests.py#L680-L698

I would PR this.

buhtz avatar Jul 13 '22 05:07 buhtz

Maybe it is easier to talk about when I create the PR? What do you think?

buhtz avatar Jul 16 '22 08:07 buhtz

Interesting. Please do create the PR and we can continue the discussion there. Thanks!

DeepSpace2 avatar Jul 22 '22 23:07 DeepSpace2

This issue has been automatically marked as stale because it has not had activity in the last 60 days.

stale[bot] avatar Sep 21 '22 00:09 stale[bot]

Merged #139

DeepSpace2 avatar Sep 30 '22 12:09 DeepSpace2