StyleFrame
StyleFrame copied to clipboard
pandas defaults in to_excel() not default?
I run into a problem while refactoring the first lines of styleframe.StyleFrame.to_excel():
https://github.com/DeepSpace2/StyleFrame/blob/c70243dc2add192b81d5fd0581e7566cbe135efd/styleframe/style_frame.py#L385-L390
The comment indicates that you do set the pandas default values here. Please take a look at index.
Pandas default for index is True but you set it to False.
Is there a reason for this?
How can we deal with it? Do we need it to False or can we set it to pandas real default?
The variable index is used only when you call pandas.DataFrame.to_excel() and some lines later
https://github.com/DeepSpace2/StyleFrame/blob/c70243dc2add192b81d5fd0581e7566cbe135efd/styleframe/style_frame.py#L450
And you "override" (shadow) that variable with a local one in https://github.com/DeepSpace2/StyleFrame/blob/c70243dc2add192b81d5fd0581e7566cbe135efd/styleframe/style_frame.py#L454 and https://github.com/DeepSpace2/StyleFrame/blob/c70243dc2add192b81d5fd0581e7566cbe135efd/styleframe/style_frame.py#L501
So I see no strict reason why the value shouldn't be set back to pandas default. Of course it will break the default behavior for some users. But IMHO this price should be payed. To be more user friendly a DeprecationWarning could be implemented in the next release.
Is there a reason for this?
I'd guess the answer is "because that's the default I needed when started to develop this module more than 6 years ago" :)
I'm a bit reluctant to change it now, as you said it will break essentially all existing code. I feel that a deprecation process is a bit of an overkill for a change of a boolean default value.
It is possible to use functools.partialmethod (assuming Python >= 3.4) to prepare a to_excel version that as index=True as a "default" value.
from functools import partialmethod
from styleframe import StyleFrame
StyleFrame.to_excel = partialmethod(StyleFrame.to_excel, index=True)
StyleFrame({'a': ['a']}).to_excel('test.xlsx').save()

I don't see how partialmethod could improve the situation.
The point is that you have to make clear to the user when something is different from the pandas defaults.
But I have another idea and would provide a PR for this if you agree with that way. The current signature looks like this
def to_excel(self, excel_writer: Union[str, pd.ExcelWriter, pathlib.Path] = 'output.xlsx',
sheet_name: str = 'Sheet1', allow_protection: bool = False, right_to_left: bool = False,
columns_to_hide: Union[None, str, list, tuple, set] = None, row_to_add_filters: Optional[int] = None,
columns_and_rows_to_freeze: Optional[str] = None, best_fit: Union[None, str, list, tuple, set] = None,
**kwargs) -> pd.ExcelWriter:
The user knows that they can "modify" the default's from pandas.DataFrame.to_excel() via your **kwargs.
Just "remove" index from **kwargs and add it to the signature list with your default value.
def to_excel(self, excel_writer: Union[str, pd.ExcelWriter, pathlib.Path] = 'output.xlsx',
sheet_name: str = 'Sheet1', allow_protection: bool = False, right_to_left: bool = False,
columns_to_hide: Union[None, str, list, tuple, set] = None, row_to_add_filters: Optional[int] = None,
columns_and_rows_to_freeze: Optional[str] = None, best_fit: Union[None, str, list, tuple, set] = None,
index=True,
**kwargs) -> pd.ExcelWriter:
This IMHO doesn't change the behavior or break something else. It just documents explicit that you use a default value different from pandas. What do you think?
I might be missing something here, but that will change the default value from False to True. Every code that currently calls to_excel() without specifically setting index will generate a different output than it used to.
I might be missing something here, but that will change the default value from
FalsetoTrue. Every code that currently callsto_excel()without specifically settingindexwill generate a different output than it used to.
You are right. It was my fault (a typo). Of course I meant index=False which is your StyleFrame-default . I don't want to change the value of index.
Ok, so you are suggesting the following changes:
https://github.com/DeepSpace2/StyleFrame/blob/c8a11172d84d4af6eab7c75e7563a00926267765/styleframe/style_frame.py#L338-L342
to be changed to
def to_excel(self, excel_writer: Union[str, pd.ExcelWriter, pathlib.Path] = 'output.xlsx',
sheet_name: str = 'Sheet1', allow_protection: bool = False, right_to_left: bool = False,
columns_to_hide: Union[None, str, list, tuple, set] = None, row_to_add_filters: Optional[int] = None,
columns_and_rows_to_freeze: Optional[str] = None, best_fit: Union[None, str, list, tuple, set] = None,
index=False,
**kwargs) -> pd.ExcelWriter:
and deleting https://github.com/DeepSpace2/StyleFrame/blob/c8a11172d84d4af6eab7c75e7563a00926267765/styleframe/style_frame.py#L387
just for self-documentation purposes, even though the documentation (and the docstring) say
to_excelalso accepts all arguments thatpandas.DataFrame.to_excelaccepts as kwargs
I'll think about it.
There is a bit more I have done. The story is I tried to refactor the first lines of to_excel().
Can can take a look here https://github.com/Codeberg-AsGithubAlternative-buhtz/StyleFrame/tree/refactor_to_excel
See a new method for StyleFrame
https://github.com/Codeberg-AsGithubAlternative-buhtz/StyleFrame/blob/f40b1d8c23513a1704b063219743fd70c04807ab/styleframe/style_frame.py#L338-L348
and first (refactored) lines of to_excel()
https://github.com/Codeberg-AsGithubAlternative-buhtz/StyleFrame/blob/f40b1d8c23513a1704b063219743fd70c04807ab/styleframe/style_frame.py#L396-L402
And a new unittest to make sure this are really the pandas defaults and to detect if pandas change its defaults. https://github.com/Codeberg-AsGithubAlternative-buhtz/StyleFrame/blob/f40b1d8c23513a1704b063219743fd70c04807ab/styleframe/tests/style_frame_tests.py#L680-L698
I would PR this.
Maybe it is easier to talk about when I create the PR? What do you think?
Interesting. Please do create the PR and we can continue the discussion there. Thanks!
This issue has been automatically marked as stale because it has not had activity in the last 60 days.
Merged #139