PDEP-3: Small data visualization consolidation

Open attack68 opened this issue 3 years ago • 5 comments
This is the official PDEP setting out the design objectives that I have been unofficially working towards in my PRs.

Some items like reimplementing DataFrame.to_html and DataFrame.to_latex to use jinja2 suffer from ad-hoc and inconsistent PRs and PR reviews. It seems appropriate for a wider discussion and either adoption or rejection of these concepts.

attack68 avatar Aug 14 '22 13:08 attack68

Can you comment on why the methods which do not support metadata (to_string, to_csv, to_json) should be included here?

Sure, there is some overlap, but these miss a lot of the fundamental features of HTML, LaTeX and Excel.

jreback avatar Aug 14 '22 13:08 jreback

For to_csv and to_json is the idea here that the styler would be useful to just str-format particular columns? Or is there a larger vision you have in mind?

WillAyd avatar Aug 14 '22 17:08 WillAyd

Many of the features in Styler are relatively new. With improved documentation, I believe there is a reasonable chance that these kinds of requests will eventually come.

Styler.concat, which is not even released yet (it is due in 1.5.0), has the potential to get reasonable usage. It is a very easy way to combine data output, for the output methods that exist exclusively for Styler.

Styler formatting features are now quite extensive. More customisation is available compared to regular DataFrame formatting on output (which DataFrame.to_json does not even have implemented), e.g. LaTeX escaping.

I raised to_json as one of the first things I observed, #36680, and there is support from what appears to be a group of web designers (which was also my use case).

Console printing for Styler (to_string) is useful to developers not using Jupyter.

Finally, JSON is quite a flexible format. Currently there is no plan to extend metadata tagging to to_json, but there may be a way to leverage Styler to include it. I did not envisage leveraging Styler to write to_latex, but after it was requested we got it working and it now feels reasonably popular with those in academia.

If we have to take something off the list I'd rather it be to_csv.

attack68 avatar Aug 14 '22 20:08 attack68

my_styler = GenericTableStyler()

my_styler.precision = 0
my_styler.na_rep = 'MISSING'
my_styler.thousands = ','
my_styler.formatter['Decision Tree'] = '{:.2f}'
# ...

Do you know this is already available in a reasonably similar way with:

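import pandas as pd
from pandas.io.formats.style import Styler

# Global Styler formatting defaults, set through the pandas options API
pd.options.styler.format.precision = 0
pd.options.styler.format.thousands = ","
pd.options.styler.format.na_rep = "MISSING"

my_styler = Styler(df)  # df is an existing DataFrame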

If Styler were decoupled from pandas, what would be the fallback for methods like DataFrame.to_latex and DataFrame.to_html if a 'styler' wasn't provided? Would that mean the function has no way to generate HTML? Or would it still depend on the existing LaTeX and HTML formatters, which the PDEP discusses removing because they have their own issues and have not been maintained for a while?

attack68 avatar Aug 18 '22 12:08 attack68

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

github-actions[bot] avatar Sep 18 '22 00:09 github-actions[bot]

@attack68 what's the status here?

simonjayhawkins avatar Feb 22 '23 15:02 simonjayhawkins

Seems like the discussion and development here have stalled. Closing for now, but happy to reopen if we want to circle back on this PDEP.

mroeschke avatar Mar 24 '23 17:03 mroeschke