OnlineStats.jl
Pretty printing is unpretty inside DataFrame
using DataFrames, CSV, OnlineStats, Statistics
url = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv"
df = DataFrame(CSV.File(download(url)))
combine(df, [:sepal_length, :sepal_width] .=> (a -> fit!(Mean(), a)))
shows the following output in the Julia REPL:
1×2 DataFrame
 Row │ sepal_length_function              sepal_width_function
     │ Mean…                              Mean…
─────┼──────────────────────────────────────────────────────────────────────
   1 │ Mean\e[90m: \e[39mn=150\e[90m |\…  Mean\e[90m: \e[39mn=150\e[90m |\…
The same happens for other Monoids, such as Extrema.
It would be great if OnlineStats were readable inside a DataFrame (they are of little use there otherwise).
I think the solution is to change the show methods that rely on printstyled to use StyledStrings.jl instead.
Actually this looks to be an upstream issue: https://github.com/ronisbr/PrettyTables.jl/issues/244. Closing here.
Hi!
IMHO, there is a problem here with how the objects are printed. All the objects seem to be printed to stdout with colors whenever it supports them. However, this approach is wrong. For example, the Mean object produced the following result when passed to print:
Notice that the output is decorated. However, the definition of print is:
print([io::IO], xs...)
Write to io (or to the default output stream stdout if io is not given)
a canonical (un-decorated) text representation. The representation used
by print includes minimal formatting and tries to avoid Julia-specific
details.
Hence, the output must have no colors, line breaks, etc.
When we are defining a type and want to provide a custom method, we usually add two functions:
function show(io::IO, obj::MyType)
function show(io::IO, mime::MIME"text/plain", obj::MyType)
The first is the fallback used for print. Thus, we must provide an undecorated text representation. The second is used for the decorated version if the IO supports colors (which we must check).
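The two-method pattern described above can be sketched as follows. This is only an illustration, not how OnlineStats.jl is implemented: `MyMean` and its fields are hypothetical names invented for the example. The two-argument `show` stays undecorated, while the `text/plain` method uses `printstyled`, which emits ANSI codes only when the IO's `:color` property is true.

```julia
# Hypothetical type used only for illustration; it is not part of OnlineStats.jl.
struct MyMean
    value::Float64
    n::Int
end

# Two-argument show: the canonical, undecorated form that `print` and `string` fall back to.
Base.show(io::IO, m::MyMean) = print(io, "MyMean(value=", m.value, ", n=", m.n, ")")

# Three-argument show: the decorated form used by the REPL display.
# `printstyled` consults `get(io, :color, false)` and writes plain text when it is false.
function Base.show(io::IO, ::MIME"text/plain", m::MyMean)
    printstyled(io, "MyMean"; color = :green)
    print(io, ": n=", m.n, " | value=", m.value)
end

m = MyMean(5.8, 150)

# Even when the IO advertises color support, `print` stays free of escape sequences.
plain = sprint(print, m; context = :color => true)

# The text/plain form is decorated only if the IO supports color.
rich = sprint(show, MIME("text/plain"), m; context = :color => true)
nocolor = sprint(show, MIME("text/plain"), m)
```

With this split, a consumer that calls `print` on the object (as PrettyTables.jl does) would receive plain text, and the colored form would remain available for direct display in the REPL.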
PrettyTables.jl uses print to obtain the text representation of objects. Thus, this function is returning something that we would not expect given the definition of print. That's why you are seeing this behavior.
However, there is an easy way to circumvent this, although it is uncommon in DataFrames. If you wrap a cell in an AnsiTextCell, PrettyTables.jl will automatically render those ANSI escape sequences. Hence, we need to obtain the string representation of the objects and put them in an AnsiTextCell:
julia> using PrettyTables  # provides AnsiTextCell

julia> combine(
           df,
           [:sepal_length, :sepal_width] .=> (a -> begin
               AnsiTextCell(sprint(print, fit!(Mean(), a); context = :color => true))
           end)
       )
Notice that it will even work with line breaks:
However, in this case using a DataFrame is pointless because you no longer have the original objects, just strings. If you only want to display the information, using PrettyTables.jl directly may be better.
@ronisbr Thanks for the details! That helped connect some dots I was missing. Sorry for the noise.
There's no problem at all @joshday ! Let me know if I can help.
Off-topic: By the way, I did not know this amazing package! I will start to use it immediately :)