StatsBase.jl
Add lightweight unicode plot to Histogram show for text displays?
Something like this
function Base.show(io::IO, ::MIME"text/plain", h::Histogram{<:Real,1})
    compact = get(io, :compact, false)
    edge = first(h.edges)
    # Only draw the bar line for range-based edges with a manageable number of bins.
    if edge isa AbstractRange && length(eachindex(edge)) <= 120
        if !compact
            show(io, h)
            println(io)
        end
        W = h.weights
        barsyms = [' ', '▁', '▂', '▃', '▄', '▅', '▆', '▇', '█']
        symidxs = eachindex(barsyms)
        norm_factor = length(symidxs) / maximum(W)
        # Map a weight to a bar symbol index. Check the scaled value for NaN so that
        # all-zero weights (norm_factor == Inf) give a blank bar instead of an error.
        function get_sym_idx(x)
            y = norm_factor * x
            isnan(y) ? first(symidxs) : clamp(first(symidxs) + floor(Int, y), first(symidxs), last(symidxs))
        end
        print(io, minimum(edge))
        print(io, h.closed == :left ? "[" : "]")
        print(io, String(barsyms[get_sym_idx.(W)]))
        print(io, h.closed == :right ? "]" : "[")
        print(io, maximum(edge))
    else
        show(io, h)
    end
end
would get us a neat, lightweight display for 1D-histograms:
julia> fit(Histogram, randn(10^4), nbins = 60)
Histogram{Int64, 1, Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}}
edges:
  -3.6:0.2:3.6
weights: [1, 2, 6, 10, 18, 33, 57, 103, 120, 178 … 190, 126, 93, 55, 27, 21, 22, 5, 5, 2]
closed: left
isdensity: false
-3.6[ ▁▁▁▂▄▄▅▆▇████▇▆▅▄▃▂▂▁▁ [3.6
julia> [fit(Histogram, randn(10^4), nbins = 60) for i in 1:2, j in 1:5]
2×5 Matrix{Histogram{Int64, 1, Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}}}:
-3.8[ ▁▁▂▃▄▅▇▇████▇▇▅▄▃▃▂▁ [4.4 … -4.0[ ▁▁▁▂▃▅▆▆▇█████▆▅▅▃▂▂▁ [4.0
-4.0[ ▁▁▂▃▄▅▆██████▆▆▄▄▃▁▁▁ [4.2 -3.8[ ▁▁▂▂▃▄▅▆▇███▇▇▆▅▄▃▂▂▁ [4.0
Instead of limiting the display to 120 bins as above, we could rebin "long" histograms.
For even fancier multi-line display we could also borrow the unicode-histogram code from BenchmarkTools.
If acceptable, I would do a PR.
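As a rough sketch of the rebinning idea mentioned above: one option would be to merge runs of adjacent bins until the weights fit the display width. The helper name `rebin_weights` and its `maxbins` argument are hypothetical and not part of StatsBase. The sketch assumes a plain 1-based weight vector holding counts (`isdensity == false`), since summing density values would need to account for bin widths.

```julia
# Hypothetical helper, not part of StatsBase: merge runs of adjacent bins
# so that at most `maxbins` weights remain for display.
# Assumes 1-based indexing and count-valued (non-density) weights.
function rebin_weights(W::AbstractVector{<:Real}, maxbins::Integer = 120)
    n = length(W)
    n <= maxbins && return collect(W)
    k = cld(n, maxbins)    # number of old bins merged into each new bin
    # The last group may cover fewer than k bins when k does not divide n.
    return [sum(@view W[i:min(i + k - 1, n)]) for i in 1:k:n]
end

W = collect(1:250)
R = rebin_weights(W, 120)   # groups of up to 3 bins each
@assert length(R) <= 120
@assert sum(R) == sum(W)    # total count is preserved
```

Because the last group can be narrower than the others, the final bar would represent a smaller range of the x axis. A real implementation would have to decide whether to pad, drop, or rescale that bar.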
I'm always skeptical of statistics libraries that also incorporate plotting.
This seems like it might be annoying to have as an output (I'm not a fan of the new @benchmark histograms). Given that UnicodePlots.jl is such a lightweight dependency, I'm not sure this is needed.
> UnicodePlots.jl is such a lightweight dependency
I would say this is debatable. :-) UnicodePlots.jl currently loads 50 dependencies (direct and indirect) with a load time of about 2 seconds.
StatsBase.jl, in comparison, loads 13 dependencies (direct and indirect) with a load time of about 0.2 seconds.
Well, even more than that, StatsBase.jl is a dependency of UnicodePlots.jl!
Fair enough. I still think the histograms are cluttered. Maybe we can add a summary method which prints the output? Maybe quantiles.
I guess it's a matter of preference - packages like Colors.jl do a graphical display as well (only in HTML, in that case) after all. I would enjoy seeing 1D-histograms "directly", but I can understand others might prefer text-only.
@nalimilan, what's your take on this?
CC @andreasnoack
I'm hesitant. This looks nice, but I wonder whether it's really useful given that there's no indication of the ticks on the x axis.
> I wonder whether it's really useful given that there's no indication of the ticks on the x axis
True, it wouldn't be a plot that gives detailed information. But it would, I think, provide more information at a glance than the current purely text-based summary, and could even be more compact.