StatsBase.jl

Add lightweight unicode plot to Histogram show for text displays?

Open oschulz opened this issue 2 years ago • 7 comments

Something like this:

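function Base.show(io::IO, ::MIME"text/plain", h::Histogram{<:Real,1})
    compact = get(io, :compact, false)
    edge = first(h.edges)
    # Draw the bar line only for range-based edges with at most 120 entries.
    if edge isa AbstractRange && length(edge) <= 120
        if !compact
            # Print the regular multi-line summary first.
            show(io, h)
            println(io)
        end
        W = h.weights
        barsyms = [' ', '▁', '▂', '▃', '▄', '▅', '▆', '▇', '█']
        symidxs = eachindex(barsyms)
        # Scale weights so the largest one maps to the tallest bar.
        # Guard against empty or all-zero weights, which would otherwise
        # yield a NaN that floor(Int, ...) cannot convert.
        wmax = maximum(W; init = zero(eltype(W)))
        norm_factor = wmax > 0 ? length(symidxs) / wmax : 0.0
        function get_sym_idx(x)
            y = norm_factor * x
            isnan(y) && return first(symidxs)  # NaN weights are drawn as a blank
            return first(symidxs) + floor(Int, clamp(y, 0, length(symidxs) - 1))
        end
        # Bracket direction marks which side of the bins is closed.
        print(io, minimum(edge))
        print(io, h.closed == :left ? "[" : "]")
        print(io, String(barsyms[get_sym_idx.(W)]))
        print(io, h.closed == :right ? "]" : "[")
        print(io, maximum(edge))
    else
        show(io, h)
    end
end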

would get us a neat, lightweight display for 1D-histograms:

julia> fit(Histogram, randn(10^4), nbins = 60)
Histogram{Int64, 1, Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}}
edges:
  -3.6:0.2:3.6
weights: [1, 2, 6, 10, 18, 33, 57, 103, 120, 178  …  190, 126, 93, 55, 27, 21, 22, 5, 5, 2]
closed: left
isdensity: false
-3.6[       ▁▁▁▂▄▄▅▆▇████▇▆▅▄▃▂▂▁▁       [3.6

julia> [fit(Histogram, randn(10^4), nbins = 60) for i in 1:2, j in 1:5]
2×5 Matrix{Histogram{Int64, 1, Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}}}:
 -3.8[         ▁▁▂▃▄▅▇▇████▇▇▅▄▃▃▂▁            [4.4  …  -4.0[         ▁▁▁▂▃▅▆▆▇█████▆▅▅▃▂▂▁          [4.0
 -4.0[          ▁▁▂▃▄▅▆██████▆▆▄▄▃▁▁▁          [4.2        -3.8[        ▁▁▂▂▃▄▅▆▇███▇▇▆▅▄▃▂▂▁          [4.0

Instead of limiting the display to 120 bins as above, we could rebin "long" histograms.
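A minimal sketch of what such rebinning might look like, assuming it is enough to merge groups of adjacent bins by summing their weights. The helper name `rebin_weights` and its `maxbins` argument are hypothetical and not part of StatsBase; summing is only meaningful for count-like weights (`isdensity == false`).

```julia
# Hypothetical helper, not part of StatsBase: merge adjacent bins by summing
# their weights so that at most `maxbins` bars remain for display.
function rebin_weights(W::AbstractVector{<:Real}, maxbins::Integer)
    n = length(W)
    # Short histograms are returned unchanged (as a copy).
    n <= maxbins && return collect(W)
    # Number of original bins merged into each displayed bar.
    k = cld(n, maxbins)
    # The last group may contain fewer than k bins.
    return [sum(@view W[i:min(i + k - 1, n)]) for i in 1:k:n]
end
```

The result could then be passed to the symbol lookup in place of `h.weights`. Since the merged bars cover `k` original bins each, the first and last edge printed next to the brackets stay the same.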

For even fancier multi-line display we could also borrow the unicode-histogram code from BenchmarkTools.

If acceptable, I would do a PR.

oschulz, Apr 12 '22 18:04

I'm always skeptical of statistics libraries that also incorporate plotting.

This seems like it might be annoying to have as an output (I'm not a fan of the new @benchmark histograms). Given that UnicodePlots.jl is such a lightweight dependency, I'm not sure this is needed.

pdeffebach, Apr 20 '22 09:04

> UnicodePlots.jl is such a lightweight dependency

I would say this is debatable. :-) UnicodePlots.jl currently loads 50 dependencies (direct and indirect) with a load time of about 2 seconds.

StatsBase.jl, in comparison, loads 13 dependencies (direct and indirect) with a load time of about 0.2 seconds.

oschulz, Apr 21 '22 20:04

Well, even more than that, StatsBase.jl is a dependency of UnicodePlots.jl!

Fair enough. I still think the histograms are cluttered. Maybe we can add a summary method which prints the output? Maybe quantiles.

pdeffebach, Apr 22 '22 15:04

I guess it's a matter of preference - packages like Colors.jl do a graphical display as well (only in HTML, in that case) after all. I would enjoy seeing 1D-histograms "directly", but I can understand others might prefer text-only.

@nalimilan, what's your take on this?

oschulz, Apr 22 '22 19:04

CC @andreasnoack

oschulz, Apr 22 '22 19:04

I'm hesitant. This looks nice, but I wonder whether it's really useful given that there's no indication of the ticks on the x axis.

nalimilan, Apr 23 '22 20:04

> I wonder whether it's really useful given that there's no indication of the ticks on the x axis

True, it wouldn't be a plot that gives detailed information. But it would, I think, provide more information at a glance than the current purely text-based summary, and could even be more compact.

oschulz, Apr 23 '22 21:04