StatsBase.jl
Histogram intervals open at one end
using StatsBase
d = vcat(0, rand(10), 1)
fit(Histogram, d)
Histogram{Int64,1,Tuple{StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}}}
edges:
0.0:0.2:1.2
weights: [3, 4, 2, 1, 1, 1]
closed: left
isdensity: false
The 1 above ends up in a bin of its own. Here that means five bins covering the range [0, 1) and a sixth bin, [1.0, 1.2), that will only ever contain the single value 1, which seems unbalanced.
I think it would be better if the left and right edges of the whole histogram range were closed. Thoughts?
As a data point: both R and Matlab have closed bins on the ends by default.
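To make the proposal concrete, here is a minimal sketch of the binning rule I have in mind, written in plain Julia without StatsBase. The helper name `count_closed_ends` is hypothetical and not part of any package; it only illustrates left-closed bins whose last bin is also closed on the right.

```julia
# Hypothetical helper, not part of StatsBase: count values into left-closed bins,
# but treat the last bin as closed on both ends, so x == last(edges) falls in it.
function count_closed_ends(data, edges)
    nbins = length(edges) - 1
    counts = zeros(Int, nbins)
    for x in data
        # Skip values outside the overall range [first(edges), last(edges)].
        (x < first(edges) || x > last(edges)) && continue
        # searchsortedlast returns the index of the last edge that is <= x.
        i = searchsortedlast(edges, x)
        # A value equal to the final edge gives i == nbins + 1; clamp it into the last bin.
        counts[min(i, nbins)] += 1
    end
    return counts
end

edges = collect(0.0:0.2:1.0)          # five bins spanning [0, 1]
d = [0.0, 0.1, 0.35, 0.5, 0.99, 1.0]  # fixed sample that includes both endpoints
counts = count_closed_ends(d, edges)
println(counts)                       # [2, 1, 1, 0, 2]
```

With this rule the value 1.0 is counted in the last bin [0.8, 1.0] instead of forcing an extra bin [1.0, 1.2), so every bin covers an equally wide part of the data range.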