StatsBase.jl

Histogram intervals open at one end

Open · cmcaine opened this issue 5 years ago · 1 comment

julia> using StatsBase

julia> d = vcat(0, rand(10), 1);

julia> fit(Histogram, d)
Histogram{Int64,1,Tuple{StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}}}
edges:
  0.0:0.2:1.2
weights: [3, 4, 2, 1, 1, 1]
closed: left
isdensity: false

The 1 above ends up in a bin of its own. That means we get several bins covering the range [0, 1) (five in this run) plus one extra bin, [1.0, 1.2), that will only ever hold the single value 1, which seems unbalanced.

I think it would be better if the left and right edges of the whole histogram range were closed. Thoughts?
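
To make the proposal concrete, here is a minimal sketch of the binning rule I have in mind, written with plain Base Julia. `count_closed_ends` is a hypothetical helper for illustration only, not something StatsBase provides, and it assumes `edges` is sorted in increasing order.

```julia
# Hypothetical helper (not part of StatsBase): bins are left-closed, [a, b),
# except the last bin, which also includes the right-most edge, [a, b].
function count_closed_ends(data, edges)
    weights = zeros(Int, length(edges) - 1)
    for x in data
        if x == last(edges)
            # The right-most edge is counted in the final bin.
            weights[end] += 1
        elseif first(edges) <= x < last(edges)
            # Index of the last edge that is <= x, i.e. the bin containing x.
            weights[searchsortedlast(edges, x)] += 1
        end
        # Values outside [first(edges), last(edges)] are ignored.
    end
    return weights
end

d = [0.0, 0.1, 0.25, 0.5, 0.75, 0.99, 1.0]
count_closed_ends(d, 0.0:0.2:1.0)   # → [2, 1, 1, 1, 2]
```

With this rule the value 1.0 lands in the fifth bin alongside 0.99, so no extra bin past 1.0 is needed.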

cmcaine, Apr 20 '20 15:04

As a data point: both R and Matlab have closed bins on the ends by default.

joshday, Apr 29 '20 19:04