OnlineStats.jl

Multiple statistics with multiple variables

Open Moelf opened this issue 2 years ago • 4 comments

I know fit!() works with an iterator, but what if I need to return multiple pairs of (value, weight) from the iterator because I want to make many weighted histograms in one pass of the data?

Moelf, Aug 27 '21 13:08

Can you clarify a bit? I don't think I follow.

joshday, Aug 27 '21 14:08

Taking this example from the docs:

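```julia
using OnlineStats

# `rows` is any row iterator whose rows have string fields `variety` and `sepal_length`
itr = (row.variety => parse(Float64, row.sepal_length) for row in rows)
o = GroupBy(String, Hist(4:0.25:8))
fit!(o, itr)
```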

What if:

  1. each observation from itr has a weight (the histogram fill weight), and
  2. each histogram has different binning (say "Setosa" uses 4:0.5:8 and "Virginica" uses 6:0.25:8)?

Moelf, Aug 27 '21 14:08

You may have to roll a few things on your own.

Also, I've been meaning to work on StatsBase-like weights for OnlineStats so maybe this will nudge me to do it.

joshday, Aug 28 '21 13:08
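
For readers wondering what "rolling it on your own" might look like, here is a minimal sketch in plain Julia. It does not use the OnlineStats API, and it is not how OnlineStats or FHist.jl implement histograms. The names `WeightedHist`, `addweight!`, and `fitall!` are hypothetical, and the sketch assumes each observation arrives as `group => (value, weight)`, that bins are left-closed, and that out-of-range values are silently dropped.

```julia
# Hypothetical weighted histogram: bin edges plus the sum of weights per bin.
struct WeightedHist{R<:AbstractRange}
    edges::R
    counts::Vector{Float64}
end
WeightedHist(edges::AbstractRange) = WeightedHist(edges, zeros(length(edges) - 1))

# Add weight `w` to the bin containing `x`.
# Bins are left-closed; a value equal to the last edge goes into the final bin.
function addweight!(h::WeightedHist, x::Real, w::Real = 1.0)
    e = h.edges
    (first(e) <= x <= last(e)) || return h   # drop out-of-range (and NaN) values
    i = min(searchsortedlast(e, x), length(h.counts))
    h.counts[i] += w
    return h
end

# Single pass over the data, filling one histogram per group.
# Groups without a registered histogram are skipped.
function fitall!(hists::AbstractDict, itr)
    for (group, (x, w)) in itr
        h = get(hists, group, nothing)
        h === nothing || addweight!(h, x, w)
    end
    return hists
end

# Each group gets its own binning.
hists = Dict(
    "Setosa"    => WeightedHist(4:0.5:8),
    "Virginica" => WeightedHist(6:0.25:8),
)

# Made-up observations, purely for illustration.
obs = [
    "Setosa"     => (5.1, 1.0),
    "Setosa"     => (4.9, 0.5),
    "Virginica"  => (6.3, 2.0),
    "Versicolor" => (5.9, 1.0),   # no histogram registered, so it is ignored
]

fitall!(hists, obs)
```

Because `fitall!` only needs an iterator of pairs, the same generator pattern as in the docs example should work once it yields `group => (value, weight)` instead of `group => value`.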

In case you want wheels: https://github.com/Moelf/FHist.jl/blob/0c3dfdf118600507fa6a38aa0208a855d1347fa3/src/hist1d.jl#L90

Moelf, Aug 28 '21 13:08