
Fix group_by issues

Open pinarda opened this issue 5 years ago • 3 comments

We want to group_by before we calculate metrics, not after. We could possibly add an option to do either, but see odds_positive and standardized_mean, where the values are very different depending on the order in which the two steps are applied.

pinarda avatar Oct 10 '20 21:10 pinarda

@pinarda it looks like right now we do the group_by before the metric for odds_positive and standardized_mean (why not odds_negative?), and everything else computes the metric first and then does the group_by. Is this request saying that you always want the group_by to happen first?

Also, this only affects time series plots, right?

allibco avatar Mar 18 '21 17:03 allibco

Ignore the odds_negative comment, it doesn't exist :)

allibco avatar Mar 18 '21 17:03 allibco

See the odds_positive and standardized_mean metrics: we group_by first because we need an intermediate metric that must be calculated per group (mean, std, etc.). Otherwise, on line 196 of plot.py, we end up grouping after the metric is computed. For most metrics this may return the same value as grouping before we calculate the metric (we should check that this is true).

pinarda avatar Mar 29 '21 17:03 pinarda
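
For illustration, here is a minimal sketch of why the order matters. It is not ldcpy code: the sample data, the helper names (`odds`, `double`, `group_values`), and the definition of odds as `p / (1 - p)` are all assumptions made for this example. It compares grouping first against computing the metric first, once for a nonlinear metric and once for a linear one.

```python
from collections import defaultdict
from statistics import mean


def odds(p):
    """Odds for a probability p (assumed definition: p / (1 - p))."""
    return p / (1.0 - p)


def double(p):
    """A linear metric, used as a stand-in for metrics that commute with averaging."""
    return 2.0 * p


def group_values(pairs):
    """Collect (label, value) pairs into {label: [values]}."""
    groups = defaultdict(list)
    for label, value in pairs:
        groups[label].append(value)
    return dict(groups)


def group_then_metric(pairs, metric):
    """Average within each group first, then apply the metric to the group mean."""
    return {g: metric(mean(v)) for g, v in group_values(pairs).items()}


def metric_then_group(pairs, metric):
    """Apply the metric to every sample first, then average within each group."""
    transformed = [(g, metric(v)) for g, v in pairs]
    return {g: mean(v) for g, v in group_values(transformed).items()}


# Hypothetical samples: (group label, fraction of positive values at one time step)
samples = [("jan", 0.5), ("jan", 0.9), ("feb", 0.2), ("feb", 0.4)]

# Nonlinear metric: the two orders disagree.
odds_grouped_first = group_then_metric(samples, odds)
odds_metric_first = metric_then_group(samples, odds)

# Linear metric: the two orders agree (up to floating-point rounding).
double_grouped_first = group_then_metric(samples, double)
double_metric_first = metric_then_group(samples, double)

for label in ("jan", "feb"):
    print(label, odds_grouped_first[label], odds_metric_first[label])
    print(label, double_grouped_first[label], double_metric_first[label])
```

Under these assumptions, only metrics that are linear in the averaged quantity are guaranteed to give the same result in either order, so each metric would need to be checked individually.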