Group By: Add quartile outputs

Open chourroutm opened this issue 2 years ago • 3 comments

What's your use case?

I would like to summarize large datasets with the five-number summary (minimum, Q1, median, Q3, maximum) plus the mean, but quartiles are not available in the Group By GUI.

What's your proposed solution?

Add quartile computation to the Group By widget and expose it in the GUI through one or two check boxes (a single "Quartiles" box, or separate "First quartile" and "Third quartile" boxes).

Are there any alternative solutions?

In the meantime, these can be computed with a custom Python script using pandas.
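A minimal sketch of that workaround, assuming the data is already in a pandas DataFrame (conversion from an Orange table is left out). The function name `grouped_summary` and the column names `cls` and `value` are made up for illustration. Quartiles use pandas' default linear interpolation, which may differ slightly from other quartile definitions.

```python
import pandas as pd


def grouped_summary(df, group_col, value_col):
    """Per-group five-number summary plus mean (hypothetical helper)."""
    grouped = df.groupby(group_col)[value_col]
    # Each aggregate is a Series indexed by group; combine them column-wise.
    return pd.DataFrame({
        "min": grouped.min(),
        "q1": grouped.quantile(0.25),
        "median": grouped.median(),
        "q3": grouped.quantile(0.75),
        "max": grouped.max(),
        "mean": grouped.mean(),
    })


# Illustrative data only: two classes with a single numeric feature.
df = pd.DataFrame({
    "cls": ["a"] * 5 + ["b"] * 4,
    "value": [1, 2, 3, 4, 5, 10, 20, 30, 40],
})
summary = grouped_summary(df, "cls", "value")
print(summary)
```

The result has one row per class and one column per statistic, which is the per-class view this issue asks for.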

Jul 18 '22 13:07 chourroutm

I think this could be added to either Box Plot or Feature Statistics (which already outputs Statistics) instead. Box Plot could indeed have a Statistics output, since the widget already computes all the necessary information.

Jul 18 '22 14:07 ajdapretnar

Indeed, it might be a good addition! Still, my use case needs per-class statistics, and I thought they would fit in this widget, since it already computes several distribution statistics.

Jul 21 '22 09:07 chourroutm

It turns out Box Plot computes these statistics on the fly, so that's not really a good option. Feature Statistics might work, but yeah, I guess adding it to Group By can't hurt.

Jul 21 '22 10:07 ajdapretnar

Decision: We'll do it. The first column will contain Mean, Median, Q1, Q3, Min, Max.

Jan 13 '23 09:01 janezd