Statistics.jl
median should take by/lt arguments similar to sort
I just found myself needing to take the median of some function of an array, and found that unfortunately the `median` function does not take a `by` keyword analogous to `sort`. This would be nice to have.
Might make sense to have this for `quantile` too.
Anything order-based, in fact. We should make sure that all order-related functions have this.
Couldn't generators replace this feature in a consistent fashion for all functions?
I thought we were just going to pass an ordering function? For type stability that can't be a keyword, though.
@nalimilan, generators can't be used for an in-place function like `median!`.
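To illustrate the distinction, here is a minimal sketch, assuming the `Statistics` standard library and an arbitrary example vector `x`. The generator form works for `median` because the values are collected into a temporary array, whereas `median!` operates on the vector it is handed, so there is nowhere for a transformation to go.

```julia
using Statistics

x = [-5, 1, -3, 2, 4]   # example data, chosen only for illustration

# Out-of-place: the generator applies abs lazily and median collects the
# transformed values before computing the result.
m = median(abs(v) for v in x)

# In-place: median! works directly on the vector it is given and may
# reorder it, so we pass a copy to keep x intact.
y = copy(x)
m_inplace = median!(y)
```

Note that `m` is the median of the transformed values, not an element of `x`.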
If `x` is a collection and `foo` is a function then we can sort with `sort!(x,by=foo)`. For simplicity let `n=length(x)` be odd. My candidate for the median is `x[n>>1 + 1]` (and not `foo(x[n>>1 + 1])`). I don't see a type stability issue with the former.
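The proposal above can be sketched roughly as follows. The name `median_by` is hypothetical and is not part of Statistics.jl; the sketch sorts a copy rather than mutating `x`, and only handles the odd-length case discussed here.

```julia
# Hypothetical helper, not part of Statistics.jl.
function median_by(foo, x::AbstractVector)
    n = length(x)
    isodd(n) || throw(ArgumentError("this sketch only handles odd lengths"))
    sorted = sort(x; by=foo)    # order the elements by foo(element)
    return sorted[n >> 1 + 1]   # the middle element itself, not foo of it
end

x = [-5, 1, -3, 2, 4]           # example data, chosen only for illustration
m = median_by(abs, x)           # an element of x, so the return type is eltype(x)
```

Because the result is always an element of `x`, the return type does not depend on what `foo` returns.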
I'm wondering if we should change the calculation of `middle` for non-`Number` types such that `median(x) in x` since it might be possible to sort things for which you cannot compute averages, e.g. `['a','b','c','d']`. This is even more important with a `by` keyword. Technically, both `'b'` and `'c'` are medians so we'd need to figure out which of them we prefer.
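The two candidates for an even-length input can be sketched as below. Both `lower_median` and `upper_median` are hypothetical names, not existing functions, and the sketch assumes a non-empty vector.

```julia
# Hypothetical helpers, not part of Statistics.jl. Both return an element of x,
# so they work for any sortable element type, including Char.
lower_median(x; by=identity) = sort(x; by=by)[(length(x) + 1) >> 1]
upper_median(x; by=identity) = sort(x; by=by)[length(x) >> 1 + 1]

chars = ['a', 'b', 'c', 'd']
lo = lower_median(chars)   # the lower of the two middle elements
hi = upper_median(chars)   # the upper of the two middle elements
```

For odd lengths the two indices coincide, so the choice only matters when the length is even.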
Updated to make sense.
I guess if you're returning the un-`foo`ed value (which, duh, is what we do), then you're right that type stability isn't crucial. (You'll feel the lack for small inputs, though.)
Regarding type stability problems with the keywords - is it better to propagate the `by` keyword or the `By` type (and friends) to other functions throughout `Base`? Will keyword argument type stability be fixed by 0.6? I kind of liked `By`... but I see it might just be a workaround.
@ALL @timholy @stevengj @StefanKarpinski @andreasnoack Is this issue still open? Can I work on this?
Please don't ping excessively.
Anyone can work on any issue.
Seems reasonable to me to ask if one is new to the project.
@christianbender You may find it easier to ask such questions on the project slack and even discuss as you work on a solution.
@KristofferC sorry for the annoyance.
@ViralBShah Thanks for the help.