ShiftedArrays.jl icon indicating copy to clipboard operation
ShiftedArrays.jl copied to clipboard

Add `ShiftedArrays.diff` for differences between elements in a vector

Open pdeffebach opened this issue 4 years ago • 7 comments

Base.diff does not return a vector of the same size as the original. I find this very frustrating, as I express in #41.

This PR adds ShiftedArrays.diff as a function to get the difference between elements, but pre-pends missing at the first element.

I understand that ShiftedArrays.diff is more characters than x .- lag(x), but it's more declarative.

pdeffebach avatar Oct 05 '21 15:10 pdeffebach

I agree we really need this, but it's not specific to ShiftedArrays at all since it returns a plain Array. Maybe we could add an argument like default to diff in Base to enable this behavior? At least it's probably worth trying.

nalimilan avatar Oct 05 '21 17:10 nalimilan

Filed an issue here

pdeffebach avatar Oct 05 '21 19:10 pdeffebach

I also think it's a bit odd that this should live here. IMO, the idea behind ShiftArrays.circshift versus Base.circshift (or fftshift and co.) is to return a lazy object with the same content. It's the same logic underlying Iterators.map versus Base.map.

Returning an eager result whose value differs from Base.diff is a bit odd, it would be more logical to add options to diff. That being said, I agree that the functionality is useful, I'm just unsure it should be called ShiftedArrays.diff.

piever avatar Oct 05 '21 20:10 piever

Ah yeah so you mean it would make sense to have ShiftedArrays.diff return a lazy array instead, with arguments to choose whether the first value should be missing or not?

nalimilan avatar Oct 05 '21 20:10 nalimilan

If it's performant to have the object be lazy, and if it takes advantage of the same infrastructure as lag, then it should definitely live here, right?

@piever do you want to experiment with making this be lazy? Or should I modify the PR to make it return something which looks like what lag returns.

pdeffebach avatar Oct 05 '21 20:10 pdeffebach

Ah yeah so you mean it would make sense to have ShiftedArrays.diff return a lazy array instead, with arguments to choose whether the first value should be missing or not?

I think it'd make sense for it to return a lazy object, as for instance

diff(v, n = 1; default) = Broadcast.broadcasted(-, v, lag(v, n; default))

My main concern was that collect(ShiftedArrays.diff(v)) != diff(v), whereas for all the other functions shadowed here (circshift, fftshit, ifftshift) that holds. Maybe that's not so bad with the mandatory keyword argument. I imagine if diff(v, default=missing) were to be added to Base, it would have the same meaning as the definition above but eager.

piever avatar Oct 05 '21 21:10 piever

Okay, I see your point about consistency with Base. Sorry for missing it earlier. If possible can you show your support in the linked issue? Would be good to coalesce these two discussions into one.

pdeffebach avatar Oct 05 '21 23:10 pdeffebach