TimeSeries.jl

Use Nullable Array package

Open milktrader opened this issue 9 years ago • 9 comments

NullableArrays.jl has just been registered, so I want to put it forward for possible implementation down the road. https://github.com/JuliaStats/NullableArrays.jl

milktrader, Sep 21 '15 17:09

Anyone working on this right now? I need left/right-joins and it would make sense to implement this first - I can do it this weekend if no one else has started on it.

GordStephen, Oct 16 '15 17:10

On second thought, this may have farther-reaching consequences than I thought, so I might just implement what I need with NaNs for now... I'll push my work to a new branch for reference, although I know the convention so far has been to avoid intentionally introducing NaNs, so we may not want to merge it until we can do it properly with Nullables.

GordStephen, Oct 16 '15 19:10
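For readers following along, here is a minimal sketch of the NaN-based left join mentioned above, written in Python for brevity. The `left_join` helper and the `(timestamp, value)` pair layout are assumptions made for illustration; they are not part of TimeSeries.jl or of the branch referred to in the comment.

```python
import math


def left_join(left, right):
    """Hypothetical helper: keep every timestamp in `left`.

    `left` and `right` are lists of (timestamp, value) pairs.
    Timestamps missing from `right` get NaN as a sentinel.
    """
    right_lookup = dict(right)  # timestamp -> value
    return [(ts, val, right_lookup.get(ts, math.nan)) for ts, val in left]


joined = left_join(
    [("2015-10-15", 1.0), ("2015-10-16", 2.0)],
    [("2015-10-16", 20.0), ("2015-10-17", 30.0)],
)
```

The first row of `joined` carries NaN in the right-hand column because `right` has no entry for that timestamp. A right join would be the same call with the arguments swapped. The NaN sentinel only works for floating-point values, which is one reason a nullable container is the longer-term goal in this thread.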

Yeah, I was hoping with Julia 0.4.0 and Nullable it would be simple and easy from there. Turns out there is a performance hit. I'm in favor of using NaN for now, and making this part of version 0.7.0 changes.

I think the way to introduce NaN sentinels without disrupting the API too much is to make padding a kwarg, with the default being no padding with sentinels, since that's been the behavior so far.

milktrader, Oct 19 '15 13:10

So, for example, lead/lag would gain padding=false keyword args? I like that.

GordStephen, Oct 21 '15 13:10
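To make the `padding=false` proposal concrete, here is a rough sketch in Python. The function signatures, the separate `timestamps`/`values` lists, and the choice of NaN as the pad value are all assumptions for illustration; this is not the TimeSeries.jl API, only the shape of the idea discussed above.

```python
import math


def lag(timestamps, values, n=1, padding=False):
    """Hypothetical lag: each timestamp gets the value from n steps earlier."""
    values = list(values)
    if len(timestamps) != len(values):
        raise ValueError("timestamps and values must have the same length")
    if not 0 <= n <= len(values):
        raise ValueError("n must be between 0 and the series length")
    shifted = values[: len(values) - n]
    if padding:
        # Keep every timestamp and fill the first n slots with NaN.
        return list(timestamps), [math.nan] * n + shifted
    # Default: drop the first n timestamps, as without padding.
    return list(timestamps[n:]), shifted


def lead(timestamps, values, n=1, padding=False):
    """Hypothetical lead: each timestamp gets the value from n steps later."""
    values = list(values)
    if len(timestamps) != len(values):
        raise ValueError("timestamps and values must have the same length")
    if not 0 <= n <= len(values):
        raise ValueError("n must be between 0 and the series length")
    shifted = values[n:]
    if padding:
        # Keep every timestamp and fill the last n slots with NaN.
        return list(timestamps), shifted + [math.nan] * n
    return list(timestamps[: len(timestamps) - n]), shifted
```

With the default `padding=False` the result is shorter than the input, so existing callers see no change; with `padding=True` the length is preserved and the gaps are marked with NaN.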

I worked on developing the NullableArrays package. I'm happy to help navigate performance issues, if I can. If there are requests for features, please do file an issue.

davidagold, Oct 21 '15 13:10

Thanks @davidagold. Can you update us on how the package development is coming along? We're about to make some other important changes, but the implementation of missing/unknown/consumed values is coming up next.

milktrader, Oct 22 '15 01:10

The package currently is more or less at feature parity with DataArrays, modulo the PooledDataArray interface/functionality of the latter. That functionality is to be refactored into a different package that will build support for factors on top of NullableArrays.

davidagold, Oct 23 '15 03:10

You should probably support both standard Array and NullableArray, as the latter will always have a performance hit. When people don't need missing values, they would use the former.

nalimilan, Dec 17 '15 09:12

Do you mean enhancing the existing TimeArray type's value element, or creating a new type?

milktrader, Dec 18 '15 14:12