pydsstools

Why is `trim_missing=True` the default for the `read_ts` method?

Open danhamill opened this issue 3 years ago • 2 comments

Having been caught out by this in several projects, what is the rationale for having trim_missing=True as the default behavior of the read_ts method?

In my mind, zero is a valid measurement.

danhamill, May 10 '22 22:05

Hello,

Trimming won't (or at least shouldn't) remove values of 0, as they aren't considered missing in DSS files. Missing values are specifically indicated by -901 and -902 in DSS (1). I believe trimming also works off the quality flag, in which case a flagged value would be removed regardless of its value (our SQL database works that way, at least, but I don't think I've tested that in DSS). If you're having "0"s removed, it could be worth checking that no quality flags were erroneously added.
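As a rough illustration of the behavior described above, here is a minimal sketch of end-trimming on plain Python lists. It is not the pydsstools implementation: `is_missing` and `trim_missing_ends` are hypothetical names, the data is made up, and quality flags are not modeled at all. It only shows that, if the sentinels are the sole criterion, zeros at the ends survive and interior gaps are left alone.

```python
# Sentinel values that mark missing data in DSS, as stated in the comment above.
MISSING_SENTINELS = (-901.0, -902.0)


def is_missing(value):
    """Return True if value equals one of the missing-data sentinels."""
    return value in MISSING_SENTINELS


def trim_missing_ends(times, values):
    """Drop leading and trailing missing values; keep interior gaps.

    Hypothetical helper for illustration only, not the pydsstools code.
    """
    first = 0
    last = len(values)
    # Advance past missing values at the start of the series.
    while first < last and is_missing(values[first]):
        first += 1
    # Back up past missing values at the end of the series.
    while last > first and is_missing(values[last - 1]):
        last -= 1
    return times[first:last], values[first:last]


# Made-up series: sentinels at both ends, zeros next to them, one interior gap.
times = list(range(7))
values = [-901.0, 0.0, 0.0, 1.5, -901.0, 0.0, -902.0]

trimmed_times, trimmed_values = trim_missing_ends(times, values)
print(trimmed_times)   # [1, 2, 3, 4, 5]
print(trimmed_values)  # [0.0, 0.0, 1.5, -901.0, 0.0]
```

If zeros disappear from the ends of a real read, something other than this sentinel check (a quality flag, for instance) is presumably involved.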

The primary purpose is to reduce the amount of data transferred. If a lot of values are missing at either end of the window, and the consuming code already handles the date/time pairs correctly, trimming is convenient.

That said, I agree it should be False by default, so that everything in the window is always returned and you can tell which regular-interval data is actually present versus merely supposed to be there. An application that wants trimming should ask for it explicitly.

Mike

(1) Yes, that will cause problems storing differences that get that large.

MikeNeilson, May 13 '22 00:05

I will have to work out an example to show the unexpected behavior, but I am seeing zeros (not noData or -901.0) being dropped from the beginning or end of a time series with the default trim_missing=True argument.
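One way such an example could be put together, assuming the same window is read twice (once with trim_missing=False and once with the default) and the times and values are pulled out into plain lists, is to diff the two results and list any dropped points that are not missing-data sentinels. The helper name `dropped_valid_points` and the data below are made up for illustration; nothing here calls pydsstools.

```python
# Missing-data sentinels named earlier in the thread.
MISSING_SENTINELS = (-901.0, -902.0)


def dropped_valid_points(full_times, full_values, trimmed_times):
    """Return (time, value) pairs that are absent from the trimmed read
    even though their value is not a missing-data sentinel.

    Hypothetical diagnostic helper, not part of pydsstools.
    """
    kept = set(trimmed_times)
    return [
        (t, v)
        for t, v in zip(full_times, full_values)
        if t not in kept and v not in MISSING_SENTINELS
    ]


# Made-up data standing in for two reads of the same window.
full_times = ["01:00", "02:00", "03:00", "04:00", "05:00"]
full_values = [0.0, 0.0, 2.5, 3.0, -901.0]
trimmed_times = ["03:00", "04:00"]  # what a trimmed read might return

suspect = dropped_valid_points(full_times, full_values, trimmed_times)
print(suspect)  # [('01:00', 0.0), ('02:00', 0.0)]
```

A non-empty result would mean valid values, here two leading zeros, were trimmed along with the real missing data.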

If a user goes to the effort of providing an exact window to read_ts, I also think the default behavior should be to return everything contained in that window.

danhamill, May 13 '22 22:05