gluonts
Best way to handle non-periodic data?
Description
The base dataset class, PandasDataset(dataframes=code2df.values(), target="lastPrice", freq="3s", unchecked=True), needs the "freq" parameter; if it is not given, the frequency is inferred automatically, and AddTimeFeatures is based on it.
I'm dealing with this:
timestamp: [
2010-10-11 09:15:00,
2010-10-11 09:15:03,
2010-10-11 09:15:06,
...,
2010-10-11 09:25:00,
2010-10-11 09:30:00,
2010-10-11 09:30:03,
...
]
Most of the data is at 3-second intervals, but in some places there are 5-minute gaps, such as 09:25 to 09:30.
Is there an official way to handle this situation, or do I need to define a set of methods myself?
@NeoWang9999 was the data sampled at 3-second frequency, and some time windows happen to be missing? If that's the case, you probably want to just resample your DataFrame to have 3s frequency, and fill in missing observations with nan.
For example using https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.asfreq.html
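A minimal sketch of that suggestion, assuming a DataFrame indexed by timestamp with a "lastPrice" column. The sample values below are made up for illustration and are not from the original data:

```python
import pandas as pd

# Hypothetical sample: 3-second ticks with two observations missing
# (09:15:09 and 09:15:12 are absent).
idx = pd.to_datetime([
    "2010-10-11 09:15:00",
    "2010-10-11 09:15:03",
    "2010-10-11 09:15:06",
    "2010-10-11 09:15:15",
    "2010-10-11 09:15:18",
])
df = pd.DataFrame({"lastPrice": [10.0, 10.1, 10.2, 10.3, 10.4]}, index=idx)

# Reindex onto a regular 3-second grid; timestamps with no
# observation get NaN in every column.
regular = df.asfreq("3s")

print(regular)
```

The resulting frame has one row per 3-second step, so it should be usable with freq="3s" in PandasDataset. Whether NaN targets are acceptable depends on the model being trained.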
Thanks for the reply!
The data is not really missing; there are simply no values at those points.
I think this issue is very similar to mine: https://github.com/awslabs/gluonts/issues/227
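When the gaps are real pauses rather than lost samples, one possible workaround (a sketch only, not something confirmed in this thread) is to split the series wherever consecutive timestamps are further apart than the nominal step, so that each segment is regular on its own. The helper name split_at_gaps and the sample values are hypothetical:

```python
import pandas as pd

def split_at_gaps(df, step="3s"):
    """Split a time-indexed DataFrame into contiguous segments.

    A new segment starts wherever the distance to the previous
    timestamp is larger than `step`.
    """
    # True at every row that follows a gap; the first row compares
    # against NaT, which evaluates to False.
    is_gap = df.index.to_series().diff() > pd.Timedelta(step)
    # Running count of gaps gives each row a segment number.
    segment_id = is_gap.cumsum().to_numpy()
    return [segment for _, segment in df.groupby(segment_id)]

# Hypothetical sample with a pause from 09:25:00 to 09:30:00.
idx = pd.to_datetime([
    "2010-10-11 09:24:54",
    "2010-10-11 09:24:57",
    "2010-10-11 09:25:00",
    "2010-10-11 09:30:00",
    "2010-10-11 09:30:03",
])
df = pd.DataFrame({"lastPrice": [10.0, 10.1, 10.2, 10.3, 10.4]}, index=idx)

segments = split_at_gaps(df)
print([len(s) for s in segments])
```

Each segment could then be passed as a separate DataFrame in the dataframes argument of PandasDataset. The trade-off is that the segments are treated as independent series, so no context carries across the pause.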
@NeoWang9999 were you able to resolve the issue? Did you use any custom method? Looking forward to hearing
Yes, I made a few small code changes and rewrote some functions. It looks like there is no better way, as far as I can tell.