Timeseries forecasting results inquiry
Hi all,
I'm having trouble understanding what the output represents in a timeseries forecasting experiment similar to this example. I'm using the following data and config file:
x,x_feature
9.8,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.8,9.8 9.8 10.0 7.2 8.4 9.9 9.5
10.0,9.8 9.8 10.0 7.2 8.4 9.9 9.5
7.2,9.8 9.8 10.0 7.2 8.4 9.9 9.5
8.4,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.9,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.5,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.3,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.2,9.8 10.0 7.2 8.4 9.9 9.5 9.3
9.7,10.0 7.2 8.4 9.9 9.5 9.3 9.2
config.yaml
input_features:
  - name: x_feature
    type: timeseries
output_features:
  - name: x
    type: numerical
Now, the model is returning results like "array([10.050287, 10.050287], dtype=float32)" in a NumPy file.
So, a couple of questions:
- What do these correspond to? Are they predictions of the next number that follows the sequence? If so, which one of the 10 sequences/x_features?
- Why was I given two numbers when there was only one output feature?
Any guidance here is much appreciated.
Hi @camaya7, thanks for trying out time series with Ludwig!
I tried reproducing your issue using the following code, but I did get the expected output:
from io import StringIO
import pandas as pd
import yaml
from ludwig.api import LudwigModel
data = """
x,x_feature
9.8,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.8,9.8 9.8 10.0 7.2 8.4 9.9 9.5
10.0,9.8 9.8 10.0 7.2 8.4 9.9 9.5
7.2,9.8 9.8 10.0 7.2 8.4 9.9 9.5
8.4,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.9,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.5,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.3,9.8 9.8 10.0 7.2 8.4 9.9 9.5
9.2,9.8 10.0 7.2 8.4 9.9 9.5 9.3
9.7,10.0 7.2 8.4 9.9 9.5 9.3 9.2
"""
df = pd.read_csv(StringIO(data))
config = yaml.safe_load(
    """
input_features:
  - name: x_feature
    type: timeseries
output_features:
  - name: x
    type: numerical
"""
)
model = LudwigModel(config=config)
model.train(dataset=df)
preds, _ = model.predict(dataset=df)
print(preds)
gives one prediction per row:
x_predictions
0 8.186899
1 8.186899
2 8.186899
3 8.186899
4 8.186899
5 8.186899
6 8.186899
7 8.186899
8 8.274428
9 8.813780
Hi @jppgks,
Thanks for your reply and for clarifying the results. Including the prediction after training does the trick.
In terms of using the predictions, would it be better to train/predict on one row only, in order to use that specific prediction/forecast for the feature? Excuse my ignorance, but what's the point of having a prediction for each row if the time series features contain roughly the same values?
Also, when one goes about predicting at different horizons, how do you know which values must be included in the y1, y2, y3, etc. output features? Are these the previous predictions that are taken into account to forecast the subsequent values? It seems like the model would somehow need to store the previous predictions in order to use them for the next time horizons.
Cheers and thanks for the help.
@camaya7 thanks for the question. Ludwig's way of dealing with time series is indeed peculiar; we will likely improve it in the future, but let me try to explain how it works.
Let's assume we have a time series with values starting at 0 and increasing by 1 until 100, so 0, 1, 2, ..., 99, 100.
Usually, in a timeseries dataset, those values are all provided in a single column. Ludwig's time series datatype assumes instead that each entry in a column is a list of numerical values. So the examples assume there's a function that windows the data, and implicitly also creates targets.
For instance, if we apply a window of 5 elements to the above timeseries and consider only one output, we will have a dataset that looks like:
x,y
0 1 2 3,4
1 2 3 4,5
...
96 97 98 99,100
Training on this means that at test time you will have to provide 4 values, the last four of your current time series, to obtain the prediction of the following one.
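A windowing function along these lines could be sketched as follows (the name `make_windows` and the column names are mine, not part of Ludwig's API):

```python
import pandas as pd

def make_windows(series, window_size):
    """Turn a flat series into Ludwig-style rows: each row holds
    window_size space-separated input values and the next value as target."""
    rows = []
    for i in range(len(series) - window_size):
        window = series[i:i + window_size]
        rows.append({
            "x": " ".join(str(v) for v in window),  # timeseries input feature
            "y": series[i + window_size],           # numerical output feature
        })
    return pd.DataFrame(rows)

df = make_windows(list(range(101)), window_size=4)
print(df.iloc[0]["x"], "->", df.iloc[0]["y"])    # 0 1 2 3 -> 4
print(df.iloc[-1]["x"], "->", df.iloc[-1]["y"])  # 96 97 98 99 -> 100
```

The resulting DataFrame can be passed directly to `model.train(dataset=df)` with a timeseries input feature named `x` and a numerical output feature named `y`.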
If you want to use different time horizons, you can window your dataset like:
x,y1,y2
0 1 2,3,4
1 2 3,4,5
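The same windowing idea extends to multiple horizons with one target column per future step; a hypothetical sketch (the function name is mine):

```python
import pandas as pd

def make_multi_horizon_windows(series, window_size, n_horizons):
    """Each row: a window of inputs plus one target column per horizon
    (y1 = next value, y2 = the value after that, ...)."""
    rows = []
    for i in range(len(series) - window_size - n_horizons + 1):
        row = {"x": " ".join(str(v) for v in series[i:i + window_size])}
        for h in range(1, n_horizons + 1):
            row[f"y{h}"] = series[i + window_size + h - 1]
        rows.append(row)
    return pd.DataFrame(rows)

df = make_multi_horizon_windows(list(range(10)), window_size=3, n_horizons=2)
print(df.iloc[0].to_dict())  # {'x': '0 1 2', 'y1': 3, 'y2': 4}
```

Each `y1`, `y2`, ... column would then be declared as a separate numerical output feature in the config.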
If you have only one output time horizon and you want to use it for predicting a new time horizon, the process will look like:
datapoint = [0, 1, 2, 3]
pred = model.predict(datapoint)
# drop the oldest value and append the prediction to form the next input
new_datapoint = datapoint[1:] + [pred]
new_pred = model.predict(new_datapoint)
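That rolling loop can be made concrete with a stand-in one-step predictor (with a real Ludwig model, `predict_next` would wrap `model.predict` on a one-row DataFrame; the helper names here are mine):

```python
def roll_forecast(window, predict_next, steps):
    """Iteratively forecast `steps` values, feeding each prediction
    back in as the newest element of the input window."""
    window = list(window)
    preds = []
    for _ in range(steps):
        p = predict_next(window)
        preds.append(p)
        window = window[1:] + [p]  # drop oldest value, append prediction
    return preds

# Toy stand-in predictor that continues an arithmetic series
print(roll_forecast([0, 1, 2, 3], lambda w: w[-1] + 1, steps=3))  # [4, 5, 6]
```

Note that forecast errors compound with each step, since later predictions are conditioned on earlier ones.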
Does this help make things clearer?
Hi @w4nderlust, thanks for the explanation, it does clear things up in general. Specifically though, what would be a sample function that windows the data into the time series column?
I tried training with the following data:
x,y
0 1 2 3,4
1 2 3 4,5
2 3 4 5,6
using:
model = LudwigModel(config=config)
model.train(dataset=df1)
preds, _ = model.predict(dataset=df1)
print(preds)
and got these results:
x_predictions
0 0.005117
1 0.005408
2 0.005689
I'm just not sure what the numbers represent (the latest forecast or something else)? Does each row/set of time series values have its own forecast? Should the last one, i.e. 96 97 98 99, 100, be used to obtain the latest prediction?
Each datapoint represents a pair of history and forecast. I'm not entirely sure why you are getting those predictions; a fully reproducible script/notebook would be great. Finally, for a windowing function, the one in this example could be a good reference: http://ludwig.ai/0.5/examples/weather/
@w4nderlust thanks for that, understood. I was able to fix the configuration and get coherent predictions. The only question that remains is if there's any way to predict on two variables at the same time? So far I've had to train on each attribute separately.
@camaya7 great that you managed to get something reasonable :) Regarding the two variables, you can do it, and you can also have two outputs, so for instance your dataset can look like:
ts1_window,ts1_next,ts2_window,ts2_next
0 1 2 3,4,0.1 0.2 0.3,0.4
Be sure to window them in the same way. You will also need to specify two input features and two output features in the config obviously.
You could do something similar with discrete sequences too, for instance if every entry corresponds to a day of the week you could do something like:
ts1_window,ts1_next,seq_window,seq_next
0 1 2 3,4,Tue Wed Thu Fri,Sat
Finally, if the two features are aligned in the same way, you can also potentially use sequence combiners, which concatenate representations along the time dimension; check the docs for them.
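A config using a sequence combiner might look roughly like this (a sketch only; check the combiner docs for the exact type names and parameters in your Ludwig version):

```yaml
input_features:
  - name: ts1_window
    type: timeseries
  - name: seq_window
    type: sequence
combiner:
  type: sequence_concat  # concatenates feature representations along the time dimension
output_features:
  - name: ts1_next
    type: numerical
```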
@w4nderlust The multivariate model looks promising. After trying it out, it's worth discussing whether the approach you mentioned makes the variables endogenous:
Data:
ts1_window,ts1_next,ts2_window,ts2_next
1 2 3,4,0.1 0.2 0.3,0.4
2 3 4,5,0.2 0.3 0.4,0.5
3 4 5,6,0.3 0.4 0.5,0.6
4 5 6,7,0.4 0.5 0.6,0.7
Model config:
input_features:
  - name: ts1_window
    type: timeseries
  - name: ts2_window
    type: timeseries
output_features:
  - name: ts1_next
    type: numerical
  - name: ts2_next
    type: numerical
Results:
ts1_next_predictions,ts2_next_predictions
3.646097,0.088812
5.000148,0.129627
6.394808,0.170401
7.807065,0.209751
Now, I believe these are the more exogenous results, obtained after the data was modeled separately for each variable, i.e. one input feature and one output feature per variable:
ts1_next_predictions,ts2_next_predictions
3.429008,0.354893
4.661938,0.469404
5.931573,0.587291
7.221491,0.706770
The results vary. In this case, how can one trust that the variables are being processed in an exogenous way, since this is what's required?
I'd say that if you know that for sure, you should probably just build two separate models. On the other hand, if you believe that one can influence the other in positive ways, then modeling both at the same time is a good idea. In particular I obtained good results when there's one timeseries I'm trying to model and the other features are in support (like days of the week, or maybe binary indicators for specific things, like events or holidays).
@w4nderlust I'm going to continue developing both approaches. For now I'm preferring to build separate models as my features cannot influence each other although I see the potential for having just one model. Either way thank you for your responsiveness and help as always!
Glad to hear that you have clear steps forward @camaya7! I will close this issue, but feel free to open a new one or start a discussion if you have any other questions.