AutoTS Question: Multi-Objective Forecasting

Problem: a dataset has multiple data columns that may or may not be temporally coupled, and there are also multiple outputs to forecast. For example, with market sectors and a wellbeing index as columns, check whether one index is tied to the rest of the indices.

Aug 06 '22 10:08 BradKML

I don't entirely understand what you are asking for. Could you explain a bit more? If you are trying to figure out whether multiple features are correlated (pandas correlation) or whether there is causality (Granger causality), you don't need AutoTS for that. You might want to look at statsmodels: you can run something like a VAR or VECM and check the significance of the coefficients.
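
As a rough sketch of those checks (this is not AutoTS functionality): the snippet below uses pandas for a lagged correlation and, assuming statsmodels is installed, wraps its grangercausalitytests function for a Granger test. The helper names, column names, and synthetic data are made up for illustration.

```python
import numpy as np
import pandas as pd


def lagged_correlation(df, driver, target, max_lag=5):
    """Pearson correlation between driver[t-k] and target[t] for k = 0..max_lag."""
    return pd.Series(
        {k: df[target].corr(df[driver].shift(k)) for k in range(max_lag + 1)}
    )


def granger_pvalues(df, driver, target, max_lag=5):
    """p-value per lag of the F-test that `driver` Granger-causes `target`."""
    # Imported lazily so the pandas-only part works without statsmodels.
    from statsmodels.tsa.stattools import grangercausalitytests

    # statsmodels tests whether the SECOND column helps predict the FIRST.
    results = grangercausalitytests(df[[target, driver]].dropna(), maxlag=max_lag)
    return pd.Series({lag: res[0]["ssr_ftest"][1] for lag, res in results.items()})


# Synthetic example: the wellbeing index follows the sector with a 2-step delay.
rng = np.random.default_rng(0)
sector = pd.Series(rng.normal(size=300))
wellbeing = 0.8 * sector.shift(2) + 0.2 * pd.Series(rng.normal(size=300))
df = pd.DataFrame({"sector": sector, "wellbeing": wellbeing}).dropna()

lag_corr = lagged_correlation(df, "sector", "wellbeing")
best_lag = int(lag_corr.abs().idxmax())  # lag with the strongest correlation
```

Calling granger_pvalues(df, "sector", "wellbeing") on the same frame would give one p-value per lag; small values suggest the sector series carries predictive information about the wellbeing series.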

Aug 06 '22 20:08 winedarksea

@winedarksea What if you have multiple items to forecast from a time series, not just a single item? Beyond correlation or causality, are there algorithms that can do this, and does AutoTS support them?

Aug 08 '22 02:08 BradKML

Yes, you can predict multiple items at the same time. This is usually called 'multivariate' forecasting or, in the more general case, 'multioutput' regression. There are two basic ways of inputting the information:

1. Your input dataframe, which contains the full history for one to many input variables. You can adjust weights to indicate which series you care about forecasting (set 0 for series that are input only, with no desired forecast output). All example datasets are multivariate, so you can follow those examples.
2. future_regressors, which are features you will know in advance (for example, in sales forecasting, how many hours the store will be open on future days, how many employees are scheduled to work, etc.) or a forward-lagged version of other features.
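
A minimal sketch of the two inputs, assuming a wide dataframe (one column per series, datetime index) and the AutoTS fit/predict arguments weights and future_regressor. The helper names, column names, and the small generation and validation counts are illustrative choices only.

```python
def make_weights(columns, targets, target_weight=1.0):
    """Weight dict: `target_weight` for series to forecast, 0 for input-only series."""
    return {col: (target_weight if col in targets else 0.0) for col in columns}


def fit_and_forecast(df, targets, regressor_train=None, regressor_future=None,
                     forecast_length=14):
    """Fit AutoTS on a wide dataframe and return forecasts for the target columns.

    regressor_train must share df's index; regressor_future must cover the
    forecast period. Requires `pip install autots`.
    """
    from autots import AutoTS  # imported lazily so make_weights works without it

    model = AutoTS(
        forecast_length=forecast_length,
        frequency="infer",
        model_list="superfast",  # quick list, good for a first test
        max_generations=2,       # kept small for illustration
        num_validations=2,
    )
    model = model.fit(
        df,
        weights=make_weights(df.columns, targets),
        future_regressor=regressor_train,
    )
    prediction = model.predict(future_regressor=regressor_future)
    return prediction.forecast[list(targets)]


# Hypothetical columns: two sector series as inputs, one index to forecast.
columns = ["tech_sector", "energy_sector", "wellbeing_index"]
weights = make_weights(columns, targets=["wellbeing_index"])
```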

You will run into challenges if your multiple items don't align very well. You might need to resample them to the same frequency. Not all models use the multivariate information (see the table at the end of the extended_tutorial.md).
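
For the alignment step, a small pandas sketch of resampling a daily and a weekly series to a shared frequency before passing them on as one dataframe. The series names and values are invented, and whether to aggregate down or fill up depends on the data.

```python
import numpy as np
import pandas as pd

# Daily series: 4 full weeks starting on Monday 2022-01-03.
daily = pd.DataFrame(
    {"traffic": np.arange(28, dtype=float)},
    index=pd.date_range("2022-01-03", periods=28, freq="D"),
)
# Weekly series stamped on Sundays.
weekly = pd.DataFrame(
    {"survey": [1.0, 2.0, 3.0, 4.0]},
    index=pd.date_range("2022-01-09", periods=4, freq="W-SUN"),
)

# Option 1: downsample the daily series to weekly means, then join on the index.
aligned_weekly = daily.resample("W-SUN").mean().join(weekly, how="inner")

# Option 2: upsample the weekly series to daily by carrying the last value forward.
# Days before the first weekly observation stay NaN and need separate handling.
aligned_daily = daily.join(weekly.resample("D").ffill(), how="left")
```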

Probably the simplest thing to do is to work through an example like production_example.py and see how it handles the data.

Aug 08 '22 13:08 winedarksea

"You will run into challenges if your multiple items don't align very well."

Apart from this problem with some data (e.g. matching weekly against daily data, or data with large gaps), it is really useful (e.g. for stock data or live weather/traffic data).

I would assume that the example is this, right? https://github.com/winedarksea/AutoTS/blob/master/production_example.py

Further Q:

1. How do training, testing, and validation work if the dataset is long and it is possible to do either full-memory backtesting or sliding windows? Also, how do the gaps between the testing and validation windows compare?
2. Are there any other built-in APIs in load_live_daily, e.g. for Google Trends or crypto? https://makersportal.com/blog/2020/1/19/google-trends-x-yahoo-finance

Aug 18 '22 07:08 BradKML

That production example is my approach to it, although not the only possible way to do things.

1. Validation works on windows of length equal to the forecast length. How those windows are chosen depends on validation_method, which can be backwards (progressively backwards slices), similarity (fancier, with windows chosen by similarity under a distance metric), even (like slices of a pie), or seasonal (backwards with specified spacing). There's also an option to pass a list of custom indices to use for validation. It is probably worth looking at model.validation_train_indexes and model.validation_test_indexes for your runs to see how the splits were made.
2. pytrends is built in for Google Trends, and yfinance is there too (the same Yahoo Finance data, but a different package from the one you linked). You will need to install those packages, then specify trends_list and tickers respectively as arguments. Also install fredapi and pass a fred_key, as those series are often helpful. Let me know if you find any other free live data sources that you would like added; I would be happy to add them. Running help(load_live_daily) should print some of the args for that.
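
To make both answers more concrete, here is a hedged sketch. backwards_windows is a simplified illustration of "progressively backwards slices", not AutoTS's actual implementation. The other two helpers (hypothetical names) assume autots and its optional data packages are installed; the ticker and trend term are arbitrary examples.

```python
import pandas as pd


def backwards_windows(index, forecast_length, num_validations):
    """Illustrative (train, test) index pairs, stepping backwards from the end.

    Each test window has length `forecast_length`; the training data is
    everything before it.
    """
    splits = []
    for i in range(num_validations):
        end = len(index) - i * forecast_length
        start = end - forecast_length
        if start <= 0:
            break  # not enough history left for another window
        splits.append((index[:start], index[start:end]))
    return splits


def show_autots_splits(df, forecast_length=14):
    """Fit AutoTS and return the validation splits it actually used."""
    from autots import AutoTS  # lazy import; requires autots

    model = AutoTS(
        forecast_length=forecast_length,
        frequency="infer",
        model_list="superfast",
        max_generations=1,
        num_validations=2,
        validation_method="backwards",
    ).fit(df)
    return model.validation_train_indexes, model.validation_test_indexes


def load_live_example(fred_key=None):
    """Pull live daily data; needs yfinance and pytrends (and fredapi if a key is given)."""
    from autots import load_live_daily

    return load_live_daily(
        fred_key=fred_key,
        tickers=["MSFT"],            # arbitrary example ticker
        trends_list=["forecasting"],  # arbitrary example search term
    )


dates = pd.date_range("2022-01-01", periods=100, freq="D")
splits = backwards_windows(dates, forecast_length=14, num_validations=3)
```

Comparing the output of show_autots_splits with the illustrative splits on your own data is a quick way to see how the chosen validation_method differs from plain backwards slicing.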

Aug 18 '22 15:08 winedarksea

I am currently reading through https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html#models-1 and the warning in https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html#model-lists-1. How do I exclude the two models considered "too slow"? Also, is it possible to note which ones generally converge (or grok) quickly?

Sep 26 '22 07:09 BradKML

It really depends, so it is hard to give an exact answer. Some models are slow with lots of historical data; other models are slow with many multivariate series. It also depends on your computational resources. The models in 'parallel' scale linearly with the number of CPU cores available, but will be slow if you have many series and only a few cores. And a few, pytorch-forecasting and gluonts, will be affected by available GPU resources (they run fine on CPU; gluonts is actually faster on a good CPU).

The 'superfast' model list consists of naive models, which with all the preprocessing can deliver pretty good results. I generally recommend testing with that first. Try 'fast' next, and then 'fast_parallel'. And of course a custom list is always an option: model_list = ['AverageValueNaive', 'SeasonalNaive', 'UnivariateMotif', 'ARCH'], etc.
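
One way to exclude specific models is to filter a list before passing it as model_list. The sketch below uses hypothetical helper names and assumes the installed AutoTS version exposes a model_lists mapping in autots.models.model_list; check your version before relying on that import.

```python
def exclude_models(candidates, unwanted):
    """Return the model names in `candidates` that are not in `unwanted`, order kept."""
    unwanted = set(unwanted)
    return [name for name in candidates if name not in unwanted]


def named_list_without(list_name, unwanted):
    """Start from a named AutoTS list (e.g. 'fast') and drop some models."""
    # Assumption: model_lists maps list names to collections of model names.
    from autots.models.model_list import model_lists

    return exclude_models(model_lists[list_name], unwanted)


def build_model(model_list, forecast_length=14):
    """Create an AutoTS instance restricted to `model_list` (a name or a list)."""
    from autots import AutoTS  # lazy import; requires autots

    return AutoTS(
        forecast_length=forecast_length,
        frequency="infer",
        model_list=model_list,
        max_generations=2,  # kept small for illustration
        num_validations=2,
    )


# The custom list from the post, with one model removed as an example.
custom_list = ["AverageValueNaive", "SeasonalNaive", "UnivariateMotif", "ARCH"]
trimmed = exclude_models(custom_list, unwanted=["ARCH"])
```

Passing trimmed (or the result of named_list_without) to build_model restricts the search to those models, in the same way that passing 'superfast', 'fast', or 'fast_parallel' selects a named list.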

Sep 26 '22 14:09 winedarksea
