Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain
Official code repository for the paper "Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain". Check out our paper for more details. Accompanying datasets can be found here.
Usage
Install the required packages.
Torch experiments:
pip install -r requirements/requirements-pytorch.txt
statsforecast experiments:
pip install -r requirements/requirements-stats.txt
Dataset
Easily load and access the dataset from Hugging Face Hub:
from datasets import load_dataset
ds = load_dataset(
    "Salesforce/cloudops_tsf",
    "azure_vm_traces_2017",  # "borg_cluster_data_2011", "alibaba_cluster_trace_2018"
    split=None,  # "train_test", "pretrain"
)
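Once loaded, each record can be split into a context window (model input) and a forecast horizon (ground truth). The sketch below uses a mock record with a GluonTS-style schema (`item_id`, `start`, `target`) and a hypothetical horizon of 12 steps; these field names and values are illustrative assumptions, not guaranteed by the dataset:

```python
import numpy as np

# Mock record in an assumed GluonTS-style schema; real records come from
# load_dataset("Salesforce/cloudops_tsf", ...) as shown above.
record = {
    "item_id": "vm_0",                            # assumed field name
    "start": "2017-01-01 00:00:00",               # assumed field name
    "target": np.arange(48, dtype=np.float32),    # one univariate series
}

prediction_length = 12  # hypothetical forecast horizon
context = record["target"][:-prediction_length]   # model input
future = record["target"][-prediction_length:]    # values to predict

print(context.shape, future.shape)  # (36,) (12,)
```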
Benchmark Experiments
We use Hydra for config management.
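Hydra composes YAML configs and applies `key=value` overrides from the command line, with `++key=value` forcing a (possibly nested) key to be set. As a rough stdlib-only illustration of that override semantics (this is a simplified sketch, not Hydra itself):

```python
def apply_overrides(cfg: dict, overrides: list) -> dict:
    """Apply Hydra-style dotlist overrides to a nested dict (simplified)."""
    for item in overrides:
        key, _, value = item.lstrip("+").partition("=")
        node = cfg
        *parents, leaf = key.split(".")
        for parent in parents:
            node = node.setdefault(parent, {})  # create nested sections on demand
        node[leaf] = value
    return cfg

cfg = {"model_name": "PatchTST", "data": {"dataset_name": "borg_cluster_data_2011"}}
apply_overrides(cfg, ["model_name=DeepTime", "++data.dataset_name=azure_vm_traces_2017"])
print(cfg["model_name"], cfg["data"]["dataset_name"])  # DeepTime azure_vm_traces_2017
```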
Deep Learning Models
Run the hyperparameter tuning script:
python -m benchmark.benchmark_exp model_name=MODEL_NAME dataset_name=DATASET
- where `MODEL_NAME` is one of `TemporalFusionTransformer`, `Autoformer`, `FEDformer`, `NSTransformer`, `PatchTST`, `LinearFamily`, `DeepTime`, `TimeGrad`, or `DeepVAR`, and `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`
After hyperparameter tuning, run the test script:
python -m benchmark.benchmark_exp model_name=MODEL_NAME dataset_name=DATASET test=true
- where `MODEL_NAME` is one of `TemporalFusionTransformer`, `Autoformer`, `FEDformer`, `NSTransformer`, `PatchTST`, `LinearFamily`, `DeepTime`, `TimeGrad`, or `DeepVAR`, and `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`
- training logs and checkpoints will be saved in `outputs/benchmark_exp`
Statistical Models
python -m benchmark.stats_exp DATASET --models MODEL_1 MODEL_2
- where `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`, and `MODEL_1 MODEL_2 ...` is the list of models you want to run, from `naive`, `auto_arima`, `auto_ets`, `auto_theta`, `multivariate_naive`, or `var`
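For intuition, the `naive` baseline simply carries the last observed value forward over the forecast horizon. This is the standard textbook definition; the actual statsforecast implementation may differ in details such as probabilistic outputs:

```python
import numpy as np

def naive_forecast(history: np.ndarray, horizon: int) -> np.ndarray:
    """Repeat the last observed value over the forecast horizon."""
    return np.full(horizon, history[-1], dtype=history.dtype)

y = np.array([1.0, 2.0, 5.0, 3.0])
print(naive_forecast(y, horizon=3))  # [3. 3. 3.]
```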
Pre-training Experiments
Run the pre-training script:
python -m pretraining.pretrain_exp backbone=BACKBONE size=SIZE ++data.dataset_name=DATASET
- where the options for `BACKBONE` and `SIZE` can be found in `conf/backbone` and `conf/size` respectively, and `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`
- see `conf/pretrain.yaml` for more details on the options
- training logs and checkpoints will be saved in `outputs/pretrain_exp`
Run the forecast script:
python -m pretraining.forecast_exp backbone=BACKBONE forecast=FORECAST size=SIZE ++data.dataset_name=DATASET
- where the options for `BACKBONE`, `FORECAST`, and `SIZE` can be found in `conf/backbone`, `conf/forecast`, and `conf/size` respectively, and `DATASET` is one of `azure_vm_traces_2017`, `borg_cluster_data_2011`, or `alibaba_cluster_trace_2018`
- see `conf/forecast.yaml` for more details on the options
- training logs and checkpoints will be saved in `outputs/forecast_exp`
Citation
If you find the paper or the source code useful for your projects, please cite the following BibTeX:
@article{woo2023pushing,
  title={Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain},
  author={Woo, Gerald and Liu, Chenghao and Kumar, Akshat and Sahoo, Doyen},
  journal={arXiv preprint arXiv:2310.05063},
  year={2023}
}