chronos-forecasting

How to generate forecasts with `prediction_length > 64`?

Open · clevilll opened this issue · 6 comments

Hi,

I have time-series data that I split into train and test sets (the test set is kept unseen) by slicing off the end of the dataframe. I ran your pipeline over data_train and tried, unsuccessfully, to forecast a horizon as long as data_test, as shown below:

#-----------------------------------------------------------
# Libs
#-----------------------------------------------------------
# for plotting, run: pip install pandas matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
from chronos import ChronosPipeline

#-----------------------------------------------------------
# LOAD THE DATASET
#-----------------------------------------------------------

df = pd.read_csv('https://raw.githubusercontent.com/amcs1729/Predicting-cloud-CPU-usage-on-Azure-data/master/azure.csv')
df['timestamp'] = pd.to_datetime(df['timestamp'])
data = df.rename(columns={'min cpu': 'min_cpu',
                          'max cpu': 'max_cpu',
                          'avg cpu': 'avg_cpu',})



# Data preparation
# ==============================================================================
sliced_df = data[['timestamp', 'avg_cpu']].copy()  # .copy() avoids SettingWithCopyWarning on the assignment below

# Convert data from Hz to MHz
# ==============================================================================
sliced_df['avg_cpu_Mhz'] = sliced_df['avg_cpu'] / 1000000
sliced_df

# Configuration
# ==============================================================================
name_columns='avg_cpu_Mhz'
lags=288
steps=288
n_backtest=3

step_size = steps * n_backtest
data_train = sliced_df[:-step_size]
data_test  = sliced_df[-step_size:] #unseen

# Pipeline
# ==============================================================================
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)

# context must be either a 1D tensor, a list of 1D tensors,
# or a left-padded 2D tensor with batch as the first dimension
context = torch.tensor(data_train['avg_cpu_Mhz'])
prediction_length = 64 #len(data_test) #12

forecast = pipeline.predict(
    context,
    prediction_length,
    num_samples=288, #20,
    temperature=1.0,
    top_k=50,
    top_p=1.0,
) # forecast shape: [num_series, num_samples, prediction_length]

but the result is as follows:

# visualize the forecast
forecast_index = range(len(data_train), len(data_train) + prediction_length)
low, median, high = np.quantile(forecast[0].numpy(), [0.1, 0.5, 0.9], axis=0)

plt.figure(figsize=(8, 4))
plt.plot(data_train['avg_cpu_Mhz'], color="royalblue", label="historical train data")
plt.plot(data_test['avg_cpu_Mhz'],  color="navy",      label="historical test data", linestyle='dashed')
plt.plot(forecast_index, median,    color="tomato",    label="median forecast")
plt.fill_between(forecast_index, low, high, color="tomato", alpha=0.3, label="80% prediction interval")

plt.title('Chronos forecast result')
plt.ylabel('CPU usage [MHz]', fontsize=15)
plt.xlabel('Timestamp', fontsize=15)
plt.legend()
plt.grid()
plt.show()

[image: forecast plot]

  • How can I configure the arguments of predict() so that the forecast runs autoregressively over the unseen data_test?
  • Why is prediction_length recommended to be <= 64?

clevilll avatar Apr 03 '24 14:04 clevilll

  • You can set limit_prediction_length=False in predict(). See here: https://github.com/amazon-science/chronos-forecasting/blob/96cedec3fa9795c9bd58650080643e2b68bd3a6e/src/chronos/chronos.py#L388
  • prediction_length is recommended to be <= 64 because the models were trained to predict up to 64 steps ahead. Unrolling the model beyond that may lead to suboptimal results (a hand-rolled version of this unrolling is sketched below).
  • Since it looks like you're using a high-frequency time series (5min), there's another important point to note: the model only uses a context of the last 512 steps, which may not be enough to correctly capture the seasonal patterns of a high-frequency series. We discuss this briefly in Sec. 5.6 (Context Length) of the paper.
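
For illustration, the unrolling mentioned in the second point is conceptually similar to the hand-rolled loop below. This is a sketch only, not the library's actual implementation: the real pipeline unrolls on the sampled sequences themselves, while feeding back a point summary, as done here, understates uncertainty across chunk boundaries.

import torch

def unroll_forecast(pipeline, context, total_length, chunk=64, num_samples=20):
    """Forecast `total_length` steps in chunks of `chunk` steps, feeding the
    per-step sample median back into the context after each chunk.
    Illustrative helper, not part of the chronos API."""
    context = context.clone()
    medians = []
    remaining = total_length
    while remaining > 0:
        horizon = min(chunk, remaining)
        # forecast samples: [num_series, num_samples, horizon]
        samples = pipeline.predict(context, horizon, num_samples=num_samples)
        med = samples[0].median(dim=0).values
        medians.append(med)
        # extend the context; the model only looks at the last 512 steps anyway
        context = torch.cat([context, med.to(context.dtype)])
        remaining -= horizon
    return torch.cat(medians)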

abdulfatir avatar Apr 04 '24 12:04 abdulfatir

Alternatively, you can resample your dataset to a lower frequency. Here's an example with 1H:

#-----------------------------------------------------------
# Libs
#-----------------------------------------------------------
# for plotting, run: pip install pandas matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
from chronos import ChronosPipeline

#-----------------------------------------------------------
# LOAD THE DATASET
#-----------------------------------------------------------

df = pd.read_csv('https://raw.githubusercontent.com/amcs1729/Predicting-cloud-CPU-usage-on-Azure-data/master/azure.csv')
df['timestamp'] = pd.to_datetime(df['timestamp'])
data = df.rename(columns={'min cpu': 'min_cpu',
                          'max cpu': 'max_cpu',
                          'avg cpu': 'avg_cpu',})



# Data preparation
# ==============================================================================
sliced_df = data[['timestamp', 'avg_cpu']].copy()  # .copy() avoids SettingWithCopyWarning on the assignment below

# Convert data from Hz to MHz
# ==============================================================================
sliced_df['avg_cpu_Mhz'] = sliced_df['avg_cpu'] / 1000000
sliced_df = sliced_df.set_index("timestamp").resample("1H").sum().reset_index()

# Configuration
# ==============================================================================
name_columns='avg_cpu_Mhz'
lags=24
steps=24
n_backtest=3

step_size = steps * n_backtest
data_train = sliced_df[:-step_size]
data_test  = sliced_df[-step_size:] #unseen

# Pipeline
# ==============================================================================
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)

# context must be either a 1D tensor, a list of 1D tensors,
# or a left-padded 2D tensor with batch as the first dimension
context = torch.tensor(data_train['avg_cpu_Mhz'])
prediction_length = 72 #len(data_test) #12

forecast = pipeline.predict(
    context,
    prediction_length,
    num_samples=20,
    temperature=1.0,
    top_k=50,
    top_p=1.0,
    limit_prediction_length=False
) # forecast shape: [num_series, num_samples, prediction_length]

# visualize the forecast
forecast_index = range(len(data_train), len(data_train) + prediction_length)
low, median, high = np.quantile(forecast[0].numpy(), [0.1, 0.5, 0.9], axis=0)

plt.figure(figsize=(8, 4))
plt.plot(data_train['avg_cpu_Mhz'], color="royalblue", label="historical train data")
plt.plot(data_test['avg_cpu_Mhz'],  color="navy",      label="historical test data", linestyle='dashed')
plt.plot(forecast_index, median,    color="tomato",    label="median forecast")
plt.fill_between(forecast_index, low, high, color="tomato", alpha=0.3, label="80% prediction interval")

plt.title('Chronos forecast result')
plt.ylabel('CPU usage [MHz]', fontsize=15)
plt.xlabel('Timestamp', fontsize=15)
plt.legend()
plt.grid()
plt.show()

Result: [image: forecast plot]

abdulfatir avatar Apr 04 '24 12:04 abdulfatir

@abdulfatir Thanks for your answer.

I have a few questions:

  1. Can we conclude that one of Chronos's shortcomings is the limited prediction length for out-of-sample forecasting over high-frequency time-series data? To avoid suboptimal results, we therefore need to resample. However, resampling with certain aggregation functions can sometimes damage the nature of the time data; in this case the nature was mostly preserved when you did df.set_index("timestamp").resample("1H").sum().reset_index() with the sum() aggregation function.

  2. Based on your experience, what is the best practice if one needs to resample without damaging (or with minimal damage to) the geometry and nature of the time data? By resampling we do lose some information. (A comparison sketch follows below.)
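
For reference, one way to compare aggregation choices visually, reusing sliced_df from the 5-minute snippets above (before any resampling). This is general pandas, not Chronos-specific; whether mean, max, or sum is appropriate depends on what the downstream forecast should represent:

import matplotlib.pyplot as plt

hourly = sliced_df.set_index("timestamp")["avg_cpu_Mhz"].resample("1H")

# candidate aggregations: mean preserves the average level, max keeps
# peaks (useful for capacity planning), sum gives total load per hour
candidates = {"mean": hourly.mean(), "max": hourly.max(), "sum": hourly.sum()}

fig, axes = plt.subplots(len(candidates), 1, figsize=(8, 8), sharex=True)
for ax, (name, series) in zip(axes, candidates.items()):
    series.plot(ax=ax)
    ax.set_ylabel(name)
plt.tight_layout()
plt.show()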

clevilll avatar Apr 04 '24 22:04 clevilll

@clevilll sorry, missed this.

  1. Yes and no. It's not a limitation of the Chronos framework per se, but of the current Chronos models, which were trained to look at a maximum context of 512 steps and to forecast 64 steps into the future.
  2. Indeed, subsampling will result in a loss of information. Whether subsampling is an acceptable option heavily depends on the use case, and in my view there is no one-size-fits-all solution. (One way to check empirically for your data is sketched below.)
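
One way to check empirically is to backtest each candidate frequency on held-out data with a scale-free metric. A minimal sketch reusing the pipeline from the snippets above; the helper name, the naive-repeat baseline, and the variable hourly_df (the 1H-resampled frame built earlier, renamed here for clarity) are assumptions for illustration, not library API:

import numpy as np
import torch

def backtest_relative_mae(pipeline, series, horizon, num_samples=20):
    """Hold out the last `horizon` points, forecast them, and return the
    median-forecast MAE divided by the MAE of a naive forecast that
    repeats the preceding window. Illustrative helper, not library API."""
    train, test = series[:-horizon], series[-horizon:]
    forecast = pipeline.predict(
        torch.tensor(train),
        horizon,
        num_samples=num_samples,
        limit_prediction_length=False,  # allow horizons beyond 64 steps
    )
    median = np.quantile(forecast[0].numpy(), 0.5, axis=0)
    naive = train[-horizon:]  # repeat the last training window as a baseline
    return np.mean(np.abs(test - median)) / np.mean(np.abs(test - naive))

# one day ahead at each frequency: 288 five-minute steps vs. 24 hourly steps
rel_5min = backtest_relative_mae(pipeline, sliced_df["avg_cpu_Mhz"].values, horizon=288)
rel_1h = backtest_relative_mae(pipeline, hourly_df["avg_cpu_Mhz"].values, horizon=24)
print(f"relative MAE, 5min: {rel_5min:.3f} | 1H: {rel_1h:.3f}")

Scores below 1.0 mean the model beats the naive repeat at that frequency, which keeps the comparison meaningful even though the two series live on different scales.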

abdulfatir avatar Apr 23 '24 10:04 abdulfatir

May I ask if it's possible to test the performance with a context length of 2048, as described in Section 5.6 on context length in the original paper? @abdulfatir

GritLs avatar Oct 29 '24 06:10 GritLs

@GritLs Those experiments were ablations conducted with a different model that was trained with a longer context length. Unfortunately, that model is not publicly available, and we don't have immediate plans to release it. That said, you should be able to train a similar model yourself. Please follow the pretraining instructions and let me know if you have questions.

abdulfatir avatar Oct 29 '24 08:10 abdulfatir