Fixing Tensor Shape/Dimension Mismatch Errors in TimeSeries Transformer for Stock Price Prediction
System Info
I am training a time series transformer model to predict stock price changes from the previous day's price and other parameters. I have hit a series of errors related to tensor shapes, dimensions, and sizes. The current error is:
File "\transformers\models\time_series_transformer\modeling_time_series_transformer.py", line 1378, in forward
    transformer_inputs, loc, scale, static_feat = self.create_network_inputs(
File "\transformers\models\time_series_transformer\modeling_time_series_transformer.py", line 1303, in create_network_inputs
    raise ValueError: input length 11 and time feature lengths 13 does not match
My sample input dataset has 10 rows and 70 features, so it is not clear where the numbers 11 and 13 come from.
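A note on where lengths like these can come from, as far as I understand the model: `create_network_inputs` compares the time length of the lagged target sequence with the time length of the concatenated past and future time features. Both lengths are derived from the config, not from the number of rows or columns in the CSV. The past tensors are expected to span `context_length + max(lags_sequence)` steps and the future tensors `prediction_length` steps. Also, as far as I can tell, the config argument is named `lags_sequence` (default `[1, 2, 3, 4, 5, 6, 7]`), so a `lags_seq=` keyword would not override it. The helper below is a hypothetical sketch (`expected_shapes` is not part of transformers) that only does this arithmetic:

```python
def expected_shapes(context_length, lags_sequence, prediction_length,
                    input_size, num_time_features, batch_size=1):
    """Return the per-key tensor shapes the model is assumed to expect.

    Hypothetical helper, not part of transformers. It encodes the rule
    past_length = context_length + max(lags_sequence).
    For input_size == 1 the model also accepts values without the last axis.
    """
    past_length = context_length + max(lags_sequence)
    return {
        "past_values": (batch_size, past_length, input_size),
        "past_time_features": (batch_size, past_length, num_time_features),
        "past_observed_mask": (batch_size, past_length, input_size),
        "future_values": (batch_size, prediction_length, input_size),
        "future_time_features": (batch_size, prediction_length, num_time_features),
    }


shapes = expected_shapes(
    context_length=10,
    lags_sequence=[1, 2, 3],
    prediction_length=1,
    input_size=2,          # e.g. Price_change and bull_bear_signal as targets
    num_time_features=2,   # e.g. day of week and month
)
for name, shape in shapes.items():
    print(name, shape)
```

With these assumed settings the past window is 13 steps long, not 10, which is one way a look-back of 10 can end up mismatched.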
Before this error, I hit several similar ones. I am pasting them here for reference.
modeling_time_series_transformer.py", line 1272, in create_network_inputs
    (torch.cat((past_values, future_values), dim=1) - loc) / scale
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list.
RuntimeError: Tensors must have same number of dimensions: got 3 and 2
import pandas as pd
import numpy as np
from transformers import TimeSeriesTransformerModel, TimeSeriesTransformerConfig, Trainer, TrainingArguments, default_data_collator
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.metrics import mean_squared_error
import torch
from torch.utils.data import Dataset
# Load the CSV file
file_path = './spy-stock-price - Spy_Ind_Signal.csv'
data = pd.read_csv(file_path)
# Exclude specified columns
exclude_columns = ['20SMA', '50SMA', '200SMA', '20EMA', '10EMA', 'MACD', 'MACD_Signal', 'Average_Volume', 'Bollinger_High', 'Bollinger_Low', 'Bollinger_Middle', 'VWAP', 'AVWAP']
data = data.drop(columns=exclude_columns)
# Preprocess the data
data['Date'] = pd.to_datetime(data['Date'])
data = data.sort_values('Date')
# Encode categorical signals
data_transformed = data
# Drop the 'Date' column
data_transformed = data_transformed.drop(columns=['Date'])
# Normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data_transformed)
# Convert the data to a supervised learning problem
def create_dataset(data, look_back=1):
    X, Y = [], []
    for i in range(len(data) - look_back - 1):
        a = data[i:(i + look_back)]
        X.append(a)
        Y.append(data[i + look_back, -2:])  # Include the last two columns as targets
        if Y[-1] is None:
            print(f"NoneType found at index {i + look_back}")
    return np.array(X), np.array(Y)
look_back = 10 # Adjusted look_back to 10
X, y = create_dataset(scaled_data, look_back)
# Split into train and test sets
train_size = int(len(X) * 0.67)
X_train, X_test = X[0:train_size], X[train_size:]
y_train, y_test = y[0:train_size], y[train_size:]
# Reshape input to be [samples, time steps, features]
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], data_transformed.shape[1]))
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], data_transformed.shape[1]))
# Create observed mask for the transformer model
def create_observed_mask(data):
    mask = np.ones_like(data, dtype=np.float32)
    return mask
train_observed_mask = create_observed_mask(X_train)
test_observed_mask = create_observed_mask(X_test)
# Convert data to PyTorch tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.float32)
train_observed_mask = torch.tensor(train_observed_mask, dtype=torch.float32)
test_observed_mask = torch.tensor(test_observed_mask, dtype=torch.float32)
# Create a custom dataset class
class TimeSeriesDataset(Dataset):
    def __init__(self, X, y, observed_mask):
        self.X = X
        self.y = y
        self.observed_mask = observed_mask

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        time_dim = self.X.shape[1]
        feature_dim = self.X.shape[2]
        future_values = self.y[idx].unsqueeze(0).repeat(time_dim, feature_dim // 2)
        static_categorical_features = torch.tensor([]).unsqueeze(0).repeat(time_dim, feature_dim)
        static_real_features = torch.zeros((time_dim, feature_dim))
        static_feat = torch.zeros((time_dim, feature_dim, 1))
        sample = {
            'past_values': self.X[idx],
            'past_time_features': torch.zeros((time_dim, feature_dim)),
            'past_observed_mask': self.observed_mask[idx],
            'future_values': future_values,
            'future_time_features': future_values,
        }
        return sample
train_dataset = TimeSeriesDataset(X_train, y_train, train_observed_mask)
test_dataset = TimeSeriesDataset(X_test, y_test, test_observed_mask)
# Model configuration
config = TimeSeriesTransformerConfig(
    prediction_length=1,
    context_length=look_back,
    lags_seq=[1, 2, 3],
    input_size=data_transformed.shape[1],
    output_size=2,  # Predicting both price_change and bull_bear_signal
    num_time_features=data_transformed.shape[1],  # Match the input size
    num_static_categorical_features=0,
    num_static_real_features=0,
    cardinality=[],
    embedding_dimension=[]
)
model = TimeSeriesTransformerModel(config)
# Training configuration
training_args = TrainingArguments(
    output_dir="./results",
    eval_strategy="epoch",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=10,
    weight_decay=0.01,
    logging_dir="./logs",
)
# Training
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    data_collator=default_data_collator,
)
trainer.train()
# Evaluation
predictions, labels, _ = trainer.predict(test_dataset)
# Inverse transform the predictions and labels
predictions = scaler.inverse_transform(predictions)
labels = scaler.inverse_transform(y_test.numpy())
# Separate the predictions and labels for price_change and bull_bear_signal
predictions_price_change = predictions[:, 0]
predictions_bull_bear_signal = predictions[:, 1]
labels_price_change = labels[:, 0]
labels_bull_bear_signal = labels[:, 1]
# Calculate the Mean Squared Error for price_change
mse_price_change = mean_squared_error(labels_price_change, predictions_price_change)
print(f"Mean Squared Error for Price Change: {mse_price_change}")
# For bull_bear_signal, we can use accuracy as the metric
accuracy_bull_bear_signal = np.mean(predictions_bull_bear_signal.round() == labels_bull_bear_signal.round())
print(f"Accuracy for Bull Bear Signal: {accuracy_bull_bear_signal}")
Steps taken so far:
- Upgraded to the latest version of transformers (4.42).
- Searched the Hugging Face and Stack Overflow forums for a resolution. Each suggestion fixes the current error, but a slight variant of the same error then appears.
- Fixed several errors prior to the current one.
- Made sure all the tensors (past_values, past_time_features, future_values, etc.) have the same shape.
How can I resolve the issue?
Who can help?
@ArthurZucker, @muellerzr @gante
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)
Reproduction
- Import the required libraries
- Create a CSV file './spy-stock-price - Spy_Ind_Signal.csv'.
- Fill the CSV with the following columns: Date, Price, Volume, Price_change, bull_bear_signal.
- Execute the Python script.
Expected behavior
The model should complete training and evaluation and report the prediction accuracy and the mean squared error.
cc @kashif
thanks! having a look
Appreciate it. Please let me know if any additional information would help with the debugging.
If I understand correctly, the issue is in this code snippet:
sample = {
    'past_values': self.X[idx],
    'past_time_features': torch.zeros((time_dim, feature_dim)),
    'past_observed_mask': self.observed_mask[idx],
    'future_values': future_values,
    'future_time_features': future_values,
}
Appreciate your timely support!
@sivavishnubramma indeed the issue is in the sizes of the inputs... so what are the shapes of "past_values" and "future_values" tensors?
Also note that the future_time_features are the known covariates in the prediction window (e.g. the known date-time covariates or any static covariates).
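To make the point about known covariates concrete, here is a minimal sketch of calendar features. They are available for both the context window and the prediction window because they depend only on the date. The function name `calendar_features`, the example dates, and the scaling to [-0.5, 0.5] are illustrative assumptions, not something the model requires.

```python
from datetime import date, timedelta


def calendar_features(days):
    """Map each date to [day_of_week, month_of_year], scaled to [-0.5, 0.5]."""
    feats = []
    for d in days:
        day_of_week = d.weekday() / 6.0 - 0.5        # Monday=-0.5 ... Sunday=0.5
        month_of_year = (d.month - 1) / 11.0 - 0.5   # January=-0.5 ... December=0.5
        feats.append([day_of_week, month_of_year])
    return feats


# Hypothetical window: 13 past days followed by 1 future day.
# Consecutive calendar days are used for simplicity; real trading data
# would skip weekends and holidays.
start = date(2024, 1, 1)
all_days = [start + timedelta(days=i) for i in range(14)]
past_time_features = calendar_features(all_days[:13])     # history
future_time_features = calendar_features(all_days[13:])   # known in advance

print(len(past_time_features), len(future_time_features))
```

The future slice can be built before the target is observed, which is what makes it a valid `future_time_features` input. Copying `future_values` into that slot does not have this property.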
I added a few time-variant features to the code. Here are the shapes of the parameters:
Sample 523 - past_values shape: torch.Size([10, 74]), past_time_features shape: torch.Size([10, 74]), past_observed_mask shape: torch.Size([10, 74]), future_values shape: torch.Size([10, 74])
Sample 186 - past_values shape: torch.Size([10, 74]), past_time_features shape: torch.Size([10, 74]), past_observed_mask shape: torch.Size([10, 74]), future_values shape: torch.Size([10, 74])
Sample 441 - past_values shape: torch.Size([10, 74]), past_time_features shape: torch.Size([10, 74]), past_observed_mask shape: torch.Size([10, 74]), future_values shape: torch.Size([10, 74])
FYI: prior to adding the time-variant features, the shape was [10, 70].
The error messages are somewhat challenging for someone new to this area to decode. I have a more specific question about dataset construction for training the Time Series Transformer model:
- I have a dataset from a CSV file with columns: {Date, Open, Close, Volume, Price_change, Bull_Bear_Signal}.
- I've added the following time-related features: {'day_of_week', 'month_of_year'}, increasing the total feature count to 8, including the Date.
- I then removed the Date column, leaving 7 features in the dataset.
- The look-back period is set to 10, and the prediction length is 1.
- The model outputs 2 predictions: {Price_change, Bull_Bear_Signal}.
Could you please guide me on how to structure the following dataset correctly, so that I avoid the previous errors and the model trains successfully? Alternatively, if there is a builder/adapter class that takes requirements like the list above and produces the dataset needed to train the model, please let me know.
sample = {
    'past_values': <>,
    'past_time_features': <>,
    'past_observed_mask': <>,
    'future_values': <>,
    'future_time_features': <>
}
Let me know if you need any additional details.
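One possible layout for the list above, sketched with NumPy under stated assumptions. The two predicted columns (`Price_change`, `Bull_Bear_Signal`) are treated as the target values, so `input_size` would be 2. The two calendar columns are the time features. The past window covers `context_length + max(lags_sequence)` rows. `Open`, `Close` and `Volume` are left out because their future values are not known at prediction time. The helper `make_sample` and the random stand-in data are hypothetical.

```python
import numpy as np


def make_sample(values, time_feats, start, context_length, lags_sequence,
                prediction_length):
    """Slice one training sample out of aligned (T, ...) arrays."""
    past_length = context_length + max(lags_sequence)
    split = start + past_length          # first row of the prediction window
    end = split + prediction_length
    if end > len(values):
        raise ValueError("window runs past the end of the series")
    past_values = values[start:split]
    return {
        "past_values": past_values,
        "past_time_features": time_feats[start:split],
        # 1.0 marks an observed value; use 0.0 where the data is missing.
        "past_observed_mask": np.ones_like(past_values),
        "future_values": values[split:end],
        "future_time_features": time_feats[split:end],
    }


rng = np.random.default_rng(0)
T = 50
values = rng.normal(size=(T, 2)).astype(np.float32)                  # stand-in targets
time_feats = rng.uniform(-0.5, 0.5, size=(T, 2)).astype(np.float32)  # stand-in calendar features

sample = make_sample(values, time_feats, start=0, context_length=10,
                     lags_sequence=[1, 2, 3], prediction_length=1)
for key, array in sample.items():
    print(key, array.shape)
```

In a `Dataset.__getitem__`, each array could be converted with `torch.from_numpy` before returning. The collator then adds the batch dimension.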
@sivavishnubramma we do have a blog post explaining the inputs and what the batch samples should be here: https://huggingface.co/blog/time-series-transformers
did you manage to have a read of the blog post?
Finally, I was able to execute the script and run some evals. I got a MASE of around 2.3 and an sMAPE of around 0.16. I have to dig deeper to understand and improve these numbers. I just want to share my appreciation for the Hugging Face team.
Later I will share some documentation that will help future users.
@kashif - if you have any blog post that covers time series model optimization, please share it.
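For readers interpreting the numbers above: MASE divides the forecast's mean absolute error by the error of a naive forecast on the training data, so a value above 1 means the forecast did worse than the naive baseline by that measure. The sMAPE below is a fraction bounded by 2. These are common conventions written out in plain Python as a sketch. They may differ in detail from the definitions used by whatever evaluation library produced the figures, and the toy numbers are made up.

```python
def mase(y_true, y_pred, y_train, seasonality=1):
    """Mean absolute scaled error: forecast MAE / in-sample naive MAE."""
    forecast_mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    naive_errors = [abs(y_train[i] - y_train[i - seasonality])
                    for i in range(seasonality, len(y_train))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    return forecast_mae / naive_mae


def smape(y_true, y_pred):
    """Symmetric MAPE as a fraction in [0, 2] (one of several conventions).

    Undefined when an actual and its forecast are both zero.
    """
    terms = [2.0 * abs(p - t) / (abs(t) + abs(p))
             for t, p in zip(y_true, y_pred)]
    return sum(terms) / len(terms)


history = [1.0, 2.0, 4.0, 7.0]   # toy training series
actual = [8.0, 10.0]
forecast = [9.0, 9.0]

print(mase(actual, forecast, history))
print(smape(actual, forecast))
```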