Fixing Tensor Shape/Dimension Mismatch Errors in TimeSeries Transformer for Stock Price Prediction

Open sivavishnubramma opened this issue 8 months ago • 10 comments

System Info

I am training a TimeSeriesTransformer model to predict stock price changes from the previous day's price and other parameters. I have run into a series of errors related to tensor shapes, dimensions, and sizes. The current error is:

File "\transformers\models\time_series_transformer\modeling_time_series_transformer.py", line 1378, in forward
    transformer_inputs, loc, scale, static_feat = self.create_network_inputs(
File "\transformers\models\time_series_transformer\modeling_time_series_transformer.py", line 1303, in create_network_inputs
    raise ValueError: input length 11 and time feature lengths 13 does not match

My sample input dataset has 10 rows and 70 features, so it is not clear where the numbers 11 and 13 come from.
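One possible reading of the two numbers, sketched as plain arithmetic. It rests on assumptions that are not confirmed in this issue: that `lags_seq` in the script below is not a recognized config field (the documented name is `lags_sequence`), so the library default of `[1, 2, 3, 4, 5, 6, 7]` applies, and that the model drops the first `max(lags_sequence)` past steps from the time features before appending the future ones. The variable names are illustrative only.

```python
# Values taken from the script's config.
context_length = 10
prediction_length = 1

# Assumption: `lags_seq` is ignored, so the default lags_sequence applies.
lags_sequence = [1, 2, 3, 4, 5, 6, 7]

# Time steps the dataset in the script supplies per sample.
supplied_past_steps = 10    # look_back rows in past_time_features
supplied_future_steps = 10  # future_values is repeated time_dim times

# Past window the model expects: context plus the largest lag.
past_length = context_length + max(lags_sequence)

# The lagged inputs cover the context window plus the prediction horizon.
input_length = context_length + prediction_length

# Time features: skip the lag-only prefix of the past, then append the future.
skipped = past_length - context_length
time_feature_length = max(supplied_past_steps - skipped, 0) + supplied_future_steps

print(input_length, time_feature_length)  # 11 13
```

Under these assumptions, 11 is `context_length + prediction_length`, and 13 is the 3 past steps left after skipping 7, plus the 10 future steps supplied by the dataset.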

Before this error, I hit several similar ones. I am pasting them here for reference.

modeling_time_series_transformer.py", line 1272, in create_network_inputs
    (torch.cat((past_values, future_values), dim=1) - loc) / scale
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list.

RuntimeError: Tensors must have same number of dimensions: got 3 and 2
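Both messages come from the concatenation rule: every tensor must have the same number of dimensions, and all sizes must agree except along the concatenation axis. The sketch below mirrors that rule on plain shape tuples. `cat_shape` is a hypothetical helper, not part of PyTorch, and the shapes are illustrative rather than taken from the actual run.

```python
def cat_shape(shapes, dim):
    """Return the shape a concatenation along `dim` would have, or raise."""
    first = shapes[0]
    total = 0
    for i, shape in enumerate(shapes):
        # Rule 1: all inputs need the same number of dimensions.
        if len(shape) != len(first):
            raise ValueError(
                f"tensor {i} has {len(shape)} dims, expected {len(first)}"
            )
        # Rule 2: sizes must match on every axis except `dim`.
        for axis, (a, b) in enumerate(zip(first, shape)):
            if axis != dim and a != b:
                raise ValueError(
                    f"axis {axis}: expected size {a} but got {b} for tensor {i}"
                )
        total += shape[dim]
    return first[:dim] + (total,) + first[dim + 1:]


# Compatible: (batch, time, features) joined along the time axis.
print(cat_shape([(16, 10, 70), (16, 1, 70)], dim=1))  # (16, 11, 70)

# Incompatible examples, one for each kind of message.
for bad in ([(16, 10, 2), (16, 1, 1)], [(16, 10, 70), (16, 70)]):
    try:
        cat_shape(bad, dim=1)
    except ValueError as err:
        print("rejected:", err)
```

Printing the shapes of `past_values` and `future_values` right before the call and checking them against this rule is usually enough to find the axis that disagrees.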

import pandas as pd
import numpy as np
from transformers import TimeSeriesTransformerModel, TimeSeriesTransformerConfig, Trainer, TrainingArguments, default_data_collator
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.metrics import mean_squared_error
import torch
from torch.utils.data import Dataset

# Load the CSV file
file_path = './spy-stock-price - Spy_Ind_Signal.csv'
data = pd.read_csv(file_path)

# Exclude specified columns
exclude_columns = ['20SMA', '50SMA', '200SMA', '20EMA', '10EMA', 'MACD', 'MACD_Signal', 'Average_Volume', 'Bollinger_High', 'Bollinger_Low', 'Bollinger_Middle', 'VWAP', 'AVWAP']
data = data.drop(columns=exclude_columns)

# Preprocess the data
data['Date'] = pd.to_datetime(data['Date'])
data = data.sort_values('Date')

# Encode categorical signals (no encoding is applied yet; the frame is used as-is)
data_transformed = data

# Drop the 'Date' column
data_transformed = data_transformed.drop(columns=['Date'])

# Normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data_transformed)

# Convert the data to a supervised learning problem
def create_dataset(data, look_back=1):
    X, Y = [], []
    for i in range(len(data) - look_back - 1):
        a = data[i:(i + look_back)]
        X.append(a)
        Y.append(data[i + look_back, -2:])  # Include the last two columns as targets
        if Y[-1] is None:
            print(f"NoneType found at index {i + look_back}")
    return np.array(X), np.array(Y)

look_back = 10  # Adjusted look_back to 10
X, y = create_dataset(scaled_data, look_back)

# Split into train and test sets
train_size = int(len(X) * 0.67)
X_train, X_test = X[0:train_size], X[train_size:]
y_train, y_test = y[0:train_size], y[train_size:]


# Reshape input to be [samples, time steps, features]
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], data_transformed.shape[1]))
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], data_transformed.shape[1]))

# Create observed mask for the transformer model
def create_observed_mask(data):
    mask = np.ones_like(data, dtype=np.float32)
    return mask

train_observed_mask = create_observed_mask(X_train)
test_observed_mask = create_observed_mask(X_test)

# Convert data to PyTorch tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.float32)
train_observed_mask = torch.tensor(train_observed_mask, dtype=torch.float32)
test_observed_mask = torch.tensor(test_observed_mask, dtype=torch.float32)

# Create a custom dataset class
class TimeSeriesDataset(Dataset):
    def __init__(self, X, y, observed_mask):
        self.X = X
        self.y = y
        self.observed_mask = observed_mask

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        time_dim = self.X.shape[1]
        feature_dim = self.X.shape[2]

        future_values = self.y[idx].unsqueeze(0).repeat(time_dim, feature_dim // 2)
        static_categorical_features = torch.tensor([]).unsqueeze(0).repeat(time_dim, feature_dim)
        static_real_features = torch.zeros((time_dim, feature_dim))
        static_feat = torch.zeros((time_dim, feature_dim, 1))

        sample = {
            'past_values': self.X[idx],
            'past_time_features': torch.zeros((time_dim, feature_dim)), 
            'past_observed_mask': self.observed_mask[idx],
            'future_values': future_values,
            'future_time_features': future_values,
        }

        return sample

train_dataset = TimeSeriesDataset(X_train, y_train, train_observed_mask)
test_dataset = TimeSeriesDataset(X_test, y_test, test_observed_mask)

# Model configuration
config = TimeSeriesTransformerConfig(
    prediction_length=1,
    context_length=look_back,
    lags_seq=[1, 2, 3],
    input_size=data_transformed.shape[1],
    output_size=2,  # Predicting both price_change and bull_bear_signal
    num_time_features=data_transformed.shape[1],  # Match the input size
    num_static_categorical_features=0,
    num_static_real_features=0,
    cardinality=[],
    embedding_dimension=[]
)

model = TimeSeriesTransformerModel(config)

# Training configuration
training_args = TrainingArguments(
    output_dir="./results",
    eval_strategy="epoch",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=10,
    weight_decay=0.01,
    logging_dir="./logs",
)

# Training
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    data_collator=default_data_collator,
)

trainer.train()

# Evaluation
predictions, labels, _ = trainer.predict(test_dataset)

# Inverse transform the predictions and labels
predictions = scaler.inverse_transform(predictions)
labels = scaler.inverse_transform(y_test.numpy())

# Separate the predictions and labels for price_change and bull_bear_signal
predictions_price_change = predictions[:, 0]
predictions_bull_bear_signal = predictions[:, 1]
labels_price_change = labels[:, 0]
labels_bull_bear_signal = labels[:, 1]

# Calculate the Mean Squared Error for price_change
mse_price_change = mean_squared_error(labels_price_change, predictions_price_change)
print(f"Mean Squared Error for Price Change: {mse_price_change}")

# For bull_bear_signal, we can use accuracy as the metric
accuracy_bull_bear_signal = np.mean(predictions_bull_bear_signal.round() == labels_bull_bear_signal.round())
print(f"Accuracy for Bull Bear Signal: {accuracy_bull_bear_signal}")

Steps taken so far:

  • Upgraded transformers to the latest version (4.42).
  • Searched the Hugging Face and Stack Overflow forums for a resolution. Each suggestion fixes the error at hand, but a slight variant of the same error then appears.
  • Fixed several errors prior to the current one.
  • Made sure all the tensors (past_values, past_time_features, future_values, etc.) have the same shape.

How can I resolve this issue?
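For comparison, the model documentation describes different sequence lengths for the past and future tensors rather than one common shape. The sketch below tabulates them. `expected_shapes` is a hypothetical helper written for this post, and it assumes no dynamic real features, so the time-feature width equals `num_time_features`.

```python
def expected_shapes(batch_size, context_length, prediction_length,
                    lags_sequence, input_size, num_time_features):
    """Tabulate the documented input shapes for a multivariate series."""
    # The past window must also cover the largest lag.
    past_length = context_length + max(lags_sequence)
    return {
        "past_values": (batch_size, past_length, input_size),
        "past_observed_mask": (batch_size, past_length, input_size),
        "past_time_features": (batch_size, past_length, num_time_features),
        "future_values": (batch_size, prediction_length, input_size),
        "future_time_features": (batch_size, prediction_length, num_time_features),
    }


# Values from the script, with lags [1, 2, 3] as originally intended.
shapes = expected_shapes(
    batch_size=16,
    context_length=10,
    prediction_length=1,
    lags_sequence=[1, 2, 3],
    input_size=70,
    num_time_features=70,
)
for name, shape in shapes.items():
    print(f"{name}: {shape}")
```

With these values the past tensors would need 13 time steps and the future tensors 1, whereas the dataset class above returns 10 time steps for all of them.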

Who can help?

@ArthurZucker, @muellerzr @gante

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [X] My own task or dataset (give details below)

Reproduction

  1. Import the required libraries
  2. Create a CSV file './spy-stock-price - Spy_Ind_Signal.csv'.
  3. Fill the CSV with the following columns: Date, Price, Volume, Price_change, bull_bear_signal.
  4. Execute the Python script.

Expected behavior

The model should complete training and evaluation and report the accuracy of the predictions and the mean squared error.

sivavishnubramma, Jun 23 '24 06:06