transformers
Expanding static features when embedding - bug
System Info
Python 3.9, PyCharm
Who can help?
@sgugger @ArthurZucker and @younesbelkada
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)
Reproduction
Here is the training script we used:
```python
import torch
import pandas as pd
from torch.utils.data import Dataset, DataLoader
from transformers import TimeSeriesTransformerConfig, TimeSeriesTransformerModel, TimeSeriesTransformerForPrediction
from Preprocess import create_dataset


class TimeSeriesDataset(Dataset):
    def __init__(self, subjects_dict):
        self.subjects_dict = subjects_dict
        self.subjects = list(subjects_dict.keys())

    def __len__(self):
        return len(self.subjects)

    def __getitem__(self, idx):
        subject = self.subjects[idx]
        subject_dict = self.subjects_dict[subject]
        return subject_dict


# Instantiating the dataset
directory = r'D:\Final Project\TASK_PCC_PFC\TEMP'
subjects_dict = create_dataset(directory)
dataset = TimeSeriesDataset(subjects_dict)

# Creating the dataloader
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# Initializing a default Time Series Transformer configuration
embedding_dimension = [349]
cardinality = [15]
configuration = TimeSeriesTransformerConfig(
    prediction_length=327,
    lags_sequence=[0, 0, 0],
    embedding_dimension=embedding_dimension,
    num_static_categorical_features=1,
    encoder_attention_heads=2,
    decoder_attention_heads=2,
    cardinality=cardinality,
)

# Randomly initializing a model (with random weights) from the configuration
model = TimeSeriesTransformerModel(configuration)

# Accessing the model configuration
configuration = model.config

# We don't know if passing the data as a DataFrame instead of a tensor would work.
# Currently model.train() is throwing an error; maybe we need to use a GPU? TODO

# Setting the model to training mode
model.train()

# Defining the loss function and optimizer
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Training loop
for epoch in range(100):
    for batch in dataloader:
        # Forward pass
        outputs = model(
            past_values=batch["past_values"],
            past_time_features=batch["past_time_features"],
            past_observed_mask=None,
            static_categorical_features=batch["static_categorical_features"],
            static_real_features=batch["static_real_features"],
            future_values=batch["future_values"],
            future_time_features=batch["future_time_features"],
        )
        loss = loss_fn(outputs, batch)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Printing the training loss
    if (epoch + 1) % 10 == 0:
        print(f"Epoch [{epoch + 1}/100], Loss: {loss.item()}")
```
Dataset: HPC voxel dataset
Expected behavior
Hi,
We are trying to train TimeSeriesTransformer for forecasting using fMRI voxel data. The shape of the data is (batch_size, rows of datapoints, columns of features).
We encountered an issue in the embedding phase. This is from the source code:
```python
# embeddings
embedded_cat = self.embedder(static_categorical_features)

# static features
log_scale = scale.log() if self.config.input_size == 1 else scale.squeeze(1).log()
static_feat = torch.cat((embedded_cat, static_real_features, log_scale), dim=1)
expanded_static_feat = static_feat.unsqueeze(1).expand(-1, time_feat.shape[1], -1)
```
This is the error:
```
Traceback (most recent call last):
  File "D:\Final Project\fMRI_Ariel_Lital\train.py", line 61, in <module>
    outputs = model(
  File "C:\Users\Cognition\anaconda3\envs\ArielLital\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Cognition\anaconda3\envs\ArielLital\lib\site-packages\transformers\models\time_series_transformer\modeling_time_series_transformer.py", line 1626, in forward
    transformer_inputs, scale, static_feat = self.create_network_inputs(
  File "C:\Users\Cognition\anaconda3\envs\ArielLital\lib\site-packages\transformers\models\time_series_transformer\modeling_time_series_transformer.py", line 1536, in create_network_inputs
    expanded_static_feat = static_feat.unsqueeze(1).expand(-1, time_feat.shape[1], -1)
RuntimeError: expand(torch.DoubleTensor{[32, 1, 329, 349]}, size=[-1, 654, -1]): the number of sizes provided (3) must be greater or equal to the number of dimensions in the tensor (4)

Process finished with exit code 1
```
To our understanding, there is a contradiction in this code:
- `embedded_cat` has 3 dimensions: (batch_size, rows, columns)
- `log_scale` has 3 dimensions: (batch_size, 1, columns)
- for `torch.cat` to work, `static_real_features` must then have shape (batch_size, n, columns)

This means that after concatenating these three variables, `static_feat` has 3 dimensions; after `unsqueeze(1)` it has 4, and `expand`, which is given only 3 sizes, fails, as the sketch below illustrates.
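For illustration, here is a minimal `torch`-only sketch of the shape mismatch (the sizes are taken from the traceback above; the variable names are ours, not the library's):

```python
import torch

batch_size, expanded_len = 32, 654

# What create_network_inputs expects: 2-D static features [batch, features],
# which unsqueeze(1) + expand broadcasts along the time axis.
static_feat_2d = torch.randn(batch_size, 349)
print(static_feat_2d.unsqueeze(1).expand(-1, expanded_len, -1).shape)
# torch.Size([32, 654, 349])

# With an extra (time-like) dimension, as in our case ([32, 329, 349]),
# unsqueeze(1) produces a 4-D tensor and expand() receives too few sizes.
static_feat_3d = torch.randn(batch_size, 329, 349)
try:
    static_feat_3d.unsqueeze(1).expand(-1, expanded_len, -1)
except RuntimeError as err:
    print(err)  # "the number of sizes provided (3) must be greater or equal to ..."
```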
How can we solve this?
Many thanks!
cc @kashif
Thanks @LtlSh for the report.
`embedding_dimension` is the size of the resulting vector for the given categorical covariate, and `cardinality` is the number of unique categories. So if you only have 15 different categories, it probably does not make sense to map them to a 349-dimensional vector. Also, you can set the lags to `[1]`.
Finally, note that the categorical feature is static, meaning it has no temporal component, and thus a single feature has shape `[B, 1]`.
Let me know if that makes sense?
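Concretely, a minimal sketch of the shapes described above (the concrete sizes, e.g. `embedding_dimension=[2]`, are illustrative assumptions, not values from your dataset):

```python
import torch
from transformers import TimeSeriesTransformerConfig, TimeSeriesTransformerModel

# One static categorical feature with 15 possible categories, mapped to a
# small embedding (the exact sizes here are assumptions).
config = TimeSeriesTransformerConfig(
    prediction_length=327,
    lags_sequence=[1],                   # as suggested above
    num_static_categorical_features=1,
    cardinality=[15],                    # 15 unique categories for that one feature
    embedding_dimension=[2],             # small embedding, since there are only 15 categories
)
model = TimeSeriesTransformerModel(config)

# The static categorical feature has no time dimension: one integer id per
# series, i.e. shape [batch_size, num_static_categorical_features] = [32, 1],
# with values in the range [0, cardinality).
static_categorical_features = torch.randint(0, 15, (32, 1))
print(static_categorical_features.shape)  # torch.Size([32, 1])
```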
Thank you for your answer! @kashif We tried your suggestion, but we are still getting the same error:
```
C:\Users\Cognition\anaconda3\envs\ArielLital\python.exe "D:\Final Project\fMRI_Ariel_Lital\train.py"
Traceback (most recent call last):
  File "D:\Final Project\fMRI_Ariel_Lital\train.py", line 62, in <module>
Process finished with exit code 1
```
In addition, we couldn't understand from your answer why there isn't a contradiction. We are referring to this part of our previous comment: `embedded_cat` has 3 dimensions (batch_size, rows, columns) and `log_scale` has 3 dimensions (batch_size, 1, columns), so for `torch.cat` to work `static_real_features` must have shape (batch_size, n, columns); after concatenating these three variables `static_feat` has 3 dimensions, then after unsqueezing it has 4, and `expand` won't work.
Many thanks!!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.