TypeError: __init__() got an unexpected keyword argument 'custom_head' when using some Plus models

Open strakehyr opened this issue 3 years ago • 6 comments

Hi there, I'm running into TypeError: __init__() got an unexpected keyword argument 'custom_head' when running LSTM_FCNPlus and MLSTM_FCNPlus with multivariate time-series. AFAIK, according to #174 and #303, I should be able to use it for multivariate problems.

Relevant code being:

X, y = SlidingWindow(look_back, get_x = list(df_rolled.iloc[:,n_outputs:].columns), get_y = (df_rolled.iloc[:,n_outputs-1].name), horizon=horizon)(df_rolled)
y = y.reshape(y.shape[0], 1, y.shape[1])
splits = get_splits(y, n_splits=1, valid_size=1-train_ratio, shuffle=False)
check_data(X, y, splits)
tfms  = [None, [TSRegression()]]
batch_tfms = TSStandardize(by_var=True)
#device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
dls = get_ts_dls(X, y, splits=splits, tfms=tfms, batch_tfms=batch_tfms, device = 'cuda')

learn = ts_learner(dls, MLSTM_FCNPlus, rnn_layers = rnn_layers, hidden_size = hidden_size, kss = kss,  
                   bidirectional = bidirectional, conv_layers = conv_layers, fc_dropout = fc_dropout,
                   opt_func = SGD, loss_func = mse, 
                   metrics=[mae, mse], cbs=cbs)

I am still on version 0.2.24, but AFAIK that shouldn't be an issue.

strakehyr avatar Feb 23 '22 11:02 strakehyr

Hi @strakehyr, I'm not sure what the issue is, but it seems to be related to the data. I've tried similar code with a multivariate time series and it works as expected:

X, y, splits = get_regression_data('AppliancesEnergy', split_data=False)
tfms  = [None, [TSRegression()]]
batch_tfms = TSStandardize(by_var=True)
dls = get_ts_dls(X, y, splits=splits, tfms=tfms, batch_tfms=batch_tfms)
learn = ts_learner(dls, MLSTM_FCNPlus, opt_func=SGD, loss_func=mse, metrics=[mae, mse])
learn.fit_one_cycle(1)

oguiza avatar Feb 23 '22 13:02 oguiza

Weird.. all my data is float64 and without NaNs or anything weird. On top of that, I have used this data with other models (TSTPlus for instance), and it works fine.

strakehyr avatar Feb 23 '22 13:02 strakehyr

While TSTPlus and other models support a y shape of [samples, no. of targets, horizon], MLSTM_FCNPlus and LSTM_FCNPlus apparently do not. Even for a y of shape [samples, horizon] (single target), which is my case here, I get the same error. Correct me if I'm wrong, but apparently these models cannot work with a y that is not 1-dimensional.
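
For reference, the two target layouts mentioned above can be sketched with plain NumPy. The sample count, horizon, and variable names are made up for illustration; only the reshape mirrors the call in the original snippet.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
n_samples, horizon = 500, 24

# Single target: y has shape [samples, horizon]
y_2d = np.zeros((n_samples, horizon), dtype=np.float32)

# Same data with an explicit target axis: [samples, no. of targets, horizon]
y_3d = y_2d.reshape(y_2d.shape[0], 1, y_2d.shape[1])

print(y_2d.shape, y_2d.ndim)  # (500, 24) 2
print(y_3d.shape, y_3d.ndim)  # (500, 1, 24) 3
```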

strakehyr avatar Feb 23 '22 14:02 strakehyr

Hi @strakehyr, I've just realized that your y.shape is 3d. This is indeed an issue with RNN_FCN models (like LSTM_FCNPlus and MLSTM_FCNPlus). Sorry about the misunderstanding. I've just fixed it, so it should work now. It'd be good if you could test it and confirm it works as expected.

oguiza avatar Feb 25 '22 10:02 oguiza

After attempting to run it again I get: RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x328 and 125952x192) where 192 is my sequence length.
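
The message means the two inner dimensions of a matrix product disagree: the batch reaching the linear layer has 328 features per sample, while the layer's weight expects 125952. A minimal pure-Python sketch of that rule follows; the helper `matmul_shape` is hypothetical and not part of PyTorch or tsai.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or raise if the inner dimensions differ."""
    (m, k1), (k2, n) = a_shape, b_shape
    if k1 != k2:
        raise ValueError(f"shapes cannot be multiplied ({m}x{k1} and {k2}x{n})")
    return (m, n)

# Shapes taken from the error message above
batch_shape = (64, 328)
weight_shape = (125952, 192)

try:
    matmul_shape(batch_shape, weight_shape)
except ValueError as err:
    print(err)  # shapes cannot be multiplied (64x328 and 125952x192)

# 125952 is an exact multiple of 328
print(divmod(125952, 328))  # (384, 0)
```

Since 125952 = 328 × 384, the layer may have been sized for a flattened input rather than the 328 features it actually received, though nothing in this thread confirms the cause.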

strakehyr avatar Mar 11 '22 14:03 strakehyr

I got the same error.

xinlnix avatar Aug 29 '22 14:08 xinlnix

Hi @strakehyr, @xinlnix, I'm sorry for the very late reply. I was reviewing open bugs when I discovered this long-standing one. If you use a 3d target, you'll need to pass a 3d custom head (the process to recognize the output shape is not automated). The process is quite convoluted, but it works. I'll simulate a 3d target in the following way:

X, y, splits = get_regression_data('AppliancesEnergy', split_data=False)
y = y.reshape(y.shape[0], 1, 1).repeat(2, 1).repeat(3, 2)
y.shape

output: (137, 2, 3)

To train with this 3d target, you should:

  1. Find the output size of the backbone:
tfms  = [None, [TSRegression()]]
batch_tfms = TSStandardize(by_var=True)
dls = get_ts_dls(X, y, splits=splits, tfms=tfms, batch_tfms=batch_tfms)
learn = ts_learner(dls, MLSTM_FCNPlus, opt_func=SGD, loss_func=MSELossFlat(), metrics=[mae, mse])
xb, yb = dls.train.one_batch()
learn.model.backbone(xb).shape

output: torch.Size([64, 228]) # 228 is the number you are looking for.

  2. Create a custom head of the required shape. In this example, I've picked a lin_3d_head, but there are multiple types, or you can build your own.

custom_head = lin_3d_head(228, dls.c, 1, dls.d)
tfms  = [None, [TSRegression()]]
batch_tfms = TSStandardize(by_var=True)
dls = get_ts_dls(X, y, splits=splits, tfms=tfms, batch_tfms=batch_tfms)
learn = ts_learner(dls, MLSTM_FCNPlus, custom_head=custom_head, opt_func=SGD, loss_func=MSELossFlat(), metrics=[mae, mse])
learn.fit_one_cycle(1)
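
To make the shape requirement concrete, here is a NumPy stand-in for what a 3d linear head has to do: map each 228-feature backbone output to a [2, 3] target. This only illustrates the shapes, using the sizes from the example above. It is not tsai's lin_3d_head implementation, and the names `W`, `b`, and `toy_3d_head` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

bs, head_nf = 64, 228        # batch size and backbone output size from step 1
n_targets, horizon = 2, 3    # per-sample target shape, as in the simulated y

# Hypothetical weights of a single linear layer producing n_targets * horizon values
W = rng.standard_normal((head_nf, n_targets * horizon)).astype(np.float32)
b = np.zeros(n_targets * horizon, dtype=np.float32)

def toy_3d_head(features):
    """Project [bs, head_nf] features and reshape them to [bs, n_targets, horizon]."""
    flat = features @ W + b
    return flat.reshape(features.shape[0], n_targets, horizon)

# Random stand-in for the backbone output of one batch
features = rng.standard_normal((bs, head_nf)).astype(np.float32)
preds = toy_3d_head(features)
print(preds.shape)  # (64, 2, 3)
```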

If anybody tests this, please let me know if it works.

oguiza avatar Dec 06 '22 10:12 oguiza