NaN loss for LearningShapelets with variable-length time series while training
Hi all, I am trying to train a LearningShapelets model with variable-length time series (see https://tslearn.readthedocs.io/en/latest/variablelength.html).
from tslearn.utils import to_time_series_dataset
from tslearn.shapelets import LearningShapelets
X = to_time_series_dataset([[1, 2, 3, 4], [1, 2, 3], [2, 5, 6, 7, 8, 9]])
y = [0, 0, 1]
clf = LearningShapelets(n_shapelets_per_size={3: 1}, verbose=1, max_iter=10)
clf.fit(X, y)
However, I find that the loss turns into 'nan' during training. Any idea why this is happening? Thank you.
Epoch 1/10
1/1 [==============================] - 0s 372ms/step - loss: 0.8146 - binary_accuracy: 0.6667 - binary_crossentropy: 0.8146
Epoch 2/10
1/1 [==============================] - 0s 997us/step - loss: nan - binary_accuracy: 0.6667 - binary_crossentropy: nan
Epoch 3/10
1/1 [==============================] - 0s 2ms/step - loss: nan - binary_accuracy: 0.6667 - binary_crossentropy: nan
Epoch 4/10
1/1 [==============================] - 0s 2ms/step - loss: nan - binary_accuracy: 0.6667 - binary_crossentropy: nan
Epoch 5/10
1/1 [==============================] - 0s 2ms/step - loss: nan - binary_accuracy: 0.6667 - binary_crossentropy: nan
Epoch 6/10
1/1 [==============================] - 0s 997us/step - loss: nan - binary_accuracy: 0.6667 - binary_crossentropy: nan
Epoch 7/10
1/1 [==============================] - 0s 997us/step - loss: nan - binary_accuracy: 0.6667 - binary_crossentropy: nan
Epoch 8/10
1/1 [==============================] - 0s 2ms/step - loss: nan - binary_accuracy: 0.6667 - binary_crossentropy: nan
Epoch 9/10
1/1 [==============================] - 0s 998us/step - loss: nan - binary_accuracy: 0.6667 - binary_crossentropy: nan
Epoch 10/10
1/1 [==============================] - 0s 2ms/step - loss: nan - binary_accuracy: 0.6667 - binary_crossentropy: nan
Hello,
Have you tried normalizing your data? These NaNs can be caused by exploding/vanishing gradients.
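For reference, a minimal sketch of what I mean, using tslearn's TimeSeriesScalerMinMax on your example data:

from tslearn.preprocessing import TimeSeriesScalerMinMax
from tslearn.utils import to_time_series_dataset

X = to_time_series_dataset([[1, 2, 3, 4], [1, 2, 3], [2, 5, 6, 7, 8, 9]])
# Rescale each series to [0, 1] before fitting the shapelet model.
X = TimeSeriesScalerMinMax().fit_transform(X)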
Thanks for the prompt reply.
I tried TimeSeriesScalerMinMax(), but it does not help.
I wonder if this is caused by to_time_series_dataset, which makes all the time series the same length by padding them with NaN.
X = to_time_series_dataset([[1, 2, 3, 4], [1, 2, 3], [2, 5, 6, 7, 8, 9]])
After this, X becomes:
array([[[ 1.], [ 2.], [ 3.], [ 4.], [nan], [nan]],
       [[ 1.], [ 2.], [ 3.], [nan], [nan], [nan]],
       [[ 2.], [ 5.], [ 6.], [ 7.], [ 8.], [ 9.]]])
Hi @Wwwwei
I guess that if the problem came from the padded NaNs, it would already occur in the first epoch, which is not the case, so exploding gradients are probably the cause of your problem.
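If you want to test that hypothesis, one option (a sketch, assuming LearningShapelets forwards a Keras optimizer instance passed via its optimizer parameter) is to clip gradients:

import tensorflow as tf
from tslearn.shapelets import LearningShapelets

# Clip the gradient norm so exploding gradients cannot blow up the weights.
clf = LearningShapelets(n_shapelets_per_size={3: 1},
                        optimizer=tf.keras.optimizers.Adam(clipnorm=1.0),
                        max_iter=10,
                        verbose=1)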
Hello @rtavenar, I appreciate your response.
I tried gradient clipping as you suggested, but the bug remains.
When I change all the time series to a fixed length, or fill the NaNs produced by to_time_series_dataset with zeros, the model works, just like the simple demo in https://tslearn.readthedocs.io/en/latest/variablelength.html. Maybe something goes wrong when handling variable-length time series?
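For anyone hitting this, the zero-filling workaround looks roughly like this (a sketch; np.nan_to_num replaces the padded NaNs with 0.0, which changes the series content, so it is a workaround rather than a fix):

import numpy as np
from tslearn.utils import to_time_series_dataset

X = to_time_series_dataset([[1, 2, 3, 4], [1, 2, 3], [2, 5, 6, 7, 8, 9]])
# Replace the NaN padding with zeros so no NaN reaches the gradient computation.
X = np.nan_to_num(X)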
Hi @GillesVandewiele @rtavenar, sorry to bother you again. I seem to have found the reason.
# source code in shapelets.py
class LocalSquaredDistanceLayer(Layer):
    # ...
    def call(self, x, **kwargs):
        # (x - y)^2 = x^2 + y^2 - 2 * x * y
        x_sq = K.expand_dims(K.sum(x ** 2, axis=2), axis=-1)
        y_sq = K.reshape(K.sum(self.kernel ** 2, axis=1),
                         (1, 1, self.n_shapelets))
        xy = K.dot(x, K.transpose(self.kernel))
        return (x_sq + y_sq - 2 * xy) / K.int_shape(self.kernel)[1]
If we take the derivative of (x - y)^2 with respect to y (i.e., our shapelet variable), we get d(x - y)^2/dy = 2y - 2x. So when there is a NaN in x (i.e., the input data), the gradient will also be NaN. That is why the first epoch works but the following ones do not: the first backpropagation turns the gradient, and hence the shapelet weights, into NaN. Is my understanding correct?
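A tiny numeric check of this, in plain NumPy, just to illustrate the propagation:

import numpy as np

x = np.array([1.0, 2.0, np.nan])   # input window containing a padded NaN
y = np.array([0.5, 0.5, 0.5])      # shapelet values

grad = 2 * (y - x)                 # d/dy (x - y)^2
print(grad)                        # [-1. -3. nan] -> the NaN reaches the update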
Hey @GillesVandewiele @rtavenar, I was trying out LearningShapelets on variable-length time series data and ran into this error too. I got NaNs right from the first epoch. I used TimeSeriesScalerMinMax() to normalize the data and it didn't make a difference.
I also ran the unit test for variable length, and the results are NaN from the second epoch. I added normalization there too and still got the same.
# Test variable-length
from tslearn.utils import to_time_series_dataset
from tslearn.preprocessing import TimeSeriesScalerMinMax
from tslearn.shapelets import LearningShapelets

y = [0, 1]
time_series = to_time_series_dataset([[1, 2, 3, 4, 5], [3, 2, 1]])
time_series = TimeSeriesScalerMinMax().fit_transform(time_series)
clf = LearningShapelets(n_shapelets_per_size={3: 1},
                        max_iter=5,
                        verbose=1,
                        random_state=0)
clf.fit(time_series, y)
Output -
Epoch 1/5
1/1 [==============================] - 1s 572ms/step - loss: 0.6930 - binary_accuracy: 0.5000 - binary_crossentropy: 0.6930
Epoch 2/5
1/1 [==============================] - 0s 5ms/step - loss: nan - binary_accuracy: 0.5000 - binary_crossentropy: nan
Epoch 3/5
1/1 [==============================] - 0s 5ms/step - loss: nan - binary_accuracy: 0.5000 - binary_crossentropy: nan
Epoch 4/5
1/1 [==============================] - 0s 5ms/step - loss: nan - binary_accuracy: 0.5000 - binary_crossentropy: nan
Epoch 5/5
1/1 [==============================] - 0s 5ms/step - loss: nan - binary_accuracy: 0.5000 - binary_crossentropy: nan
Do you think @Wwwwei's explanation above is correct? Even if the actual issue turns out to be different, I think it should be addressed, even if only as a note on the possible causes. If I can get some pointers, I can look into it further and put together a PR.
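One direction I could explore for a PR (a sketch, not tslearn's actual code): mask the per-timestep distances coming from padded windows before the min-pooling, e.g. with tf.where. Note that gradients flowing through tf.where can still pick up NaNs from the untaken branch, so a real fix may need to sanitize the inputs instead:

import numpy as np
import tensorflow as tf

# Toy distances as produced per timestep, with a NaN from a padded window
# (shape: batch x n_timesteps x n_shapelets).
dists = tf.constant([[[0.5], [np.nan], [0.2]]])

# Replace NaN distances with +inf so they never win the min over time.
masked = tf.where(tf.math.is_nan(dists), tf.fill(tf.shape(dists), np.inf), dists)
print(tf.reduce_min(masked, axis=1))  # [[0.2]]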