rnnoise
my_crossentropy issue
Hi, I have a question about the my_crossentropy loss, which is defined as follows. When I look at the documented usage of K.binary_crossentropy, I find that y_true should be the first argument. Could anyone explain this? def my_crossentropy(y_true, y_pred): return K.mean(2*K.abs(y_true-0.5) * K.binary_crossentropy(y_pred, y_true), axis=-1)
Usage from the guide: keras.losses.binary_crossentropy(y_true, y_pred, from_logits=False, label_smoothing=0) and tf.keras.backend.binary_crossentropy(target, output, from_logits=False)
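For reference, a sketch of what the loss would look like with the documented argument order (y_true passed as the target, y_pred as the output); this is just the fix being discussed, not necessarily what the repository intended:

from keras import backend as K

def my_crossentropy(y_true, y_pred):
    # 2*|y_true - 0.5| weights hard 0/1 labels fully and zeroes out labels of
    # exactly 0.5, then binary crossentropy is applied with the documented
    # order: target first, model output second.
    return K.mean(2 * K.abs(y_true - 0.5) * K.binary_crossentropy(y_true, y_pred), axis=-1)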
I created a plot to show the difference. The blue and orange lines show the result of the correct usage; the red and green ones show the RNNoise behaviour.
You're probably right that this wasn't intended.
Hi Zadagu, I have one more question. I see that -1 can be assigned to the label g vector. In that case, is the behaviour in the mycost loss correct, where K.sqrt(y_true) is calculated? There is a risk that K.sqrt(-1) gets computed, which should give a NaN, but when I run training, no NaN seems to occur.
The gain of -1 is set if there is no (or only low) signal energy in either the clean speech or the noise. In that case it doesn't matter which gain is applied, because the signal isn't audible. There are also other cases where a gain of -1 is set; you can look them up in denoise.c.
In training, the gains of -1 are ignored by multiplying the loss per gain with mymask(ground_truth_gains). For every gain in the range between 0 and 1, mymask returns 1; for every -1, it returns 0. Multiplying this mask into the per-gain loss means all gains of -1 are ignored.
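Roughly, the masking can be sketched like this (paraphrased, so treat the exact expressions as an illustration rather than the canonical rnn_train.py code; the loss-function name below is hypothetical). Note that K.sqrt in the TensorFlow Keras backend clips negative inputs to 0 before taking the square root, which is also why K.sqrt(-1.) does not produce NaN here:

from keras import backend as K

def mymask(y_true):
    # 1 for ground-truth gains in [0, 1], 0 for the -1 "ignore" flag
    return K.minimum(y_true + 1., 1.)

def masked_gain_loss(y_true, y_pred):  # hypothetical name for a gain loss term
    # Squared error between sqrt-compressed gains, zeroed out wherever the
    # ground-truth gain is -1 (K.sqrt clamps the -1 to 0, so no NaN appears).
    return K.mean(mymask(y_true) * K.square(K.sqrt(y_pred) - K.sqrt(y_true)), axis=-1)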
Thank you, Zadagu. I find that in the rnn_data.c given in the repo, the activation in the GRU is relu, while in the training model the activation function is tanh. Does that mean relu has better performance? Also, may I ask how many hours of training data you used and the best loss you could reach?
These questions are already answered.
Tanh vs ReLU: https://github.com/xiph/rnnoise/issues/58 https://github.com/xiph/rnnoise/issues/79
For the amount of training data see: https://jmvalin.ca/papers/rnnoise_mmsp2018.pdf
Thanks. I checked the loss in the training code in detail. It seems that using binary_crossentropy that way somewhat boosts the VAD loss. In my training test, the whole loss can hardly get below 0.9. Is this a good loss, or should I change the code of my_crossentropy?
I followed up and changed my_crossentropy() to pass y_true as the first argument and y_pred as the second: def my_crossentropy(y_true, y_pred): return K.mean(2*K.abs(y_true-0.5) * K.binary_crossentropy(y_true, y_pred), axis=-1),
but then when I run my ./rnn_train.py script, all the losses are nan: 2016/22500 [=>............................] - ETA: 20:30 - loss: nan - denoise_output_loss: nan - vad_output_loss: nan - denoise_output_msse: nan - vad_output_msse: nan
Any insights would be appreciated... what could be going wrong? On the other hand, if I keep y_pred as the first argument to the Keras binary crossentropy, I do see loss values as the iterations start (higher with RELU and lower with TANH), but I still tend to get nan values every epoch...
Yes, I also hit the nan issue when I use relu, but there is no nan when I use tanh. I tried several things (reducing the lr, etc.), but it wasn't resolved. As far as I can tell, the gradient suddenly goes to nan at some iteration, although the previous gradients are fine. The nan can show up at a random point, even when the data isn't shuffled. After a few tries I changed tf to version 2.1, and there is nan even when I use relu. So far I still don't know the root cause.
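Not from the rnnoise repo, just a general mitigation: one common way to avoid NaN in binary crossentropy is to clip the predictions away from exactly 0 and 1 before the log is taken, for example:

from keras import backend as K

def my_crossentropy(y_true, y_pred):
    # Keep predictions strictly inside (0, 1) so log(0) can never occur;
    # K.epsilon() is about 1e-7 by default.
    y_pred = K.clip(y_pred, K.epsilon(), 1. - K.epsilon())
    return K.mean(2 * K.abs(y_true - 0.5) * K.binary_crossentropy(y_true, y_pred), axis=-1)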
Hi Zadagu, I still have a question about the gain. The ground-truth output contains gains of -1, but the activation function of denoise_output is sigmoid, which ranges from 0 to 1. That means gain_pred can only be trained into (0, 1) while gain_true is actually -1; would that affect the denoised result?
No, it doesn't affect the result. The -1 is just a hack/flag to mark values that shouldn't influence the error.
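As a quick sanity check (a hypothetical snippet, not from the repo), you can confirm that a -1 target contributes nothing to the masked loss no matter what the sigmoid predicts:

import numpy as np
from keras import backend as K

def mymask(y_true):
    # same masking idea as sketched above
    return K.minimum(y_true + 1., 1.)

y_true = K.constant(np.array([[0.7, -1.0]]))  # second band carries the -1 "ignore" flag
y_pred = K.constant(np.array([[0.2,  0.9]]))  # sigmoid outputs are always in (0, 1)
loss = K.mean(mymask(y_true) * K.square(K.sqrt(y_pred) - K.sqrt(y_true)), axis=-1)
print(K.eval(loss))  # only the first band contributes; the -1 band is zeroed out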