web-traffic-forecasting
How do separate encoder/decoder parameters handle the accumulating noise?
WaveNet was trained using next step prediction, so errors can accumulate as the model generates long sequences in the absence of conditioning information. To remedy this, we trained the model to minimize the loss when unraveled for 64 steps. We adopt a sequence to sequence approach where the encoder and decoder do not share parameters. This allows the decoder to handle the accumulating noise when generating long sequences.
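The accumulation problem the quoted passage describes can be reproduced with a toy one-step model. Everything below (the deterministic AR(1)-style series, the slightly mis-fit coefficient `phi_hat`, the horizon of 64) is illustrative only, not the repo's actual model:

```python
# Deterministic toy series x_t = 0.95 * x_{t-1} + 0.05 (converges to 1.0).
series = [0.0]
for _ in range(200):
    series.append(0.95 * series[-1] + 0.05)

# A slightly mis-fit "next step prediction" model: phi_hat is off by 0.05.
phi_hat = 0.90

# Generate 64 steps autoregressively: the model consumes its own output
# with no conditioning information, so the small one-step bias compounds.
pred = series[100]
errors = []
for t in range(101, 165):
    pred = phi_hat * pred
    errors.append(abs(pred - series[t]))

# errors[0] is small (~0.1) while errors[-1] approaches ~1.0:
# the per-step error accumulates as generation proceeds.
```

Minimizing the loss over all 64 unrolled steps (rather than a single step) forces the parameters that generate long sequences to be trained on exactly this compounded-error regime.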
The passage above says that with separate parameters the accumulating noise is not a big issue, but doesn't the encoder part still accumulate noise and pass it to the decoder part? I think I may be missing something; could you please explain the picture in more detail?
I think the author is mainly focusing on the encoder-decoder framework: the encoder processes the whole observed sequence, then decoding starts from the final encoder step (the conv_inputs at index self.encode_len - dilation - 1) and generates one step at a time. So the encoder's shared parameters focus on extracting general time-series features from clean inputs, while the decoder's separate parameters focus on correcting the error that builds up when it feeds back its own predictions. That is what makes the parameter separation reasonable and what deals with the accumulating noise.
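A minimal sketch of that division of labor, assuming a drastically simplified stand-in for the repo's model: a mean-pool plays the role of the dilated-convolution encoder, and the decoder weights `w_prev` and `w_state` are made-up illustrative values, not learned parameters.

```python
ENCODE_LEN, DECODE_LEN = 32, 8

# Observed history: the encoder only ever sees these clean inputs.
history = [t / (ENCODE_LEN - 1) for t in range(ENCODE_LEN)]  # 0.0 .. 1.0

# Separate toy parameter sets.
state = sum(history) / ENCODE_LEN   # "encoder": mean-pools the history
w_prev, w_state = 0.5, 0.5          # "decoder" weights

# Decode one step at a time, seeded by the final observed value.
prev, preds = history[-1], []
for _ in range(DECODE_LEN):
    # Only the decoder's parameters ever process self-generated inputs,
    # so they alone can learn to compensate for accumulating noise.
    prev = w_prev * prev + w_state * state
    preds.append(prev)
```

The key point the sketch illustrates: the encoder's inputs are always ground-truth observations, so it never sees its own noise; only the decoder loop feeds predictions back in, which is why giving it its own parameters lets it specialize in handling that noise.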