tensorflow-wavenet

Is the architecture computing a bunch of values that it will never use?

Open andimarafioti opened this issue 8 years ago • 3 comments

Hi! I've been trying to understand the network, and there is one point I can't figure out. I posted it as a question on Stack Exchange. Feel free to answer over there, or here, if you understand my question and would like to contribute. Thanks in advance!

andimarafioti avatar Nov 06 '17 12:11 andimarafioti

I am also trying to figure this out. My guess at the moment is that after the dilated stack you only need the last value in the sequence, since that is the prediction for sample T+1. Within the dilated stack they keep the same time dimension, because after one series of dilations the next series starts again from the full set of values. Have you figured it out already?
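To make the "only the last value matters" intuition concrete, here is a minimal sketch (not the repo's code; the shapes are invented) of how only the final time step of the network output would be used when generating the next audio sample:

```python
import tensorflow as tf

# Hypothetical output of the network over a window of audio:
# [batch, time, quantization_channels]
logits = tf.zeros([1, 745, 256])

# During autoregressive generation only the last time step is needed,
# since it is the prediction for sample T+1.
last_step = logits[:, -1, :]                               # [batch, quantization_channels]
next_sample = tf.random.categorical(last_step, num_samples=1)
```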

jbachh avatar Feb 13 '18 10:02 jbachh

Well, if you look at the documentation in the code (https://github.com/ibab/tensorflow-wavenet/blob/e11ad19dccd6a33a182c9d1dea07aa53b9acca55/wavenet/model.py#L245), it explicitly says that they use two different 1x1 convolutions: one goes to the skip connection and the other to the next dilated layer. Then not every layer needs the same number of inputs and outputs, and my original question is answered: the diagram in the paper is (probably) misleading. I say probably because the implementation here is not an official one.
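For reference, here is a hedged sketch of that structure (written with tf.keras rather than the repo's low-level ops, and with made-up channel sizes): the gated dilated convolution feeds two separate 1x1 convolutions, one producing the skip contribution and one producing the residual output that is added to the block input and passed on to the next dilated layer.

```python
import tensorflow as tf

def residual_block(x, residual_channels=32, skip_channels=64,
                   dilation_rate=1, kernel_size=2):
    """One WaveNet-style residual block (illustrative sketch only)."""
    # Gated activation unit: tanh(filter) * sigmoid(gate)
    conv_filter = tf.keras.layers.Conv1D(
        residual_channels, kernel_size, dilation_rate=dilation_rate,
        padding='causal', activation='tanh')(x)
    conv_gate = tf.keras.layers.Conv1D(
        residual_channels, kernel_size, dilation_rate=dilation_rate,
        padding='causal', activation='sigmoid')(x)
    gated = conv_filter * conv_gate

    # Two independent 1x1 convolutions on the same gated output:
    skip = tf.keras.layers.Conv1D(skip_channels, 1)(gated)          # -> skip connections
    residual = tf.keras.layers.Conv1D(residual_channels, 1)(gated)  # -> next dilated layer

    # Residual sum requires x to have residual_channels channels.
    return x + residual, skip

# Usage sketch:
x = tf.zeros([1, 1000, 32])            # [batch, time, residual_channels]
x, skip = residual_block(x, dilation_rate=2)
```

Because the two 1x1 convolutions are independent, the skip path and the residual path can use different channel counts, which is why not every layer needs the same number of inputs and outputs.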

andimarafioti avatar Feb 13 '18 12:02 andimarafioti

So in https://github.com/ibab/tensorflow-wavenet/blob/3c973c038c8c2c20fef0039f111cb04139ff594b/wavenet/model.py#L333-L336 we see that the input_batch is sliced before being summed with the transformed data. In this way, the residual connection carries only half of the values forward (the last half), which is weird. Any ideas why it is done like that? I couldn't find it in the paper.
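For anyone else reading along, here is a minimal sketch of what that slice does (the shapes are invented, and this is not the repo's exact code): the dilated causal convolutions run without padding, so the transformed output is shorter in time than the input, and the input is trimmed to its last output_width samples before the residual sum.

```python
import tensorflow as tf

input_batch = tf.zeros([1, 1000, 32])   # [batch, time, channels] before the dilated conv
transformed = tf.zeros([1, 745, 32])    # shorter in time after the "valid" dilated conv

# Amount the time axis shrank; keep only the LAST output_width samples
# of the input so the two tensors line up in time for the residual sum.
input_cut = tf.shape(input_batch)[1] - tf.shape(transformed)[1]
input_sliced = tf.slice(input_batch, [0, input_cut, 0], [-1, -1, -1])

residual_out = input_sliced + transformed
```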

andimarafioti avatar Jul 24 '18 07:07 andimarafioti