
Dilated Causal Convolution is implemented differently from the paper

KyungsuKim42 opened this issue 8 years ago · 2 comments

According to the following visualization of the dilated causal convolutional layers, I think the stride of the conv1d (after time_to_batch) should be 2 (equal to the dilation), so that only the minimum number of nodes needed to cover the receptive field is used. However, in this implementation the stride is set to 1. I can see how this could still work, since the network will extract suitable features for the generative model either way. What I want to know is whether this is intentional, and if so, what the reason for the change is.

[Figure: stack of dilated causal convolution layers, as visualized in the WaveNet paper]
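For concreteness, here is a small sketch (my own, not from the repository; it assumes kernel size 2 and dilations doubling per layer, as the figure suggests) of the receptive-field arithmetic behind "minimum number of nodes":

```python
# Receptive field of a stack of kernel-size-2 dilated causal conv layers with
# dilations 1, 2, 4, 8: each layer adds (kernel_size - 1) * dilation samples
# of history, so the top node's receptive field just covers the whole window.
kernel_size = 2
dilations = [1, 2, 4, 8]
receptive_field = 1 + sum((kernel_size - 1) * d for d in dilations)
print(receptive_field)  # 16: the output at time t depends on inputs t-15 .. t
```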

KyungsuKim42 · Aug 14 '17 04:08

I'm not sure I follow you. You think the stride should be 2, but in the implementation it is set to 1?

andimarafioti · Sep 28 '17 11:09

@godelicbach The stride should be 1 for the conv1d (after time_to_batch), because the time_to_batch reshape already arranges the data into dilated form for conv1d. Check the following NumPy example, which performs the same reshape as time_to_batch (the channel dimension is ignored):

```python
import numpy as np

dilation = 2
batch = 1
value_len = 8

arr = np.arange(batch * value_len).reshape((batch, value_len))
# arr: [[0, 1, 2, 3, 4, 5, 6, 7]]

arr_r = arr.reshape([-1, dilation])
# arr_r: [[0, 1], [2, 3], [4, 5], [6, 7]]

arr_t = arr_r.transpose()
# arr_t: [[0, 2, 4, 6], [1, 3, 5, 7]]

time_to_batch_output = arr_t.reshape([dilation * batch, -1])
# time_to_batch_output: [[0, 2, 4, 6], [1, 3, 5, 7]]
```
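To make the equivalence concrete, here is a self-contained NumPy sketch (my own code, not from the repository; the 2-tap kernel and function names are illustrative) showing that an ordinary stride-1 causal convolution applied after this reshape computes exactly the dilated causal convolution on the original sequence:

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """Direct form: out[t] = sum_k w[k] * x[t - k*dilation], zeros before t=0."""
    pad = (len(w) - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[k] * xp[pad + t - k * dilation] for k in range(len(w)))
                     for t in range(len(x))])

def conv_via_time_to_batch(x, w, dilation):
    """time_to_batch form: reshape, then a plain stride-1 causal conv per row.

    Assumes len(x) is a multiple of dilation (the real time_to_batch pads
    the input to guarantee this)."""
    sub = x.reshape(-1, dilation).T          # row i holds x[i], x[i+dilation], ...
    pad = len(w) - 1
    sub_p = np.pad(sub, [(0, 0), (pad, 0)])  # causal left-padding per row
    out = np.stack([
        np.array([sum(w[k] * row[pad + t - k] for k in range(len(w)))
                  for t in range(sub.shape[1])])
        for row in sub_p
    ])
    return out.T.reshape(-1)                 # batch_to_time: re-interleave rows

x = np.arange(8, dtype=float)
w = np.array([1.0, 10.0])  # w[0]: current sample, w[1]: one dilation step back
print(dilated_causal_conv(x, w, dilation=2))    # [ 0.  1.  2. 13. 24. 35. 46. 57.]
print(conv_via_time_to_batch(x, w, dilation=2)) # [ 0.  1.  2. 13. 24. 35. 46. 57.]
```

Both paths print the same output: each row after the reshape is one interleaved sub-sequence, so a stride-1 convolution over it already hops by `dilation` steps in the original time axis, which is why stride 1 is the right choice after time_to_batch.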

blues-lin · Feb 07 '18 01:02