tacotron
wrong size of conv1d in CBHG, Post-processing net
According to Table 1 of the paper, the size of the second layer of the Conv1D projections should be 80. See https://github.com/Kyubyong/tacotron/blob/master/networks.py#L94
Here. (networks.py; L112~)
dec = conv1d(dec, hp.embed_size//2, 3, scope="conv1d_1") # (N, T', E)
dec = normalize(dec, type="bn", is_training=is_training, activation_fn=tf.nn.relu)
dec = conv1d(dec, hp.embed_size//2, 3, scope="conv1d_2") # (N, T', E)
I think the first call should be conv1d(dec, hp.embed_size, 3, scope="conv1d_1") and the second should be conv1d(dec, 80, 3, scope="conv1d_2").
You guys are right. I've changed it. Thanks.
@Kyubyong The dimension is updated. However, this change leads to a new issue in the highway net:
outputs = H * T + inputs * C
The dimension of inputs and C does not match. The inputs tensor has shape [N, T, W], W=80, while the C tensor has shape [N, T, num_units], num_units=128.
My suggestion is to add a dense layer before the highway layer. This dense layer transforms the inputs into a tensor of shape [N, T, num_units].
You're right, candlewill. But I don't see any particular reason why we should make things complicated, so I'll just change the output units of the second conv1d layer to 128 for simplicity.
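For anyone following along, here is a minimal NumPy sketch of the shape problem discussed above. It is not the repository's TensorFlow code: the helper names (dense, highway), the random weights, and the -1 bias on the transform gate are assumptions made for illustration. It shows that outputs = H * T + inputs * C fails when inputs has 80 channels and the gates have 128, and that a dense projection to num_units in front of the highway layer makes the shapes agree.

```python
import numpy as np

def dense(x, weights, bias):
    """Dense layer on the last axis: (N, T, in_dim) -> (N, T, out_dim)."""
    return x @ weights + bias

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway(inputs, num_units, rng):
    """One highway layer with illustrative random weights (hypothetical helper)."""
    in_dim = inputs.shape[-1]
    w_h = rng.standard_normal((in_dim, num_units)) * 0.1
    w_t = rng.standard_normal((in_dim, num_units)) * 0.1
    H = np.maximum(dense(inputs, w_h, np.zeros(num_units)), 0.0)  # ReLU branch
    T = sigmoid(dense(inputs, w_t, np.full(num_units, -1.0)))     # transform gate (bias assumed)
    C = 1.0 - T                                                   # carry gate
    # The elementwise products require inputs to have num_units channels.
    return H * T + inputs * C

rng = np.random.default_rng(0)
N, T_len, W, num_units = 2, 5, 80, 128
inputs = rng.standard_normal((N, T_len, W))  # output of an 80-unit conv1d

# Feeding the 80-channel tensor directly: inputs * C cannot be broadcast.
try:
    highway(inputs, num_units, rng)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Suggested fix: project inputs to num_units channels before the highway layer.
w_proj = rng.standard_normal((W, num_units)) * 0.1
projected = dense(inputs, w_proj, np.zeros(num_units))
outputs = highway(projected, num_units, rng)
```

Setting the second conv1d layer to 128 output units, as chosen in the last reply, has the same effect on shapes: the tensor reaching the highway layer already has num_units channels, so no extra projection is needed.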