wrong size of conv1d in CBHG, Post-processing net

Open msobhan69 opened this issue 7 years ago • 4 comments

According to Table 1 of the paper, the second layer of the Conv1D projections should have 80 units (see https://github.com/Kyubyong/tacotron/blob/master/networks.py#L94).

msobhan69, May 29 '17 08:05

Here. (networks.py; L112~)

    dec = conv1d(dec, hp.embed_size//2, 3, scope="conv1d_1") # (N, T', E)
    dec = normalize(dec, type="bn", is_training=is_training, activation_fn=tf.nn.relu)
    dec = conv1d(dec, hp.embed_size//2, 3, scope="conv1d_2") # (N, T', E)

I think the first call should be conv1d(dec, hp.embed_size, 3, scope="conv1d_1") and the second should be conv1d(dec, 80, 3, scope="conv1d_2").

tozangezan, Jun 06 '17 12:06

You guys are right. I've changed it. Thanks.

Kyubyong, Jun 06 '17 13:06

@Kyubyong The dimension is updated. However, this change leads to a new issue in the highway net:

outputs = H * T + inputs * C

The last dimensions of inputs and C do not match, so the element-wise product inputs * C fails. The inputs tensor has shape [N, T, W] with W=80, while the C tensor has shape [N, T, num_units] with num_units=128.

My suggestion is to add a dense layer before the highway layer. This dense layer would transform the inputs into a tensor of shape [N, T, num_units].
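
The mismatch and the proposed projection can be illustrated with a minimal pure-Python sketch that works on a single frame (one [W] slice of the [N, T, W] tensor). This is not the repository's TensorFlow code: the helpers `dense`, `init`, and `highway` are hypothetical stand-ins, and the sketch assumes the usual highway formulation in which C = 1 - T.

```python
import math
import random

def dense(x, weights, bias):
    """Affine map of one frame: len(x) == len(weights), output has len(bias) features."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
            for j in range(len(bias))]

def init(in_dim, out_dim, rng):
    """Hypothetical helper: small random weights and a zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(out_dim)]
               for _ in range(in_dim)]
    return weights, [0.0] * out_dim

def highway(x, num_units, rng):
    """One highway layer on a single frame x: outputs = H * T + inputs * C."""
    w_h, b_h = init(len(x), num_units, rng)
    w_t, b_t = init(len(x), num_units, rng)
    H = [max(0.0, v) for v in dense(x, w_h, b_h)]                 # ReLU transform
    T = [1.0 / (1.0 + math.exp(-v)) for v in dense(x, w_t, b_t)]  # sigmoid gate
    C = [1.0 - t for t in T]                                      # carry gate (assumed 1 - T)
    if len(x) != len(C):
        # The element-wise product inputs * C needs matching feature sizes.
        raise ValueError("inputs has %d features but C has %d" % (len(x), len(C)))
    return [h * t + xi * c for h, t, xi, c in zip(H, T, x, C)]

rng = random.Random(0)
frame = [rng.random() for _ in range(80)]  # W = 80, as after the conv1d change

mismatch = None
try:
    highway(frame, 128, rng)               # num_units = 128
except ValueError as err:
    mismatch = str(err)

# Suggested fix: project the 80 features to num_units before the highway layer.
w_p, b_p = init(80, 128, rng)
projected = dense(frame, w_p, b_p)
out = highway(projected, 128, rng)
```

With the extra projection, the input to the highway layer already has num_units features, so the residual-style term inputs * C is well defined.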

candlewill, Jun 06 '17 16:06

You're right, candlewill. But I don't see any particular reason to make things more complicated, so I'll just change the output units of the second conv1d layer to 128 for simplicity.

Kyubyong, Jun 06 '17 16:06