Kyubyong Park
@candlewill Did you find out why the multi-GPU version is slower than the single-GPU one? For me, the former is definitely way faster than the latter.
I don't know, honestly. Does the original paper mention anything about it?
@msobhan69 Thanks. I believe what the paper said is true, but I don't know if it means Tacotron can generate samples real-time.
I apologise for this, guys. I don't know why the Dropbox links stopped working, but anyway, I've created new links. Check them out.
Preprocessing might be a solution. Save the inputs to disk as numpy arrays, or adjust the hyperparameters.
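A minimal sketch of that caching idea, assuming a hypothetical `compute_feature` stand-in for the real STFT pipeline (this is not the repo's actual preprocessing code): compute the features once, save them with `np.save`, and have the training loop just `np.load` them.

```python
import os
import tempfile

import numpy as np

def compute_feature(signal, n_fft=512):
    # Stand-in for the real feature extraction: magnitude of an FFT.
    # The point is only that this step is expensive and deterministic,
    # so it should run once, not on every training step.
    return np.abs(np.fft.rfft(signal, n=n_fft)).astype(np.float32)

out_dir = tempfile.mkdtemp()
signal = np.sin(np.linspace(0.0, 100.0, 16000))       # fake audio clip
feat = compute_feature(signal)
np.save(os.path.join(out_dir, "clip0.npy"), feat)     # cache to disk

# Training time: a cheap load instead of recomputation.
loaded = np.load(os.path.join(out_dir, "clip0.npy"))
assert np.allclose(feat, loaded)
```

The same pattern works for any per-utterance feature (spectrograms, mel spectrograms): the disk read is far cheaper than redoing the signal processing inside the input pipeline.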
You guys are right. I've changed it. Thanks.
You're right candlewill. But I don't see any particular reason why we should make things complicated, so I'll just change the output units of the second conv1d layer to 128...
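For what "changing the output units to 128" amounts to, here is a shape-level sketch of a 1-D convolution in plain numpy (illustrative only; the repo uses TensorFlow, and the layer sizes here are just the ones under discussion): setting the second conv1d's output units to 128 means its weight tensor has 128 output channels, so its output is `(time, 128)`.

```python
import numpy as np

def conv1d(x, w):
    # x: (T, C_in), w: (k, C_in, C_out); 'same' padding, stride 1.
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, k - 1 - pad), (0, 0)))
    # For each time step, multiply the k-wide window against the kernel
    # and sum over width and input channels, leaving C_out values.
    return np.stack(
        [(xp[t:t + k, :, None] * w).sum(axis=(0, 1)) for t in range(x.shape[0])]
    )

x = np.random.randn(50, 256).astype(np.float32)    # (time, in_channels)
w = np.random.randn(3, 256, 128).astype(np.float32)  # 128 output units
y = conv1d(x, w)
assert y.shape == (50, 128)
```

So the change is purely about the channel dimension of one layer; nothing downstream needs to care beyond expecting 128-channel input.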
@ggsonic Nice work! If you could share the training time or curves as well as your modified code, it would be appreciated. Plus, instance normalization instead of batch normalization... interesting. Is anyone...
@ggsonic Thanks! I guess you're right. I've changed the `reduce_frames` and adjusted the other relevant parts.
Technically speaking, the mel scale is not exactly the same as log. See https://en.wikipedia.org/wiki/Mel_scale. The paper says they use a mel spectrogram and a linear-scale log-magnitude spectrogram. So `spectrogram2wav` converts the predicted magnitude...
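To make the mel-vs-log distinction concrete, here is the standard O'Shaughnessy conversion from the linked Wikipedia page (function names are mine, not from the repo): mel is a *warped* frequency axis, `m = 2595 * log10(1 + f/700)`, which is roughly linear below ~1 kHz and only becomes log-like at high frequencies, so it is not the same as taking the log of the magnitude.

```python
import numpy as np

def hz_to_mel(f):
    # O'Shaughnessy formula: m = 2595 * log10(1 + f / 700)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # Exact inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# The scale is anchored so that 1000 Hz maps to roughly 1000 mel,
# and the conversion round-trips exactly.
assert abs(hz_to_mel(1000.0) - 1000.0) < 2.0
assert abs(mel_to_hz(hz_to_mel(440.0)) - 440.0) < 1e-6
```

Note the warp applies to the frequency axis of the spectrogram; the log in "log-magnitude" applies to the amplitude values. The two are independent choices.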