
Results: 7 comments of le huy

Take a look at [this](https://github.com/keonlee9420/Parallel-Tacotron2/blob/67b2bb927b4a908431b966f6f75491dbbb9ce871/model/modules.py#L219):

```python
speaker_embedding_m = speaker_embedding.unsqueeze(1).expand(
    -1, max_mel_len, -1
)
position_enc = self.position_enc[
    :, :max_mel_len, :
].expand(batch_size, -1, -1)
enc_input = torch.cat([position_enc, speaker_embedding_m, mel], dim=-1)
```

`speaker_embedding_m`...
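For reference, a minimal, self-contained sketch of the shape arithmetic in that snippet; all sizes and the shape of the precomputed table are illustrative assumptions, not the repo's actual values:

```python
import torch

batch_size, max_mel_len = 2, 100
d_spk, d_pos, n_mels = 64, 32, 80  # made-up feature sizes

speaker_embedding = torch.randn(batch_size, d_spk)   # [B, d_spk]
position_enc_table = torch.randn(1, 1000, d_pos)     # precomputed [1, max_seq_len, d_pos]
mel = torch.randn(batch_size, max_mel_len, n_mels)   # [B, T_mel, n_mels]

# Broadcast the per-utterance speaker embedding across the mel time axis.
speaker_embedding_m = speaker_embedding.unsqueeze(1).expand(-1, max_mel_len, -1)
# Slice the table to the current mel length and broadcast it over the batch.
position_enc = position_enc_table[:, :max_mel_len, :].expand(batch_size, -1, -1)

# All three tensors now agree on dims 0 and 1, so they concatenate on dim -1.
enc_input = torch.cat([position_enc, speaker_embedding_m, mel], dim=-1)
print(enc_input.shape)  # torch.Size([2, 100, 176]) == [B, T_mel, d_pos + d_spk + n_mels]
```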

> Hi @phamlehuy53, `position_enc` also has `max_seq_len` in that dimension. But you notice that `speaker_embedding_m` and `mel` have `max_mel_len` instead, don't you?

> oh, sorry I mistyped. `position_enc` has `max_mel_len`, not `max_seq_len`.
>
> ```python
> position_enc = self.position_enc[
>     :, :max_mel_len, :
> ].expand(batch_size, -1, -1)
> ```

Yep, when `max_mel_len`...
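That slice only yields `max_mel_len` positions while the mel length stays within the precomputed table. A small sketch of what happens otherwise; the table size and variable names are assumed for illustration:

```python
import torch

max_seq_len, d_pos = 1000, 32
position_enc_table = torch.randn(1, max_seq_len, d_pos)  # precomputed table

max_mel_len = 1200  # a batch whose mels are longer than the table
position_enc = position_enc_table[:, :max_mel_len, :]
print(position_enc.shape)  # torch.Size([1, 1000, 32]) -- silently clamped to 1000

# A later torch.cat with a [B, 1200, n_mels] mel tensor would then fail
# with a size mismatch on dimension 1.
```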

You should make sure that `self.process_utterance(speaker, basename)` actually worked. If it didn't, none of the `_scaler` objects ever ran `partial_fit()`.

> @phamlehuy53 hi, thank you for the response. How can I ensure that `self.process_utterance(speaker, basename)` works?

Take a look at this:

```python
if os.path.exists(tg_path):
    ret = self.process_utterance(speaker, basename)
    if ret...
```
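A minimal sketch of the control flow being described, assuming a FastSpeech2-style preprocessor where `process_utterance` returns `None` on failure and the pitch/energy statistics are accumulated with sklearn scalers; the return signature here is an assumption, not the repo's exact code:

```python
import os
import numpy as np
from sklearn.preprocessing import StandardScaler

pitch_scaler = StandardScaler()
energy_scaler = StandardScaler()

def process_utterance(speaker, basename):
    # Stand-in for the repo's method: assume it returns None when alignment
    # or feature extraction fails, otherwise (info, pitch, energy, n_frames).
    pitch = np.random.rand(50)
    energy = np.random.rand(50)
    return f"{speaker}|{basename}", pitch, energy, 50

def build_stats(items):
    for speaker, basename, tg_path in items:
        # No TextGrid -> process_utterance is never even called ...
        if not os.path.exists(tg_path):
            continue
        ret = process_utterance(speaker, basename)
        # ... and a None return (bad alignment, empty pitch, etc.) also
        # skips the utterance, so the scalers never see its data.
        if ret is None:
            continue
        info, pitch, energy, n_frames = ret
        # partial_fit only runs for utterances that pass both checks above.
        pitch_scaler.partial_fit(pitch.reshape(-1, 1))
        energy_scaler.partial_fit(energy.reshape(-1, 1))
```

If every utterance is filtered out by either check, the scalers are never fitted, which matches the symptom described above.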

> Hi @phamlehuy53, in TextGrid files generated by the MFA tool, all punctuation is modeled as `sp`. So of course the model just learns `sp`, not punctuation (punctuation in `text.symbols`...

> I am not sure if I understand this right, but if we use only `{sp}` for all punctuation, the silence durations of "," and "." will be the same...
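To make the concern concrete: once the aligner collapses every punctuation mark to `sp`, the label stream carries no way to distinguish a comma pause from a full-stop pause. A toy illustration with invented intervals:

```python
# Phone intervals as they might come out of an MFA TextGrid:
# (start, end, label). Both the comma and the period were aligned as "sp".
intervals = [
    (0.00, 0.35, "HH"), (0.35, 0.60, "AY"),
    (0.60, 0.75, "sp"),   # was ","
    (0.75, 1.10, "DH"), (1.10, 1.40, "EH"), (1.40, 1.75, "R"),
    (1.75, 2.40, "sp"),   # was "."
]

# Group durations by label: every pause lands in the same "sp" bucket,
# so a duration predictor trained on these labels cannot learn that
# "." pauses tend to run longer than "," pauses.
durations = {}
for start, end, label in intervals:
    durations.setdefault(label, []).append(round(end - start, 2))
print(durations["sp"])  # [0.15, 0.65] -- one bucket for all punctuation
```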