dsplog
@janvainer: following https://github.com/pytorch/pytorch/issues/38605, i moved to torch==1.5.1 and the issue no longer appears. i still have to read up to understand what is going on.
faced issue 1) as well. as a workaround, i preprocessed the data to generate normalized spectrograms (both min-max normalized and standard normalized) and used them for the teacher and student models respectively.
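a minimal sketch of the two normalizations mentioned above, in case it helps. the function names, the `eps` guard, the random stand-in spectrogram and the choice of per-utterance statistics are all assumptions for illustration, not the actual preprocessing code; dataset-level statistics would work the same way.

```python
import numpy as np


def minmax_normalize(spec, eps=1e-8):
    """Scale a spectrogram into [0, 1] using its own min and max (hypothetical helper)."""
    lo, hi = spec.min(), spec.max()
    return (spec - lo) / (hi - lo + eps)  # eps guards against a constant input


def standard_normalize(spec, eps=1e-8):
    """Shift and scale a spectrogram to zero mean and unit variance (hypothetical helper)."""
    return (spec - spec.mean()) / (spec.std() + eps)


# Stand-in for a log-mel spectrogram: 80 mel bins x 120 frames (assumed shape).
rng = np.random.default_rng(0)
spec = rng.normal(loc=-5.0, scale=2.0, size=(80, 120))

spec_minmax = minmax_normalize(spec)      # one normalized copy
spec_standard = standard_normalize(spec)  # the other normalized copy
```

each normalized copy can then be saved once and loaded by the corresponding model, so no normalization has to happen inside the training loop.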
sharing an update: trained hifigan on the dataset, and the audio quality improved significantly.
> > I assume that the simple summation of the speaker embedding to the text encoding is not strong enough to preserve the speaker identity > > That might be...
broadly, i can say yes. for a few speakers, it gives a noticeable improvement. however, some speakers are still not captured well. it may be due to not having sufficient data...
thanks much. it was straightforward to convert the code to python. created a pull request: [adding python function for number spell out](https://github.com/smc/mlmorph/pull/13) ps. it does not meet the...
btw, in line https://github.com/smc/mlmorph/blob/4641a5814a90b4e7b0dbea011f535218dd17f068/docs/components/Number.vue#L36, is "hundredsa" a typo?
ah, thanks for sharing the details. will try this out.
> any help from those who speaks any language [in this list](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) (or knows some linguistics about these languages) is greatly appreciated

keen to extend this to [malayalam](https://en.wikipedia.org/wiki/Malayalam), dravidian language...
> I would recommend you create a new branch for this so I can merged and refer to people to this branch in README. I don't want to mess up...