suzhenghang

Results 24 issues of suzhenghang

Hello, as stated in your paper, the FLOPs and params of ECO are 64G and 47.5M. Could you share the code used to obtain these results?
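For context on what such numbers usually mean, here is a minimal sketch of how the parameters and multiply-accumulates (MACs) of a single 2D convolution layer are commonly counted by hand. It is not the paper authors' code; the helper name `conv2d_cost` and the example layer shape are hypothetical, and whether "FLOPs" means MACs or 2 x MACs is a convention that varies between papers.

```python
def conv2d_cost(c_in, c_out, kernel, out_h, out_w, bias=True):
    """Return (params, macs) for one 2D convolution layer.

    Assumes a square kernel and no grouping (hypothetical helper,
    for illustration only).
    """
    weights = kernel * kernel * c_in * c_out
    params = weights + (c_out if bias else 0)
    # Each output position applies every weight once.
    macs = weights * out_h * out_w
    return params, macs


# Example: 3x3 convolution, 64 -> 128 channels, 56x56 output map.
params, macs = conv2d_cost(64, 128, 3, 56, 56)
print(params, macs)
```

Summing these per-layer counts over the whole network gives totals of the kind quoted in the question.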

Hi @soeaver, I tried adding multi-scale training, but convergence seems to be difficult; without multi-scale training, it converges quickly. Have you run into this situation? Thanks in advance.

Hi @soeaver, after training the PSPNet, could we input an image of arbitrary size? The kernel_size of the average pooling is 64, 32, 16 and 8 respectively; if I input...
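The concern can be illustrated with a small sketch of the output size of fixed-kernel average pooling. It assumes stride equal to kernel size, no padding, and floor rounding; these are assumptions for illustration, not details confirmed by the repository, and the helper names are hypothetical.

```python
# Kernel sizes quoted in the question.
KERNEL_SIZES = (64, 32, 16, 8)


def pooled_size(feature_size, kernel_size):
    """Output length of average pooling with stride == kernel_size,
    no padding, floor rounding (assumed settings)."""
    if feature_size < kernel_size:
        return 0  # the pooling window does not fit at all
    return (feature_size - kernel_size) // kernel_size + 1


def pyramid_bins(feature_size):
    """Bins per side produced by each pyramid level."""
    return [pooled_size(feature_size, k) for k in KERNEL_SIZES]


for size in (64, 80, 48):
    print(size, pyramid_bins(size))
```

Under these assumptions a 64x64 feature map gives 1, 2, 4 and 8 bins per side, while other feature-map sizes change the bin counts or leave the largest window with no valid output. Adaptive pooling, which fixes the output size instead of the kernel size, is one common way around this.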

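After listening to the [demo](https://neuralsvb.github.io/), I have a few questions. 1. If this is used in practice to beautify singing, the pitch curve of the original singer is needed at inference time, right? 2. Although the test samples are not in the training set, GT Professional and GT Amateur were recorded by the same person. At inference time, GT Professional cannot be the user themselves, so has generalization been tested for that case?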

Hi @ZFTurbo, I find that GPU memory usage during inference is very large (at least 2 GB with a ResNet50 backbone). Do you have any ideas for reducing GPU memory usage? Thanks...

Hi @taoyang1122, thanks for open-sourcing such good code. What is the loss value at the end of training, and could you share the loss curve?

Are there any differences between training a multi-speaker model and a single-speaker one? After adding the corresponding data and IDs to filelist.txt, do more speakers require more epochs? Could you give some suggestions? Thanks in...

following-up

Retrieval features are used during inference to replace input features in order to prevent speaker identity leakage, but how can we ensure that the generated speech still corresponds to...

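The transcriptions contain a lot of annotation information (lyric text, phonemes, pitch, and their durations, etc.). If I only have the singing audio and the corresponding lyrics, how can I generate the rest of the annotations?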