suzhenghang issues

Results: 24 issues of suzhenghang

About the FLOPs in your paper

Hello, as stated in your paper, the FLOPs and params of ECO are 64G and 47.5M. Could you share the code used to obtain these results?
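For context, parameter and FLOP figures like these are usually obtained by summing per-layer counts. The following is a minimal sketch for plain 2D convolutions only, not the paper authors' script: the layer list is hypothetical, the helper names `conv_params` and `conv_macs` are made up for illustration, and it counts multiply-accumulates (MACs), which some papers report as FLOPs and others double.

```python
# Minimal sketch: count parameters and multiply-accumulates (MACs) for
# plain 2D convolution layers. The layer shapes below are hypothetical
# and do NOT describe ECO; they only illustrate the arithmetic.

def conv_params(c_in, c_out, k, groups=1, bias=True):
    """Number of learnable parameters in a k x k convolution."""
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)

def conv_macs(c_in, c_out, k, h_out, w_out, groups=1):
    """Multiply-accumulates: one k*k*(c_in/groups) dot product per output value."""
    return (c_in // groups) * c_out * k * k * h_out * w_out

# (c_in, c_out, kernel, h_out, w_out) for each layer -- hypothetical values
layers = [
    (3, 64, 7, 112, 112),
    (64, 128, 3, 56, 56),
]

total_params = sum(conv_params(ci, co, k) for ci, co, k, _, _ in layers)
total_macs = sum(conv_macs(*layer) for layer in layers)

print(f"params: {total_params / 1e6:.3f} M")
print(f"MACs:   {total_macs / 1e9:.3f} G")
```

A network with 3D convolutions would need an extra temporal kernel and output dimension in both formulas, and fully connected or normalization layers would have to be added separately.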

About multi-scale training

Hi @soeaver, I tried adding multi-scale training, but the model has difficulty converging; without multi-scale training it converges quickly. Have you encountered this? Thanks in advance.

Question about PSPNet

Hi @soeaver, after training PSPNet, can we feed it an image of arbitrary size? The kernel_size values of the average pooling layers are 64, 32, 16 and 8 respectively; if I input...
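The concern can be illustrated with the standard pooling output-size formula. This is a rough sketch under assumptions not stated in the question: the stride is assumed equal to the kernel size, there is no padding, sizes are rounded down, and the training-time feature map is assumed to be 64 wide. The helper names are hypothetical.

```python
# Sketch: how fixed average-pooling kernels behave when the feature map
# size changes. Assumes stride == kernel, no padding, floor rounding.

KERNELS = (64, 32, 16, 8)  # kernel sizes quoted in the question

def pooled_size(feature_size, kernel, stride=None):
    """Output length of a 1D pooling window (no padding, floor mode)."""
    stride = kernel if stride is None else stride
    if feature_size < kernel:
        raise ValueError(
            f"feature map ({feature_size}) is smaller than kernel ({kernel})"
        )
    return (feature_size - kernel) // stride + 1

def pyramid_bins(feature_size, kernels=KERNELS):
    """Number of bins per pyramid level along one spatial axis."""
    return [pooled_size(feature_size, k) for k in kernels]

print(pyramid_bins(64))  # assumed training-time feature map size
print(pyramid_bins(80))  # a larger feature map yields a different bin grid

try:
    pyramid_bins(48)     # a smaller feature map cannot fit the 64 kernel
except ValueError as err:
    print("error:", err)
```

Under these assumptions, a larger input changes the number of bins at the finer pyramid levels and a smaller input fails outright at the coarsest level. One common workaround is an adaptive pooling layer that fixes the number of output bins instead of the kernel size (for example `torch.nn.AdaptiveAvgPool2d` in PyTorch); whether that suits this particular model is not something the question settles.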

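About NSVB

After listening to the [demo](https://neuralsvb.github.io/), I have a few questions. 1. If this is used in practice to beautify singing, inference requires the pitch curve of the original singer's performance, right? 2. Although the test samples are not in the training set, GT Professional and GT Amateur were recorded by the same person. At inference time, GT Professional cannot come from the user themselves. Has generalization in that setting been tested?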

GPU memory in inference

Hi @ZFTurbo, I find that GPU memory usage during inference is very large (at least 2 GB with a ResNet50 backbone). Do you have any ideas for reducing GPU memory usage? Thanks...

About the negative cosine proximity loss

Hi @taoyang1122, thanks for open-sourcing such good code. What is the loss value at the end of training, and could you share the loss curve?

Questions about training a multi-speaker model

Are there any differences between training a multi-speaker model and a single-speaker one? After adding the corresponding data and IDs to filelist.txt, do more speakers require more epochs? Could you give some suggestions? Thanks in...

Following up

Regarding input speaker content

Retrieval features are used during inference to replace the input features in order to prevent speaker identity leakage, but how can we ensure that the generated speech still corresponds to...

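About Opencpop transcriptions.txt

The transcriptions contain a lot of annotation information (lyric text, phonemes, pitches, their durations, and so on). If I only have the singing audio and the corresponding lyrics, how can the remaining annotations be generated?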