DongLu comments

Results: 26 comments of DongLu

Overlay face on static image

Sure, you can use `renderTexture` to process a single frame: https://github.com/taylorlu/FaceConverter/blob/66831c1ecaf727d3508cb4e0a916921b032aa351/FaceConverter/ComViewController.mm#L375. Moreover, it is more convenient to do this in Python if you are familiar with [PRNet](https://github.com/YadiraF/PRNet).

Slow performance?

I'm afraid you will have to reimplement uis-rnn yourself if you need a speedup, and the shortcomings of the original uis-rnn are also obvious, since you have figured it out in...

Slow performance?

Sorry, I haven't tested it on GPU. However, you can adjust the parameters in ghostvlad, such as `hop_length`, to reduce the number of pieces the whole wav file is split into.
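As a rough illustration of why a larger hop reduces the piece count, here is a minimal sketch of generic sliding-window arithmetic. It is not the actual ghostvlad code: the function name `count_windows` and the example values (16 kHz audio, a 400-sample window) are assumptions chosen for illustration.

```python
def count_windows(num_samples: int, window_length: int, hop_length: int) -> int:
    """Count full windows that fit when sliding over a signal without padding.

    Hypothetical helper for illustration only; the real pipeline may pad
    or center its frames and therefore produce slightly different counts.
    """
    if hop_length <= 0 or window_length <= 0:
        raise ValueError("window_length and hop_length must be positive")
    if num_samples < window_length:
        return 0
    # The first window starts at 0; each later one starts hop_length samples on.
    return 1 + (num_samples - window_length) // hop_length


# Assumed example: 60 seconds of audio sampled at 16 kHz.
one_minute = 60 * 16000
pieces_small_hop = count_windows(one_minute, window_length=400, hop_length=160)
pieces_large_hop = count_windows(one_minute, window_length=400, hop_length=320)
```

Doubling the hop roughly halves the number of pieces to process, at the cost of coarser time resolution.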

Slow performance?

I think that would add complexity, since you would have to compute the similarity of each speaker segment in separate parallel threads. And uis-rnn seems to require storing the speaker...

Speaker diarization gives more than two speaker while i have only two speakers in the audio file.

1: uisrnn doesn't support clustering into a given number of speakers; this is a limitation of the method's design. 2: The model can certainly be fine-tuned on a new dataset, but unlike...

Speaker diarization gives more than two speaker while i have only two speakers in the audio file.

@vickianand Thanks for your idea; I haven't tried the spectral-clustering method. But the difference between spectral clustering and uisrnn is that the former doesn't support real-time clustering, which means you should input...

Speaker diarization gives more than two speaker while i have only two speakers in the audio file.

@vickianand, I used [openslr38](http://www.openslr.org/38) as the dataset for the pretrained model, since my purpose was to deal with Chinese dialogue.

Speaker diarization gives more than two speaker while i have only two speakers in the audio file.

@vickianand Yes, I also found there was a point in the shuffled permutation, and uisrnn didn't support dynamic batch input either; the only way perhaps was to modify the...

Cuda Out Of Memory when invoking train.py

Well, please check whether another process is still running that you forgot to kill; this program should not occupy so much memory.
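One way to do that check from Python is sketched below, assuming an NVIDIA GPU with the `nvidia-smi` tool on the PATH. The helper names `parse_compute_apps` and `list_gpu_processes` are hypothetical and not part of this project.

```python
import subprocess


def parse_compute_apps(csv_text: str) -> list:
    """Parse 'pid, process_name, used_memory' CSV lines into tuples."""
    rows = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The pid is the first field and the memory the last; the process
        # name sits in between and is kept intact even if it has commas.
        pid, rest = line.split(",", 1)
        name, memory = rest.rsplit(",", 1)
        rows.append((int(pid), name.strip(), memory.strip()))
    return rows


def list_gpu_processes() -> list:
    """Ask nvidia-smi which compute processes currently hold GPU memory."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)
```

Calling `list_gpu_processes()` before starting `train.py` should show any leftover process that is still holding GPU memory, which can then be killed by its pid.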

Innacurate start and till time of slices attained

The timing accuracy also depends on the speaker feature; perhaps you need to find a more robust speaker feature extractor that is resistant to noise, and also change the sliding window size...