NBSS

The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation

Results: 19 NBSS issues

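Hello, I trained speech separation models with the open-source SpatialNet and OSpatialNet. SpatialNet's performance is truly impressive, but with OSpatialNet the training loss decreases normally while the results on the test set are very poor. Could this be an overfitting problem?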

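Thank you for open-sourcing this excellent work. I have a question I would like to ask. According to the paper, performance improves considerably from SpatialNet-small to SpatialNet-large. Have you tried a SpatialNet with an even larger parameter count? Is SpatialNet-large already the upper limit?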

Dear Lee, Awesome job and congratulations! It seems that there is only the multi-head self-attention version of SpatialNet here. Will you release the online Mamba version in the future? Best!

I trained the model (from the NBSS branch) for 2-speaker separation using the wsj0 dataset. It worked perfectly. But now I want to train the model for more than 2 speakers....

I tried to train SpatialNet on the WHAMR! dataset with the command `python SharedTrainer.py fit --config=configs/SpatialNet.yaml --config=configs/datasets/whamr.yaml --model.arch.dim_input=12 --model.arch.dim_output=4 --model.arch.num_freqs=129 --trainer.precision=bf16-mixed --model.compile=True --data.batch_size=[2,4] --trainer.devices=0,1,2,3, --trainer.max_epochs=100`, but I got an error: ```...

Hi, this is a great project, thanks for your effort. When I looked at the code, I found that the training target signal was reverberant speech. (https://github.com/Audio-WestlakeU/NBSS/blob/af66db92bb9d6f72f7100d613d3df38c40b10b09/data_loaders/ss_semi_online_dataset.py#L294C27-L294C27) I wonder why...

I am getting NaN as the loss value and a CUDA error while training

This is an interesting project, and I am very interested. I am having trouble understanding how to effectively use a custom dataset with SpatialNet. Can you guide me on: How to...

Hi, I recently received a trained checkpoint (ckpt) file from my colleague and attempted to test and run it on my own device. To ensure consistency, I used the same...

What can be done to use the algorithm with a non-circular/random geometry of microphone configurations?