Wave-U-Net-for-Speech-Enhancement

Assertion Error at len(mixture) == len(clean) == len(enhanced)

Open mnabihali opened this issue 5 years ago • 30 comments

When I try to run the code, it fails at this assertion: assert len(mixture) == len(clean) == len(enhanced). I printed the length of each and found that enhanced and clean have the same length, but mixture is longer than both.

I hope you can help me as soon as possible.

mnabihali avatar May 28 '20 20:05 mnabihali

Hello, have you solved your problem? I have the same problem as you.

Lerry123 avatar Jul 17 '20 08:07 Lerry123

No, but I removed this assertion and calculated the required scores manually (comparing signal by signal, which does not raise an error). If you find another solution, please tell me.

mnabihali avatar Jul 17 '20 12:07 mnabihali

Are you Chinese? We can chat on qq. My English is not good.

Lerry123 avatar Jul 18 '20 02:07 Lerry123

I found that this code pads the mixture, but it does not pad the clean and enhanced signals. Which dataset do you use, and how are the results?
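
The length mismatch follows from the padding arithmetic, which can be sketched in a few lines. This is not the repository's code: `padded_length_for` is a hypothetical helper, and the `sample_length` of 16384 and the utterance length of 40000 samples are assumed values chosen only for illustration.

```python
def padded_length_for(num_samples: int, sample_length: int) -> int:
    """Zeros needed to reach the next multiple of sample_length (hypothetical helper)."""
    remainder = num_samples % sample_length
    return 0 if remainder == 0 else sample_length - remainder


# Assumed values, for illustration only.
sample_length = 16384
clean_len = 40000

pad = padded_length_for(clean_len, sample_length)
mixture_len = clean_len + pad  # only the mixture is padded before chunking

print(pad, mixture_len)  # 9152 49152
assert mixture_len % sample_length == 0
assert mixture_len != clean_len  # the mismatch that trips the length assertion
```

Whenever the utterance length is not already a multiple of `sample_length`, the padded signal ends up longer than any signal left unpadded.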

Lerry123 avatar Jul 18 '20 02:07 Lerry123

Sorry, I am not Chinese. Regarding the dataset, I am using the VCTK dataset and a noisy version of the LibriSpeech dataset.

Can I ask which part is doing the padding for the mixture?

Thanks

mnabihali avatar Jul 18 '20 11:07 mnabihali

Did you find another solution to the problem?

mnabihali avatar Jul 18 '20 11:07 mnabihali

I am also using the VCTK database. I haven't found a solution yet. If I find one, I will tell you.

Lerry123 avatar Jul 19 '20 01:07 Lerry123

I padded clean, enhanced, and mixture, and that solved this problem, but now I have a new one. Computing STOI raises an error:

    AttributeError: module 'numpy' has no attribute 'gcd'

I haven't found a solution yet, so I only computed PESQ:

    # Metric
    # stoi_c_n.append(compute_STOI(clean, mixture, sr=16000))
    # stoi_c_e.append(compute_STOI(clean, enhanced, sr=16000))
    pesq_c_n.append(compute_PESQ(clean, mixture, sr=16000))
    pesq_c_e.append(compute_PESQ(clean, enhanced, sr=16000))

Did you have the same problem?

Lerry123 avatar Jul 20 '20 08:07 Lerry123

Can you share the code you used to pad the signals, so I can try computing STOI?

mnabihali avatar Jul 20 '20 11:07 mnabihali

@Lerry123 Can you tell me how you pad the signals? Then I can try to solve the NumPy issue.

Thanks

mnabihali avatar Jul 20 '20 20:07 mnabihali

trainer.py:

    for i, (mixture, clean, name) in enumerate(self.validation_data_loader):
        assert len(name) == 1, "Only support batch size is 1 in enhancement stage."
        name = name[0]
        padded_length = 0

        mixture = mixture.to(self.device)  # [1, 1, T]
        clean = clean.to(self.device)

        # The input of the model should be fixed length, so zero-pad both
        # mixture and clean up to the next multiple of sample_length.
        if mixture.size(-1) % sample_length != 0:
            padded_length = sample_length - (mixture.size(-1) % sample_length)
            mixture = torch.cat([mixture, torch.zeros(1, 1, padded_length, device=self.device)], dim=-1)
            clean = torch.cat([clean, torch.zeros(1, 1, padded_length, device=self.device)], dim=-1)

        assert mixture.size(-1) % sample_length == 0 and mixture.dim() == 3
        mixture_chunks = list(torch.split(mixture, sample_length, dim=-1))

        enhanced_chunks = []
        for chunk in mixture_chunks:
            enhanced_chunks.append(self.model(chunk).detach().cpu())
        enhanced = torch.cat(enhanced_chunks, dim=-1)  # [1, 1, T + padded_length]

        # enhanced already has the padded length, so all three signals match.
        enhanced = enhanced.reshape(-1).numpy()
        clean = clean.cpu().numpy().reshape(-1)
        mixture = mixture.cpu().numpy().reshape(-1)

Lerry123 avatar Jul 21 '20 01:07 Lerry123

Thanks.

Regarding the AttributeError: module 'numpy' has no attribute 'gcd':

np.gcd is available from NumPy version 1.15.0 onward. So check your NumPy version and upgrade it if it is older. Get back to me if that does not solve it.
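
A quick way to confirm whether the installed NumPy is the cause is to check for the attribute directly. The sketch below is only illustrative: `gcd_compat` is a hypothetical name, not part of NumPy or of the STOI code, and the `math.gcd` fallback is an assumption about what a workaround could look like. Upgrading NumPy remains the actual fix.

```python
import math

try:
    import numpy as np
except ImportError:  # keeps the sketch runnable even without NumPy installed
    np = None


def gcd_compat(a: int, b: int) -> int:
    """Use np.gcd when this NumPy provides it, otherwise fall back to math.gcd."""
    if np is not None and hasattr(np, "gcd"):
        return int(np.gcd(a, b))
    return math.gcd(a, b)


if np is not None:
    print("NumPy version:", np.__version__, "| has gcd:", hasattr(np, "gcd"))
print(gcd_compat(48000, 16000))  # 16000
```

If the first line reports that `gcd` is missing, the installed NumPy predates 1.15.0.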

mnabihali avatar Jul 21 '20 13:07 mnabihali

Thank you! It was solved by your suggestion.

Lerry123 avatar Jul 22 '20 08:07 Lerry123

Thanks. I hope everything will be fine. Can you provide me with your email so I can contact you about further problems?

mnabihali avatar Jul 22 '20 10:07 mnabihali

Can I get your email? Have you run the test? My results are very confusing.

Lerry123 avatar Jul 26 '20 03:07 Lerry123

I think the problem is here: https://github.com/haoxiangsnr/Wave-U-Net-for-Speech-Enhancement/blob/c8c9d8945959ba8c3aa1e7cb18cddc10dbc52210/trainer/trainer.py#L78

should be:

    if padded_length != 0:
        enhanced = enhanced[:, :, :-padded_length]
        mixture = mixture[:, :, :-padded_length]
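
The pad, chunk, and trim sequence can be sketched end to end as follows. This is a simplified illustration rather than the trainer's code: it uses 1-D NumPy arrays instead of `[1, 1, T]` tensors, `enhance_fixed_length` is a hypothetical helper, and the identity "model" is a stand-in assumed for the demo. Because the original `mixture` is never overwritten here, only `enhanced` needs trimming.

```python
import numpy as np


def enhance_fixed_length(mixture, model, sample_length):
    """Pad to a multiple of sample_length, run the model per chunk, then trim."""
    total = mixture.shape[-1]
    padded_length = (-total) % sample_length  # 0 when already a multiple
    padded = np.pad(mixture, (0, padded_length))  # zero-pad the tail
    chunks = np.split(padded, padded.shape[-1] // sample_length)
    enhanced = np.concatenate([model(chunk) for chunk in chunks])
    if padded_length != 0:
        enhanced = enhanced[:-padded_length]  # drop the padded tail again
    return enhanced


# Stand-in "model" that returns its input unchanged (assumption for the demo).
identity_model = lambda chunk: chunk

clean = np.arange(10, dtype=np.float32)
mixture = clean + 0.5
enhanced = enhance_fixed_length(mixture, identity_model, sample_length=4)

assert len(mixture) == len(clean) == len(enhanced)
```

Trimming keeps the metrics from being computed over artificial silence, which padding all three signals would otherwise include.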

diff7 avatar Aug 29 '20 12:08 diff7

That's the problem, as you said. When I use the VCTK database, the test result is very bad and the speech is distorted.

Lerry123 avatar Aug 31 '20 02:08 Lerry123

@Lerry123 did you try on the same dataset? And what are the advantages of training on VCTK?

I am training on the same dataset, 500 epochs so far, and the quality is not great. PESQ is quite low at 1.75, STOI is 0.85, and yes, the sound is distorted. I will tweak some parameters and see if that gives a boost.

diff7 avatar Sep 09 '20 13:09 diff7

I set sr=16000 in waveform_dataset.py and waveform_dataset_enhancement.py, and PESQ is now 2.63. This is my best result.

waveform_dataset.py:

    # line 65
    mixture, _ = librosa.load(os.path.abspath(os.path.expanduser(mixture_path)), sr=16000)
    # line 66
    clean, _ = librosa.load(os.path.abspath(os.path.expanduser(clean_path)), sr=16000)
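
The point of the change is that the audio is loaded at the same rate the metrics are told about (sr=16000 in the calls earlier in this thread); `librosa.load` resamples to the `sr` it is given, or to 22050 Hz if none is passed. A minimal sketch of a loader that makes the rate explicit is shown below. `load_pair`, `TARGET_SR`, and the injectable `loader` argument are hypothetical names introduced for illustration and are not part of the repository.

```python
import os

TARGET_SR = 16000  # rate passed to compute_STOI / compute_PESQ in this thread


def load_pair(mixture_path, clean_path, sr=TARGET_SR, loader=None):
    """Load mixture and clean at the same sample rate (hypothetical helper)."""
    if loader is None:
        import librosa  # imported lazily so the sketch can be tested without it
        loader = librosa.load

    def _load(path):
        full_path = os.path.abspath(os.path.expanduser(path))
        audio, loaded_sr = loader(full_path, sr=sr)
        if loaded_sr != sr:
            raise ValueError(f"{path}: got {loaded_sr} Hz, expected {sr} Hz")
        return audio

    return _load(mixture_path), _load(clean_path)
```

Calling `load_pair(mixture_path, clean_path)` with the default loader would return both signals resampled to 16 kHz, so the two files and the metrics all agree on one rate.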

Lerry123 avatar Sep 10 '20 01:09 Lerry123

@Lerry123 got it, thanks! PESQ = 2.63, is that with VCTK?

diff7 avatar Sep 11 '20 10:09 diff7

Yes

Lerry123 avatar Sep 20 '20 03:09 Lerry123

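May I ask how you changed the parameters? I used the same dataset as SEGAN, as in the original paper, and the trained model's output is severely distorted. What could be going wrong? Could you help me figure it out? Thanks @Lerry123 @diff7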

meisanhai avatar Jan 01 '22 10:01 meisanhai

According to one answer this could be the solution

I think problem is here: https://github.com/haoxiangsnr/Wave-U-Net-for-Speech-Enhancement/blob/c8c9d8945959ba8c3aa1e7cb18cddc10dbc52210/trainer/trainer.py#L77

should be:

    if padded_length != 0:
        enhanced = enhanced[:, :, :-padded_length]
        mixture = mixture[:, :, :-padded_length]

mnabihali avatar Jan 01 '22 10:01 mnabihali

Sorry for the previous email. You could experiment with the parameters and check the results. Thanks

mnabihali avatar Jan 01 '22 11:01 mnabihali

Thank you very much for your reply. I used the dataset at https://datashare.ed.ac.uk/handle/10283/1942, but the training results are not good: PESQ=1.35, STOI=0.65. Which dataset did you use? @mnabihali

meisanhai avatar Jan 01 '22 11:01 meisanhai

I applied it to my own dataset, which is a noisy version of the LibriSpeech dataset.

mnabihali avatar Jan 01 '22 11:01 mnabihali

mnabihali Thank you very much! I changed the sampling rate in waveform_dataset.py to 16 kHz and the results improved.

meisanhai avatar Jan 02 '22 01:01 meisanhai

Hello, may I ask how you solved the error at the tenth epoch? I have been working on it for a long time without success.

VaeFlashMe avatar Aug 29 '22 09:08 VaeFlashMe

thank you so much

andyye1999 avatar Sep 23 '22 14:09 andyye1999

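Hello, I am a second-year graduate student and would like to reproduce this code as the basis for a new contribution. I have run into some problems while reproducing it. Would you be able to take a look? Could we connect on QQ to discuss?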

renxuezhang avatar Apr 01 '24 12:04 renxuezhang