Spoken-Language-Identification

Want .npy format file

Open praveenssivam opened this issue 5 years ago • 10 comments

    english_mfcc = np.array([]).reshape(0, num_mfcc_features)
    for file in glob.glob(codePath + 'english/*.npy'):

This line uses .npy files. What does that mean? Can you share the files and explain how you created them?

praveenssivam avatar Jan 09 '20 08:01 praveenssivam

Hey, I just had to deal with this exact issue.

One journey of self-improvement later, I found out that .npy are numpy arrays.

I wrote this code to convert the .wav files into .npy files.

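    import glob
    import time
    import numpy as np
    from scipy.io import wavfile

    time_start_numpy = time.time()
    for directory in glob.glob(codePath + '*'):
        print("Creating numpy arrays for " + directory)
        for file in glob.glob(directory + '/*.wav'):
            fs, data = wavfile.read(file)
            # Note: this keeps the first num_mfcc_features raw samples, not MFCC features.
            arr = data[0:num_mfcc_features]
            np.save(file + ".npy", arr)
    time_end_numpy = time.time()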
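
For anyone unsure what a .npy file is in practice, here is a minimal sketch of writing arrays with np.save and reading them back with np.load, stacked the same way as the line quoted in the issue. The feature count of 13, the random data, and the helper name load_stacked are assumptions made for illustration. They are not taken from the repository.

```python
import glob
import os
import tempfile

import numpy as np

num_mfcc_features = 13  # assumed value, for illustration only


def load_stacked(folder, num_features):
    """Stack every .npy file in `folder` into one (total_rows, num_features) array."""
    stacked = np.array([]).reshape(0, num_features)
    for path in sorted(glob.glob(os.path.join(folder, "*.npy"))):
        stacked = np.vstack((stacked, np.load(path)))
    return stacked


with tempfile.TemporaryDirectory() as folder:
    # Two fake "recordings" with 5 and 7 rows of random features (hypothetical data).
    rng = np.random.default_rng(0)
    np.save(os.path.join(folder, "a.npy"), rng.random((5, num_mfcc_features)))
    np.save(os.path.join(folder, "b.npy"), rng.random((7, num_mfcc_features)))
    english_mfcc = load_stacked(folder, num_mfcc_features)

print(english_mfcc.shape)
```

Each file contributes its rows to the stacked array, so the total number of rows is the sum over all files.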

freemmaann avatar Jun 12 '20 15:06 freemmaann

May I know who this is?

praveenssivam avatar Jun 12 '20 15:06 praveenssivam

A student doing his master's and heavily researching these subjects, haha. I also just started using GitHub.

Will submit a branch with this code later.

freemmaann avatar Jun 12 '20 15:06 freemmaann

https://github.com/deibraz-free/Spoken-Language-Identification/ Uploaded from alt account.

freemmaann avatar Jun 12 '20 15:06 freemmaann

@freemmaann Hello! They have set sequence_length = 1000. My question is: is it in milliseconds? If it is in milliseconds, then isn't it supposed to be 10,000 for 10 seconds?

1 second = 1000 milliseconds

Arafat4341 avatar Jul 03 '20 08:07 Arafat4341

Yeah, I found this odd as well, but apparently it's 10 seconds.
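
One way the numbers could line up, as a hedged sketch: if sequence_length counts feature frames rather than milliseconds, and the frames are spaced 10 ms apart, then 1000 frames cover 10 seconds. The 10 ms hop is an assumption and is not confirmed anywhere in this thread.

```python
def frames_to_seconds(num_frames, hop_ms):
    """Duration covered by num_frames frames spaced hop_ms milliseconds apart."""
    return num_frames * hop_ms / 1000.0


sequence_length = 1000  # value discussed in the thread
assumed_hop_ms = 10     # assumption: one frame every 10 ms

print(frames_to_seconds(sequence_length, assumed_hop_ms))
```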

freemmaann avatar Jul 03 '20 09:07 freemmaann

Yeah, it's confusing. So have you set it to 10,000? Can you tell what happens when, at test time, someone gives an audio clip shorter or longer than 10 seconds?! I mean, have you experimented with it?!

Arafat4341 avatar Jul 03 '20 09:07 Arafat4341

I just kept it at 1000.

I don't know. The system was not really made for these things, so I wrote a tiny method to automatically collect all files and split them into precisely 10 s segments.
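
A rough sketch of what such a splitting step could look like. This is not the method mentioned above, only an illustration under assumptions: the function name split_fixed is hypothetical, the audio is treated as a flat sequence of samples, and any trailing remainder shorter than a full segment is dropped.

```python
def split_fixed(samples, sample_rate, segment_seconds=10):
    """Split samples into consecutive segments of exactly segment_seconds each.

    A trailing remainder shorter than one full segment is discarded.
    """
    segment_len = sample_rate * segment_seconds
    num_segments = len(samples) // segment_len
    return [
        samples[i * segment_len:(i + 1) * segment_len]
        for i in range(num_segments)
    ]


# Toy example: 25 "samples" at 1 sample per second gives two 10 s segments.
segments = split_fixed(list(range(25)), sample_rate=1, segment_seconds=10)
print(len(segments), [len(s) for s in segments])
```

Slicing works the same way on a NumPy array returned by a WAV reader, so the same function could be applied to real audio data.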

freemmaann avatar Jul 03 '20 12:07 freemmaann

@freemmaann @praveenssivam Sorry to bother you guys, my apologies! What was your data size? I am doing this for English and Japanese. My English data size is 164 and my Japanese data size is 193, so num_english_sequence and num_japanese_sequence are always zero (0) because of this calculation:

    num_english_sequence = int(np.floor(len(english_mfcc)/sequence_length))

As sequence_length is 1000, num_english_sequence always ends up being zero. As a result, list_english_mfcc is always empty, and I end up having no X_train, Y_train, etc.
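
The arithmetic can be checked in isolation. The sketch below assumes that 164 and 193 are the row counts of the stacked arrays, and the helper name num_sequences is hypothetical. With fewer than sequence_length rows, the floor division yields zero sequences. One possible cause (an assumption, not confirmed in this thread) is that each .npy file holds a single row instead of one row per feature frame.

```python
import math


def num_sequences(num_rows, sequence_length=1000):
    """Number of full sequences that fit into num_rows rows."""
    return int(math.floor(num_rows / sequence_length))


print(num_sequences(164))   # fewer rows than one sequence
print(num_sequences(193))   # fewer rows than one sequence
print(num_sequences(2500))  # enough rows for two full sequences
```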

Arafat4341 avatar Jul 06 '20 08:07 Arafat4341

Hi @Arafat4341, did you get the dataset or get it working?

Aksh97 avatar Sep 06 '21 01:09 Aksh97