Spoken-Language-Identification
Want .npy format file
    english_mfcc = np.array([]).reshape(0, num_mfcc_features)
    for file in glob.glob(codePath + 'english/*.npy'):

On this line you load .npy files. What does that mean? Can you share the files and explain how you created them?
Hey, I just had to deal with this exact issue.
One journey of self-improvement later, I found out that .npy files are saved NumPy arrays.
I put together this code to convert the .wav files into that format.
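    import glob
    import time
    import numpy as np
    from scipy.io import wavfile

    time_start_numpy = time.time()
    for lang_dir in glob.glob(codePath + '*'):
        print("Creating numpy arrays for " + lang_dir)
        for file in glob.glob(lang_dir + '/*.wav'):
            fs, data = wavfile.read(file)  # fs: sample rate, data: raw samples
            # Note: this keeps the first num_mfcc_features raw samples; it does not compute MFCCs.
            arr = data[0:num_mfcc_features]
            np.save(file + ".npy", arr)  # written next to the source as <name>.wav.npy
    time_end_numpy = time.time()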
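For anyone else wondering what a .npy file is in practice, here is a minimal sketch of the save/load round trip. The array contents, the value 13 for `num_mfcc_features`, and the file name are made up for illustration. The stacking step at the end is only a guess at why the original line starts from an empty `(0, num_mfcc_features)` array; the thread does not show that part of the code.

```python
import os
import tempfile

import numpy as np

num_mfcc_features = 13  # hypothetical value, not taken from the thread

# Hypothetical stand-in for one file's features: 5 frames x 13 coefficients.
features = np.arange(5 * num_mfcc_features, dtype=np.float32).reshape(5, num_mfcc_features)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.npy")
    np.save(path, features)  # writes the array in NumPy's binary .npy format
    loaded = np.load(path)   # reads it back with the same shape and dtype

# Assumed usage: start from an empty (0, num_mfcc_features) array and
# stack each loaded file onto it row by row.
english_mfcc = np.array([]).reshape(0, num_mfcc_features)
english_mfcc = np.vstack((english_mfcc, loaded))

print(loaded.shape, english_mfcc.shape)
```

So a .npy file is just one array written to disk; `np.load` gives back the same array that was passed to `np.save`.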
May I know who this is?
(Sent by email in reply to David's comment above, posted Fri, Jun 12, 2020, 8:33 PM.)
A student doing his master's and heavily researching these subjects, haha, who also just started using GitHub.
Will submit a branch with this code later.
https://github.com/deibraz-free/Spoken-Language-Identification/ Uploaded from alt account.
@freemmaann Hello! They have set sequence_length = 1000.
My question is: Is it in milliseconds?
If it is in milliseconds, then shouldn't it be 10,000 for 10 seconds?
1 second = 1000 milliseconds
Yeah, I found this odd as well, but apparently it's 10 seconds.
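One possible explanation, offered as an assumption rather than something confirmed in this thread: `sequence_length` may count feature frames, not milliseconds. If the MFCC frames are spaced 10 ms apart (a common default, but not stated here), 1000 frames cover about 10 seconds. The helper name `frames_to_seconds` and the `hop_ms` parameter are hypothetical.

```python
def frames_to_seconds(num_frames: int, hop_ms: float = 10.0) -> float:
    """Duration covered by num_frames feature frames spaced hop_ms apart."""
    return num_frames * hop_ms / 1000.0


sequence_length = 1000

# Under the assumed 10 ms hop, 1000 frames span 10 seconds.
print(frames_to_seconds(sequence_length))              # 10.0
# If each unit really were 1 ms, 1000 would only be 1 second.
print(frames_to_seconds(sequence_length, hop_ms=1.0))  # 1.0
```

If the hop used to compute the features is different, the duration changes proportionally, so it is worth checking the extraction settings.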
Yeah, it's confusing. So have you set it to 10,000? Can you tell me what happens at test time when someone gives an audio clip shorter or longer than 10 seconds? I mean, have you experimented with it?
I just kept it at 1000.
I don't know; the system wasn't really made for that, so I wrote a small method that automatically collects all the files and splits them into exactly 10 s segments.
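The method mentioned above was not posted, so the following is only a rough sketch of how such a split could look, not the actual code. The function name `split_into_segments`, the 16 kHz sample rate, and the choice to drop any trailing remainder are all assumptions.

```python
import numpy as np


def split_into_segments(samples: np.ndarray, sample_rate: int,
                        segment_seconds: float = 10.0) -> list:
    """Cut a 1-D sample array into equal-length segments, dropping the remainder."""
    segment_len = int(sample_rate * segment_seconds)
    num_segments = len(samples) // segment_len
    return [samples[i * segment_len:(i + 1) * segment_len]
            for i in range(num_segments)]


sample_rate = 16000  # hypothetical sample rate
samples = np.zeros(sample_rate * 25, dtype=np.int16)  # 25 s of silence

segments = split_into_segments(samples, sample_rate)
print(len(segments), len(segments[0]))  # 2 160000 (the trailing 5 s is dropped)

# A clip shorter than one segment yields nothing in this sketch.
short_clip = np.zeros(sample_rate * 3, dtype=np.int16)
print(len(split_into_segments(short_clip, sample_rate)))  # 0
```

With this approach, longer recordings become several 10 s examples and clips under 10 s are skipped; padding short clips instead would be a different design choice.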
@freemmaann @praveenssivam Sorry to bother you guys. My apologies!
What was your data size? I am doing this for English and Japanese. My English data size is 164 and my Japanese data size is 193, so num_english_sequence and num_japanese_sequence are always zero (0).
Because of this calculation: num_english_sequence = int(np.floor(len(english_mfcc)/sequence_length))
As sequence_length is 1000, num_english_sequence always ends up being zero.
As a result, list_english_mfcc is always empty, and I end up with no X_train, Y_train, etc.
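The arithmetic above can be checked with a small sketch. It assumes that 164 is the number of rows in `english_mfcc` (the thread does not say what unit the data size is in) and uses a made-up `num_mfcc_features` of 13. The helper `count_sequences` and the reshape at the end are illustrative, not code from the repository.

```python
import numpy as np

sequence_length = 1000
num_mfcc_features = 13  # hypothetical value


def count_sequences(mfcc: np.ndarray, sequence_length: int) -> int:
    """Same calculation as in the thread: whole sequences that fit in the rows."""
    return int(np.floor(len(mfcc) / sequence_length))


short = np.zeros((164, num_mfcc_features))         # fewer rows than one sequence
long_enough = np.zeros((2500, num_mfcc_features))  # enough rows for two sequences

print(count_sequences(short, sequence_length))        # 0
print(count_sequences(long_enough, sequence_length))  # 2

# With enough rows, the array can be cut into whole sequences.
n = count_sequences(long_enough, sequence_length)
sequences = long_enough[:n * sequence_length].reshape(
    n, sequence_length, num_mfcc_features)
print(sequences.shape)  # (2, 1000, 13)
```

So the stacked array needs at least `sequence_length` rows per language before any training sequence comes out; with fewer rows, either more audio or a smaller `sequence_length` would be needed.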
Hi @Arafat4341, did you get the dataset or get it working?