ML-KWS-for-MCU

about training samples.

Open ccnankai opened this issue 5 years ago • 7 comments

In a real application scenario, do the training samples need to include far-field, near-field, and differently oriented sounds? If the audio clips start out with different durations, do they need to be normalized?

ccnankai, Apr 19 '19 02:04

Hi @ccnankai, training samples can include far-field, near-field, and differently oriented sounds; the speech_commands_v0.02 dataset also contains sounds recorded at different orientations. In this tutorial they mix background noise and silence into the recordings, but they don't normalize the audio clips afterwards. Maybe you could try training both with and without normalization.

saichand07, Apr 23 '19 05:04
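
For anyone who wants to try the with/without normalization comparison suggested above, here is a minimal sketch in Python with NumPy. It is not taken from the tutorial or from this repository: the helper names (`fit_length`, `peak_normalize`, `mix_background`), the 16 kHz sample rate, the one-second target length, and the 0.1 noise volume are all assumptions chosen for illustration.

```python
import numpy as np

def fit_length(clip, target_len):
    """Trim or zero-pad a 1-D clip so it has exactly target_len samples."""
    clip = np.asarray(clip, dtype=np.float32)[:target_len]
    return np.pad(clip, (0, target_len - len(clip)))

def peak_normalize(clip, peak=1.0):
    """Scale a clip so its largest absolute sample equals `peak`."""
    clip = np.asarray(clip, dtype=np.float32)
    max_abs = float(np.max(np.abs(clip))) if clip.size else 0.0
    # Leave an all-zero (silent) clip untouched to avoid dividing by zero.
    return clip if max_abs == 0.0 else clip * (peak / max_abs)

def mix_background(clip, noise, volume):
    """Add scaled background noise and keep the result inside [-1, 1]."""
    return np.clip(clip + volume * noise, -1.0, 1.0)

# Hypothetical inputs: a quiet half-second tone and one second of noise.
sample_rate = 16000  # assumed sample rate
rng = np.random.default_rng(0)
short_clip = 0.1 * np.sin(2 * np.pi * 440 * np.arange(8000) / sample_rate)
noise = rng.uniform(-1.0, 1.0, sample_rate).astype(np.float32)

clip = fit_length(short_clip, sample_rate)            # same duration for every clip
normalized = peak_normalize(clip)                     # optional step to compare
noisy = mix_background(normalized, noise, volume=0.1)
```

Training once on clips that skip `peak_normalize` and once on clips that use it is one way to check whether normalization helps on a given dataset.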

@ccnankai how did you deal with negative bias_shift and out_shift? I calculated bias_shift and out_shift for every layer and put those numbers in the C code, but my deployed model produces really bad output. Do you have any idea what I need to tune or check?

saichand07, Apr 23 '19 05:04

Hi @saichand07, do you use your own data? I use my own data and the results are not good.

ccnankai, Apr 24 '19 02:04

@ccnankai How much data do you have? If it is very little, try making your model smaller and training that. I also used my own data, but my dataset is very small, so the results were poor.

saichand07, Apr 24 '19 06:04

@ccnankai how did you deal with negative bias_shift and out_shift? I calculated bias_shift and out_shift for every layer and put those numbers in the C code, but my deployed model produces really bad output. Do you have any idea what I need to tune or check?

@ccnankai can you answer this question?

saichand07, Apr 24 '19 06:04

@saichand07 I think your shift parameters are wrong. When they are roughly right, the result is still acceptable even if they are off by one or two bits.

ccnankai, Apr 24 '19 06:04

@ccnankai I calculated the shift parameters as explained in the tutorial, with act_max = [32 8 8 8 8 8 8 8 8 8 8 8].

For the first (input) layer:

input: Q5.2
weights: Q-2.9 (do we need to take -9 dec bits or 9?)
bias: Q-1.8
output: Q3.4 (do we need to take this from the last layer or the next layer?)

bias_shift (left) = 3, out_shift (right) = 7

I calculated the shifts the same way for all layers.

saichand07, Apr 24 '19 07:04
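
The arithmetic behind the numbers quoted above can be checked with a short Python sketch. It assumes 8-bit signed Qm.n formats with m + n = 7, and the usual fixed-point rule that a product of input and weight carries (input fractional bits + weight fractional bits) fractional bits, so the bias is shifted left and the output shifted right to line up with it. The function names are hypothetical and are not part of the repository or of CMSIS-NN.

```python
import math

def frac_bits(max_abs, total_bits=8):
    """Fractional (dec) bits of a signed fixed-point format that covers max_abs."""
    int_bits = math.ceil(math.log2(max_abs))
    return (total_bits - 1) - int_bits

def layer_shifts(in_frac, w_frac, b_frac, out_frac):
    """Return (bias_shift, out_shift) for one layer.

    The accumulator holds in_frac + w_frac fractional bits, so the bias is
    shifted left by the difference to its own format, and the result is
    shifted right by the difference to the output format.
    """
    acc_frac = in_frac + w_frac
    return acc_frac - b_frac, acc_frac - out_frac

# Values from the post: act_max[0] = 32 (input), act_max[1] = 8 (output).
act_max = [32, 8]
in_frac = frac_bits(act_max[0])    # Q5.2 -> 2 fractional bits
out_frac = frac_bits(act_max[1])   # Q3.4 -> 4 fractional bits
w_frac, b_frac = 9, 8              # Q-2.9 weights, Q-1.8 bias

bias_shift, out_shift = layer_shifts(in_frac, w_frac, b_frac, out_frac)
print(bias_shift, out_shift)  # 3 7

# A negative result means the bias or output format has more fractional
# bits than the accumulator, which is worth flagging before deployment.
if bias_shift < 0 or out_shift < 0:
    print("negative shift: revisit the chosen Q formats")
```

Under these assumptions, taking 9 (not -9) as the weight fractional bits and using the format of the layer's own output, which is the next layer's input, reproduces the 3 and 7 quoted above. To my knowledge the legacy CMSIS-NN q7 kernels take both shifts as unsigned integers, so a negative value cannot be passed as-is; treat that as something to verify against the CMSIS-NN headers rather than as a confirmed diagnosis.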