[SKIP CI] Tools: Concepts: Draft of MFCC computing

Open singalsu opened this issue 3 years ago • 19 comments

Compute Mel Frequency Cepstral Coefficients (MFCC) from a SOF audio stream. The purpose of this Matlab or Octave script is to draft C implementation of a SOF MFCC generator component.

Signed-off-by: Seppo Ingalsuo [email protected]

singalsu avatar May 04 '22 09:05 singalsu

@singalsu There have been some significant changes to main (codec name). Please rebase your branch before checking on CI again.

wszypelt avatar May 04 '22 16:05 wszypelt

Some new library functions are needed:

  • linspace
  • window functions, TBD (the simplest need only trigonometric functions)
  • DCT (type to be decided)

We can use the existing FFT.
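
For illustration, here is a minimal sketch of two of the helpers listed above, written in plain Python rather than the Matlab/Octave or C of this PR. The function names are hypothetical, and the choice of an orthonormal DCT-II is an assumption: the thread leaves the DCT type open, and DCT-II is only the variant commonly used for MFCC.

```python
import math


def linspace(start, stop, num):
    """Return num evenly spaced values from start to stop, both inclusive."""
    if num < 2:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def dct_ii_matrix(num_ceps, num_in):
    """Build an orthonormal DCT-II matrix with num_ceps rows, num_in columns.

    The DCT type is an assumption; the discussion has not fixed it yet.
    """
    matrix = []
    for k in range(num_ceps):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / num_in)
        matrix.append([scale * math.cos(math.pi * k * (n + 0.5) / num_in)
                       for n in range(num_in)])
    return matrix


def apply_matrix(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]
```

A precomputed matrix like this trades memory for simplicity, which may or may not suit the final component; `apply_matrix(dct_ii_matrix(13, 23), log_mel)` would turn 23 log-Mel values into 13 cepstral coefficients under these assumptions.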

singalsu avatar May 05 '22 07:05 singalsu

SOFCI TEST Reason: Re-running builds because there was a timeout while running xtensa-build-zephyr.py.

greg-intel avatar May 12 '22 17:05 greg-intel

The MFCC output from the Matlab/Octave version now matches PyTorch fairly well, except in the higher-frequency part of the sine sweep. I'm still investigating where the difference arises:

Screenshot from 2022-05-17 16-56-26

singalsu avatar May 17 '22 14:05 singalsu

The purpose of this Matlab or Octave script is to draft C implementation of a SOF MFCC generator component.

Can you elaborate on what you mean by "draft"? Run that Octave code for a number of sample inputs and look at the outputs, maybe?

This reference algorithm may change a bit while implementing the SOF component if there is a better way to do something than in the current version. We haven't run any neural network with this data yet. If speech metrics differ from the reference, then some small differences seen now may be significant and will need to be addressed.

We will also add functions for extracting intermediate test vectors. Currently only the input and output are available in files.

In the future, would it be possible to run both the octave code and future C code and compare outputs as a unit test?

I'm not sure what would be a suitable location for storing Matlab concepts. I used tools here, but it could be elsewhere.

tests directory?

Yep, that would make sense. Once the C component is made, this would remain as the reference that the C code is checked against.

singalsu avatar May 19 '22 15:05 singalsu

I just pushed a new version that achieves good compatibility with Kaldi and Matlab, and fair compatibility with librosa. The delta-MFCC plots for a chirp test signal are below. Librosa needs a new centered zero-pad option. Also, the STFT phase is different than in Kaldi.

Screenshot from 2022-05-19 18-07-50

singalsu avatar May 19 '22 15:05 singalsu

no indent.

Yes, I've written this with both Emacs and Matlab, and they use different indent styles. I've already changed to tabs instead of the default small 2-character indent. I'll fix the .m files to look more like SOF C code so they will be more familiar.

singalsu avatar May 20 '22 08:05 singalsu

@singalsu how do we use the Matlab/Octave version and compare it against the C/HiFi implementation using the testbench?

lgirdwood avatar May 25 '22 12:05 lgirdwood

@singalsu how do we use the Matlab/Octave version and compare it against the C/HiFi implementation using the testbench?

Once the fixed-point fractional Q-formats are established, I will add test vector output to the Matlab code. The testbench would then output the same intermediate data, likely via traces, and the results can be compared.

singalsu avatar May 25 '22 15:05 singalsu

@singalsu how do we use the Matlab/Octave version and compare it against the C/HiFi implementation using the testbench?

Once the fixed-point fractional Q-formats are established, I will add test vector output to the Matlab code. The testbench would then output the same intermediate data, likely via traces, and the results can be compared.

Automatically? I.e., will the unit test script invoke Matlab and the testbench and compare the results? This would then be easy to add to CI.

lgirdwood avatar May 25 '22 19:05 lgirdwood

Once the fixed-point fractional Q-formats are established, I will add test vector output to the Matlab code. The testbench would then output the same intermediate data, likely via traces, and the results can be compared.

Automatically? I.e., will the unit test script invoke Matlab and the testbench and compare the results? This would then be easy to add to CI.

That would be a good target. The built-in data files are a burden to maintain. As long as the reference runs in Octave, it can be done. We don't have Matlab licenses for the CI computers.
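
As a rough illustration of such a comparison, the sketch below quantizes floating-point reference values to a fixed-point format and reports the worst-case difference in LSBs against fixed-point output. The Q-format is an assumption (the thread says the formats are not yet established), a 16-bit word with 15 fractional bits is only an example, and all function names are hypothetical.

```python
def float_to_q(value, frac_bits, word_bits=16):
    """Quantize a float to a signed fixed-point integer with saturation."""
    scaled = int(round(value * (1 << frac_bits)))
    q_max = (1 << (word_bits - 1)) - 1
    q_min = -(1 << (word_bits - 1))
    return max(q_min, min(q_max, scaled))


def q_to_float(q_value, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return q_value / (1 << frac_bits)


def max_abs_error_lsb(reference, fixed, frac_bits, word_bits=16):
    """Worst-case difference, in LSBs, between float reference values
    and fixed-point values produced by the implementation under test."""
    return max(abs(float_to_q(r, frac_bits, word_bits) - f)
               for r, f in zip(reference, fixed))
```

A CI script could then fail when `max_abs_error_lsb(reference, fixed, 15)` exceeds an agreed tolerance; the tolerance itself would have to come from how much error the downstream speech metrics can accept.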

singalsu avatar May 30 '22 15:05 singalsu

The draft I just pushed contains the start of the src/audio/mfcc component. It can be run in the testbench with the test topology. It segments the input data in the component's copy() for the STFT, with three window functions implemented initially (rectangular, Blackman, Povey). Next I will work on the Mel spectrum conversion.
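
The segmentation and windowing step can be sketched as follows. This is a plain Python illustration, not the component code: the function names are hypothetical, the Blackman coefficient 0.42 and the Povey exponent 0.85 follow the definitions used in Kaldi, and dropping any incomplete trailing frame is an assumption.

```python
import math


def make_window(name, length, blackman_coef=0.42):
    """Return window coefficients for 'rectangular', 'blackman' or 'povey'."""
    a = 2.0 * math.pi / (length - 1)
    if name == "rectangular":
        return [1.0] * length
    if name == "blackman":
        return [blackman_coef - 0.5 * math.cos(a * n)
                + (0.5 - blackman_coef) * math.cos(2.0 * a * n)
                for n in range(length)]
    if name == "povey":
        # A Hann window raised to the power 0.85, as defined in Kaldi.
        return [max(0.0, 0.5 - 0.5 * math.cos(a * n)) ** 0.85
                for n in range(length)]
    raise ValueError("unknown window: " + name)


def split_frames(samples, frame_length, frame_shift):
    """Cut samples into overlapping frames; an incomplete tail is dropped."""
    frames = []
    start = 0
    while start + frame_length <= len(samples):
        frames.append(samples[start:start + frame_length])
        start += frame_shift
    return frames


def windowed_frames(samples, frame_length, frame_shift, window_name):
    """Apply the chosen window to every frame before the FFT."""
    window = make_window(window_name, frame_length)
    return [[s * w for s, w in zip(frame, window)]
            for frame in split_frames(samples, frame_length, frame_shift)]
```

In a streaming component the framing would work on a circular buffer across copy() calls rather than on a whole list, so this only shows the arithmetic.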

singalsu avatar Jun 03 '22 15:06 singalsu

What's the plan for making this work with topology2?

lgirdwood avatar Jun 06 '22 15:06 lgirdwood

What's the plan for making this work with topology2?

That would go to early Q3. Also, I might be able to convert this to the module adapter before vacation.

singalsu avatar Jun 15 '22 18:06 singalsu

I just pushed a version with a lot of component C code added. In the testbench it computes correct-looking Mel spectrograms. This version is still missing the DCT for calculating the cepstral coefficients. I will work on that next.
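
For readers unfamiliar with the Mel spectrum step, here is a minimal Python sketch of a triangular Mel filterbank applied to an FFT power spectrum. It is not the component code: the names are hypothetical, the Mel formula 1127·ln(1 + f/700) and triangles that are linear on the Mel axis are assumptions borrowed from the common Kaldi-style definition, and the log floor value is arbitrary.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to Mel using the 1127 * ln(1 + f / 700) definition."""
    return 1127.0 * math.log(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (math.exp(mel / 1127.0) - 1.0)


def mel_filterbank(num_bins, fft_size, sample_rate, low_hz, high_hz):
    """Return num_bins rows of triangular weights over fft_size // 2 + 1 bins."""
    mel_low = hz_to_mel(low_hz)
    step = (hz_to_mel(high_hz) - mel_low) / (num_bins + 1)
    bank = []
    for b in range(num_bins):
        left = mel_low + b * step
        center = left + step
        right = center + step
        row = []
        for k in range(fft_size // 2 + 1):
            mel = hz_to_mel(k * sample_rate / fft_size)
            if left < mel <= center:
                row.append((mel - left) / (center - left))
            elif center < mel < right:
                row.append((right - mel) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel_energies(power_spectrum, bank, floor=1e-10):
    """Weight the power spectrum with each triangle and take the log."""
    return [math.log(max(floor, sum(w * p for w, p in zip(row, power_spectrum))))
            for row in bank]
```

The cepstral coefficients would then follow from a DCT over the output of `log_mel_energies`, which is the step this version of the PR does not yet include.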

singalsu avatar Jun 15 '22 18:06 singalsu

I'm afraid you just triggered my copy/paste/diverge detector... can you add a couple variables and reduce duplication between test_mfcc_kaldi.py and test_mfcc_librosa.py?

Yes, they could be merged. The final form will depend on how I will do the unit and testbench tests for MFCC.

Also wondering whether this PR could be "divided and conquered" into several PRs? It's big...

Agreed! I have so far developed the Matlab and C parts at the same time, so the same git development branch has worked best for me. But I'd expect the Matlab part to stabilize quite soon so it can be separated.

singalsu avatar Jun 28 '22 12:06 singalsu

@marc-hb I've now split this work into two (or more) PRs. This remains the concept and reference code.

I wonder if this location /tools/concepts/<some_new_comp> is good. Since parts of this would be used for unit tests, the location could also be something like /test/reference/audio/mfcc. The cmocka unit tests could call these functions from Octave to generate reference data, which would avoid writing new (and error-prone) floating-point C versions of them. Any thoughts about this?

singalsu avatar Jun 29 '22 15:06 singalsu

so the same git development branch has worked for me best. But I'd expect the Matlab part to stabilize quite soon so it can be separated.

FWIW I submit multiple PRs from the same branch all the time. Example with two commits, one branch and two PRs.

git push myfork HEAD~1:refs/heads/newPR1 # or the equivalent from your editor's git plugin
git rebase -i # rotate commits
git push myfork HEAD~1:refs/heads/newPR2

Of course the risk is not testing commits in isolation but:

  • temporary git revert is your friend
  • you're supposed to know what you're doing
  • that's what CI is for :-)

Of course you need to use a very good git client to make that efficient, the git command line is too slow. I use magit.

I wonder if this location /tools/concepts/<some_new_comp> is good.

I cannot help here, sorry. I mostly stopped caring about directories; I only use search/find and "recent files". I generally have no clue in which directories are the files I'm working on.

Many projects have utterly meaningless top-level directories like scripts/, tools/ and utils/ which demonstrates this is a lost cause. Like Yahoo was :-)

Since parts of this would be used for unit tests the location could be also e.g. something like /test/reference/audio/mfcc.

Another fun fact: you'd expect low-level CMocka unit tests to be located close to the code they're testing (as opposed to higher-level tests). They're all isolated in a different directory. Go figure.

marc-hb avatar Jun 29 '22 22:06 marc-hb

Thanks for the tips and thoughts, @marc-hb!

singalsu avatar Jul 25 '22 14:07 singalsu

Can one of the admins verify this patch?

gkbldcig avatar Jan 17 '23 23:01 gkbldcig

@singalsu @andrula-song ping ?

lgirdwood avatar Jan 18 '23 14:01 lgirdwood