[discussion] what to do with diversity of input formats?

Open alecristia opened this issue 7 years ago • 5 comments

... to be continued...

alecristia avatar Sep 18 '18 13:09 alecristia

Use the sox tool in the VM to convert everything to WAV? Use other scripts in tools/ or toolbox/ or varia/ to convert to RTTM? Just joining the discussion :)
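A rough sketch of what the sox suggestion could look like, not a tested DiViMe script. The helper name, the file paths, and the 16 kHz mono target are assumptions chosen for illustration; `-r` (sample rate) and `-c` (channel count) are standard sox options that apply to the file named after them. Whether the sox build in the VM can read a given compressed format depends on how it was compiled.

```python
from pathlib import Path


def sox_convert_command(src, dst_dir, rate=16000, channels=1):
    """Build (but do not run) a sox command converting `src` to WAV.

    `rate` and `channels` are illustrative defaults, not values the
    tools are known to require.
    """
    src = Path(src)
    # The output keeps the input's base name and gets a .wav extension.
    dst = Path(dst_dir) / (src.stem + ".wav")
    # Options placed before the output file describe the output format.
    return ["sox", str(src), "-r", str(rate), "-c", str(channels), str(dst)]


# Hypothetical paths, for illustration only.
cmd = sox_convert_command("data/rec01.mp3", "data/converted")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the target directory exists.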

riebling avatar Sep 18 '18 13:09 riebling

One question we had: we know that some of the tools (looking at you, noisemes) give different results depending on whether the input has 1 or 2 channels. Wouldn't it be important to benchmark all the tools (except tocombosad, which doesn't work on 2 channels) on both 1- and 2-channel WAVs (we can use the CAS dataset, which has stereo WAVs), so that we know which gives the better results? If it turns out, for example, that mono WAVs systematically give better results, we can force the conversion when the tool is called (or even before). Maybe some of the tools also force the conversion internally...

The test could be done on several parameters of the input audio:

  • number of channels (mono vs stereo - or more..)
  • sample rate (16k vs 44k)
  • encoding (mp3 vs wav vs other ...)
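To make the size of such a benchmark concrete, here is a small sketch that enumerates the conditions. The function name and the `skip` argument are made up for illustration, "44k" is assumed to mean 44100 Hz, and only the two tools named in this thread are listed.

```python
from itertools import product

# Values taken from the list above; 44100 Hz is an assumed reading of "44k".
CHANNELS = [1, 2]
SAMPLE_RATES = [16000, 44100]
ENCODINGS = ["wav", "mp3"]


def benchmark_conditions(tools, skip=None):
    """Yield one dict per (tool, channels, sample rate, encoding) combination.

    `skip` is a set of (tool, channels) pairs to leave out, e.g. for a
    tool that cannot process stereo input.
    """
    skip = skip or set()
    for tool, ch, sr, enc in product(tools, CHANNELS, SAMPLE_RATES, ENCODINGS):
        if (tool, ch) in skip:
            continue
        yield {"tool": tool, "channels": ch, "sample_rate": sr, "encoding": enc}


conditions = list(
    benchmark_conditions(["noisemes", "tocombosad"], skip={("tocombosad", 2)})
)
print(len(conditions), "conditions to run")
```

Each extra value for a parameter multiplies the number of runs per tool, so it may be worth fixing two of the three parameters while varying the third.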

jukaradayi avatar Sep 19 '18 10:09 jukaradayi

To clarify, here's the todo list:

  • [ ] Check each tool: is it forcing a conversion unbeknownst to us? If so, what is the preferred input?
  • [ ] Check each tool: does it give different results for the 3 variables Julien mentioned (# channels, sample rate, encoding)?
  • [ ] If answer to the above is: no conversion, no difference, add layer that converts all input to the preferred input for a given tool
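As a sketch of the check a conversion layer would need before deciding whether to convert, the snippet below reads the channel count and sample rate of a WAV file with Python's standard `wave` module and compares them to a per-tool preference. The `PREFERRED` table and the tool name in it are placeholders; the real values would come from the checks in the list above. It only handles WAV input, so other encodings would need a different probe.

```python
import os
import tempfile
import wave

# Placeholder preferences, NOT measured values for any real tool.
PREFERRED = {"example_tool": {"channels": 1, "sample_rate": 16000}}


def wav_properties(path):
    """Return the channel count and sample rate of a WAV file."""
    with wave.open(str(path), "rb") as w:
        return {"channels": w.getnchannels(), "sample_rate": w.getframerate()}


def needs_conversion(path, tool):
    """True if the file does not already match the tool's preferred input."""
    return wav_properties(path) != PREFERRED[tool]


def write_silence(path, channels, sample_rate, seconds=1):
    """Write a silent 16-bit WAV file, used here only as demo input."""
    with wave.open(str(path), "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * channels * sample_rate * seconds)


demo_dir = tempfile.mkdtemp()
stereo_path = os.path.join(demo_dir, "stereo_44k.wav")
mono_path = os.path.join(demo_dir, "mono_16k.wav")
write_silence(stereo_path, channels=2, sample_rate=44100)
write_silence(mono_path, channels=1, sample_rate=16000)

print(needs_conversion(stereo_path, "example_tool"))
print(needs_conversion(mono_path, "example_tool"))
```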

alecristia avatar Sep 25 '18 08:09 alecristia

This looks like a good task for the CMU student team, given the existing task to survey all the tools for other parameters: processing time, limits on input recording duration, memory consumption. That survey task may not be in the form of a GitHub Issue yet; it was just something we agreed to work on by email.

riebling avatar Sep 25 '18 20:09 riebling

Sounds good! However, I think this task is not a priority. It's just something we want to bear in mind for the future (i.e. an improvement), since our current user base uses standardized sound formats. So let's hold off on assigning anyone until we have completed the other project.

alecristia avatar Sep 26 '18 06:09 alecristia