echoprint-codegen

2 audio files of same song in different formats recognized differently

Open sreejithr opened this issue 9 years ago • 4 comments

These 2 files are the same song (Michael Jackson - Billie Jean). The 1st one is what was on my iPod; the 2nd one I downloaded from YouTube and converted to mp3. I've attached both files.

File1: MPEG-1, Layer 3, 128 kbps

File2: "mp4a.40.2" (AAC-LC) converted to MPEG-1, Layer 3, 160 kbps (using ffmpeg)

I just can't figure out why the 1st file yields the song name correctly while the 2nd one returns nothing. The codes generated for the two files seem to be different.
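To put a number on "seem to be different", one option is to measure how many hash codes the two fingerprints share. This is only a rough sketch: it assumes the code strings have already been decoded into plain lists of integer hashes, and both `code_overlap` and the sample values are hypothetical names and data for illustration, not part of echoprint-codegen.

```python
def code_overlap(codes_a, codes_b):
    """Return the Jaccard similarity (0.0 to 1.0) of two collections of hash codes.

    Time offsets are ignored here; only the set of distinct hash values is compared.
    """
    set_a, set_b = set(codes_a), set(codes_b)
    if not set_a and not set_b:
        # Two empty fingerprints are trivially identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Hypothetical decoded hash values, made up for illustration only.
    file1_codes = [101, 202, 303, 404]
    file2_codes = [101, 202, 909, 808]
    print("shared fraction:", code_overlap(file1_codes, file2_codes))
```

A value near 0 would suggest the two files produce almost entirely different hashes, while a moderate value would point toward a matching-threshold issue rather than a completely different fingerprint.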

[1] File1 - https://www.dropbox.com/s/93p427ingdvfm7w/File1.mp3?dl=0

File2 - https://www.dropbox.com/s/ueb1ixy7ajj1w15/File2.mp3?dl=0

sreejithr, May 28 '15 06:05

I even tried looking at the differences between these 2 files in Audacity. They look very alike. (Screenshot: audacity_compare)

sreejithr, May 28 '15 06:05

I read the paper on echoprint codes. Didn't help.

sreejithr, May 28 '15 06:05

The expected outcome was for both of these to match the song correctly to Michael Jackson?

They are visually not the same in the screenshot you provided. They look similar at this granularity, but you can see the differences even at the provided resolution.

Best guess at this point is that the conversion of the YouTube version to mp3 has left artifacts that make it difficult for codegen to handle.

Have you tried other conversion processes with the same results?
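One way to try this systematically is to re-encode the downloaded source at a few different settings and run codegen on each result. The sketch below only builds the ffmpeg command lines and does not run them. It assumes an ffmpeg build that includes the libmp3lame encoder, and the file names and the helper `ffmpeg_mp3_command` are hypothetical.

```python
def ffmpeg_mp3_command(src, dst, bitrate_kbps=128, sample_rate=44100):
    """Build an ffmpeg argument list that re-encodes `src` to MP3 at the given settings."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", src,                 # input file
        "-vn",                     # drop any video stream
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-codec:a", "libmp3lame",  # MP3 encoder (assumes ffmpeg was built with it)
        "-b:a", f"{bitrate_kbps}k",  # audio bitrate
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names; try several bitrates for the same source.
    for kbps in (128, 160, 320):
        cmd = ffmpeg_mp3_command("youtube_source.m4a", f"file2_{kbps}k.mp3", kbps)
        print(" ".join(cmd))
        # To actually run it: subprocess.run(cmd, check=True)
```

If every variant fails to match while the iPod file succeeds, that would suggest the difference lies in the source audio rather than in the MP3 encoding step.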

echoajohnson, May 28 '15 12:05

@echoajohnson I've tried a couple different conversions with the same result. Is there more detailed documentation on the process echoprint is doing?

If echoprint is capable of detecting songs sung by a human through a microphone, could it really be thrown off by such a slight difference caused by artifacts? It should be more resilient, right?

sreejithr, May 30 '15 15:05