audio-compare

Chromaprint + fpcalc + python + statistics = compare audio files and determine similarity

Simple tool to compare audio files

NOTE: I did not write this; I merely found it on the internet and ported it to Python 3.

  • https://shivama205.medium.com/audio-signals-comparison-23e431ed2207 "Audio signals: Comparison"
  • https://gist.github.com/shivama205/5578f999a9c88112f5d042ebb83e54d5 scripts from the article

Related projects:

  • https://acoustid.org/chromaprint fpcalc
  • numpy

Usage:

The sample files were captured from a streaming source, so they have no exact start or duration, but they are the same song:

$ ./compare.py -i file1.mp3 -o file2.mp3
Calculating fingerprint by fpcalc for file1.mp3
Calculating fingerprint by fpcalc for file2.mp3
File A: file1.mp3
File B: file2.mp3
Match with correlation of 63.74% at offset 55

$ ./compare.py -i file2.mp3 -o file1.mp3
Calculating fingerprint by fpcalc for file2.mp3
Calculating fingerprint by fpcalc for file1.mp3
File A: file2.mp3
File B: file1.mp3
Match with correlation of 63.74% at offset -5

For some files, swapping the order may not produce the same result, because of the offset or the way the fpcalc fingerprint is generated (see help).

$ ./compare.py -i file2.mp3 -o file3.mp3
Calculating fingerprint by fpcalc for file2.mp3
Calculating fingerprint by fpcalc for file3.mp3
File A: file2.mp3
File B: file3.mp3
Match with correlation of 93.01% at offset -24

$ ./compare.py -i file1.mp3 -o file3.mp3
Calculating fingerprint by fpcalc for file1.mp3
Calculating fingerprint by fpcalc for file3.mp3
File A: file1.mp3
File B: file3.mp3
Match with correlation of 63.96% at offset 31
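
To make the "Match with correlation of ... at offset ..." lines easier to read, here is a minimal sketch of how such a score can be computed. It is not the code from compare.py: it assumes the fingerprints are lists of unsigned 32-bit integers, and the function names and the span parameter are hypothetical. The idea is to count matching bits between the two fingerprints and to repeat that for a range of relative shifts, keeping the best one.

```python
def bit_similarity(a, b):
    """Fraction of identical bits between two lists of 32-bit integers.

    Only the overlapping prefix of the two lists is compared.
    """
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("empty fingerprint")
    # XOR leaves a 1 for every differing bit, so 32 - popcount = matching bits.
    matching = sum(32 - bin(x ^ y).count("1") for x, y in zip(a[:n], b[:n]))
    return matching / (32.0 * n)


def similarity_at_offset(a, b, offset):
    """Shift one fingerprint against the other by `offset` items and compare."""
    if offset > 0:
        a = a[offset:]      # skip the first `offset` items of a
    elif offset < 0:
        b = b[-offset:]     # skip the first `-offset` items of b
    return bit_similarity(a, b)


def best_match(a, b, span=150):
    """Scan offsets in [-span, span]; return (best_offset, best_similarity)."""
    # Keep at least one overlapping item at every offset.
    span = min(span, min(len(a), len(b)) - 1)
    scores = {off: similarity_at_offset(a, b, off)
              for off in range(-span, span + 1)}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Under these assumptions, a similarity of 0.6374 at the best offset would be reported as 63.74%. Two unrelated fingerprints still agree on roughly half of their bits by chance, so values near 50% should not be read as a match.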

Internally, the fingerprint is generated by fpcalc -length 500; cached versions can be produced by fpcalc-gen.
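
As an illustration of the fpcalc step, the sketch below runs fpcalc from Python and parses its output. It assumes fpcalc is on the PATH and adds the -raw flag so the fingerprint comes back as comma-separated integers; that flag and both function names are my assumptions, not taken from compare.py.

```python
import subprocess


def parse_fpcalc_output(text):
    """Extract the fingerprint integers from fpcalc's KEY=VALUE output."""
    for line in text.splitlines():
        if line.startswith("FINGERPRINT="):
            values = line[len("FINGERPRINT="):]
            return [int(v) for v in values.split(",") if v]
    raise ValueError("no FINGERPRINT line in fpcalc output")


def calculate_fingerprint(path, length=500):
    """Run `fpcalc -raw -length <length> <path>` and return a list of ints."""
    result = subprocess.run(
        ["fpcalc", "-raw", "-length", str(length), path],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode the output as str
        check=True,           # raise CalledProcessError if fpcalc fails
    )
    return parse_fpcalc_output(result.stdout)
```

Keeping the parser separate from the subprocess call means it can be exercised on saved fpcalc output without the binary installed.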

Changes:

  • port to Python 3
  • print the similarity as a percentage
  • print input files on separate lines
  • support precalculated fingerprint in file.mp3.fpcalc
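
The precalculated-fingerprint lookup could work roughly as sketched below. Only the file.mp3.fpcalc naming comes from the list above; the cache format (comma-separated integers on one line), the function name, and the compute callback are assumptions made for illustration, so check the real files before relying on this.

```python
import os


def load_fingerprint(audio_path, compute):
    """Return the fingerprint for audio_path, preferring a cached copy.

    `compute` is a caller-supplied function that maps a path to a list of
    integers, for example a wrapper around fpcalc.
    """
    cache_path = audio_path + ".fpcalc"  # e.g. file.mp3 -> file.mp3.fpcalc
    if os.path.exists(cache_path):
        with open(cache_path) as fh:
            # Assumed cache format: comma-separated integers on one line.
            return [int(v) for v in fh.read().strip().split(",") if v]
    # No cache next to the audio file: fall back to computing it.
    return compute(audio_path)
```

Passing the compute function in as a parameter keeps the cache logic independent of how the fingerprint is actually produced.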