Panako icon indicating copy to clipboard operation
Panako copied to clipboard

The Panako acoustic fingerprinting system.

!https://github.com/JorenSix/Panako/actions/workflows/gradle.yml/badge.svg(Panako build status)!

h1. Panako - Acoustic Fingerprinting

Panako is an "acoustic fingerprinting":https://en.wikipedia.org/wiki/Acoustic_fingerprint system. The system is able to extract fingerprints from an audio stream, and either store those fingerprints in a database, or find a match between the extracted fingerprints and stored fingerprints. Several acoustic fingerprinting algorithms are implemented within Panako. The main algorithm, the Panako algorithm, has the feature that audio queries can be identified reliably and quickly even if they has been sped up, time stretched or pitch shifted with respect to the reference audio. The main focus of Panako is to serve as a demonstration of the Panako algorithm. Other acoustic fingerprinting schemes are implemented to serve as a baseline for comparison. More information can be found in the "article about Panako":http://www.terasoft.com.tw/conf/ismir2014/proceedings/T048_122_Paper.pdf.

Please be aware of the patents "US7627477 B2":https://www.google.com/patents/US7627477 and "US6990453":https://www.google.com/patents/US6990453 and "perhaps others":http://www.shazam.com/music/web/productfeatures.html?id=1284. They describe techniques used in some algorithms implemented within Panako. The patents limit the use of some algorithms under various conditions and for several regions. Please make sure to consult your intellectual property rights specialist if you are in doubt about these restrictions. If these restrictions apply, respect the patent holders rights. The first aim of Panako is to serve as a research platform to experiment with and improve fingerprinting algorithms.

This document covers installation, usage and configuration of Panako.

The Panako source code is licensed under the "GNU Affero General Public License":https://www.gnu.org/licenses/agpl-3.0.html.

h2. Overview

"Why Panako?":#why

"Getting Started":#getting_started

"Usage":#usage

"Store Fingerprints":#store

"Query for Matches":#query

"Monitor Stream for Matches":#monitor

"Print Storage Statistics":#stats

"Print Configuration":#configuration

"Panako test run and output":#test

"Further reading":#read

"Panako and docker":#docker

"Evaluate Panako":#eval

"Contribute to Panako":#contribute

"Credits":#credits

"Changelog":#changelog

h2(#why). Why Panako?

Content based music search algorithms make it possible to identify a small audio fragment in a digital music archive with potentially millions of songs. Current search algorithms are able to respond quickly and reliably on an audio query, even if there is noise or other distortions present. During the last decades they have been used successfully as digital music archive management tools, music identification services for smartphones or for digital rights management.

!./resources/media/general_acoustic_fingerprinting_schema.svg(General content based audio search scheme)! Fig. General content based audio search scheme.

Most algorithms, as they are described in the literature, do not allow substantial changes in replay speed. The speed of the audio query needs to be the same as the reference audio for the current algorithms to work. This poses a problem, since changes in replay speed do occur commonly, they are either introduced by accident during an analog to digital conversion or are introduced deliberately.

Analogue physical media such as wax cylinders, wire recordings, magnetic tapes and grammophone records can be digitized at an incorrect or varying playback speed. Even when calibrated mechanical devices are used in a digitization process, the media could already have been recorded at an undesirable speed. To identify duplicates in a digitized archive, a music search algorithm should compensate for changes in replay speed

Next to accidental speed changes, deliberate speed manipulations are sometimes introduced during radio broadcasts: occasionally songs are played a bit faster to fit into a timeslot. During a DJ-set speed changes are almost always present. To correctly identify audio in these cases as well, a music search algorithm robust against pitch shifting, time stretching and speed changes is desired.

The Panako algorithm allows such changes while maintaining other desired features as scalability, robustness and reliability. Next to the panako algorithm there is also an implementation of the algorithm described in "An Industrial-Strength Audio Search Algorithm":http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf (internally identified as Olaf). Also the algorithm in "A Robust Audio Fingerprinter Based on Pitch Class Histograms - Applications for Ethnic Music Archives":http://0110.be/files/attachments/415/2012.01.20.fingerprinter.submitted.pdf is available. To make comparisons between fingerprinting systems easy, researchers are kindly invited to contribute algorithms to the Panako project.

Alternative open source music identification systems are "audfprint Matlab":http://www.ee.columbia.edu/~dpwe/resources/matlab/audfprint/, "audfprint Python":https://github.com/dpwe/audfprint, "NeuralFP":https://github.com/chymaera96/neuralfp and "echoprint":http://echoprint.me/. Alternative systems with similar features are described in "US7627477 B2":https://www.google.com/patents/US7627477 and in "Quad-based Audio Fingerprinting Robust To Time And Frequency Scaling (2014)":http://www.dafx14.fau.de/papers/dafx14_reinhard_sonnleitner_quad_based_audio_fingerpr.pdf by Reinhard Sonnleitner and Gerhard Widmer.

h2(#getting_started). Getting started

The Panako repository is organized as follows:

  • @src@ contains the Java source files.
  • @lib@ contains some dependencies (JAR-files).
  • @resources@ default configuration settings, script for evaluation and to analyze results and schemas to document Panako.

To compile Panako, a Java JDK 11 or later is required. See, for example the "OpenJDK website":https://openjdk.org/install/", for installation instructions for your operating system.

Panako uses the "Gradle":https://gradle.org/ build system. If Gradle is not present on your system it is automatically installed by the @gradlew@ script. To build and install Panako, the following should get you started:

bc. git clone --depth 1 https://github.com/JorenSix/Panako.git cd Panako ./gradlew shadowJar #Builds the core Panako library ./gradlew install #Installs Panako in the ~/.panako directory ./gradlew javadoc #Creates the documentation in build/doc #copy the panako startup script to your path: sudo cp resources/defaults/panako /usr/local/bin

The last command copies the startup script in @resources/defaults/panako@ to a directory in your path. The script allows for easy access to Panako's functionality. If this does not work for you, you can still call Panako using @java -jar ~/.panako/panako.jar [..args]@.

Panako decodes audio using by calling @ffmpeg@, an utility for decoding and resampling audio. The @ffmpeg@ command needs to be on the systems path. On a Debian like system:

bc. apt-get install ffmpeg

At this point it might be of interest to "run the test":#test to see if Panako is working correctly.

The current release does not support Apple M1 out of the box. Additional steps are needed to provide the Java lmdb bidge. This blogpost details "how to get java lmdb working on Apple M1":https://0110.be/posts/Using_Java_LMDB_on_Apple_Sillicon_or_other_unsupported_platforms.

h2(#usage). Panako Usage

Panako provides a command line interface, it can be called using @panako subapplication [argument...]@. For each task Panako has to perform, there is a subapplication. There are subapplications to store fingerprints, query audio fragments, monitor audio streams, and so on. Each subapplication has its own list of arguments, options, and output format that define the behavior of the subapplication. The screen capture below shows a typical interaction with Panako.

!./resources/media/panako_interactive_session.svg(A typical interaction with Panako via the command line interface)! Fig. A typical interaction with Panako via the command line interface. It shows how to store audio, perform a query and how to print database statistics.

To save some keystrokes the name of the subapplication can be shortened using a unique prefix. For example @panako m file.mp3@ is expanded to @panako monitor file.mp3@. Since both @stats@ and @store@ are valid subapplications the @store@ call can be shortened to @panako sto *.mp3@, @panako s *.mp3@ gives an invalid application message. A "trie":https://en.wikipedia.org/wiki/Trie is used to find a unique prefix.

What follows is a list of those subapplications, their arguments, and respective goal.

h3(#store). Store Fingerprints - @panako store@

The @store@ instruction extracts fingerprints from audio tracks and stores those in the datastorage. The command expects a list of audio files, video files or a text file with a list of file names.

bc.. #Audio is converted automatically panako store audio.mp3 audio.ogg audio.flac audio.wav

#The first audio stream of video containers formats are used. panako store audio.mpc audio.mp4 audio.acc audio.avi audio.ogv audio.wma

#Glob characters can be practical panako store /.mp3

#A text file with audio files can be used as well #The following searches for all mp3's (recursively) and #stores them in a text file find . -name '*.mp3' > list.txt #Iterate the list panako store list.txt

h3(#delete). Remove fingerprints - @panako delete@

This application removes fingerprints from the index. It essentially reverses the @store@ operation. The operation can be checked with @panako stats@

@panako delete test.mp3@

The default key-value-store is backed by some kind of B-tree structure. Removing many elements from such structure might leave the tree in an unbalanced state, which results in worse performance. I am not sure about the performance implications of deletion for LMDB but it might be of interest to either rebuild the index or avoid deletion as much as possible.

h3(#query). Query for Matches - @panako query@

The @query@ command extracts fingerprints from an audio frament and tries to match the fingerprints with the database.

bc. panako query short_audio.mp3

h3(#monitor). Monitor Stream for Matches - @panako monitor@

The @query@ command extracts fingerprints from a short audio frament and tries to match the fingerprints with the database. The incoming audio is, by default, chopped in parts of 25s with an overlap of 5s. So every @25-5=20s@ the database is queried and a result is printed to the command line.

bc. panako monitor [short_audio.mp3]

If no audio file is given, the default microphone is used as input.

h3(#stats). Print Storage Statistics - @panako stats@

The @stats@ command prints statistics about the stored fingerprints and the number of audio fragments. If an integer argument is given it keeps printing the stats every x seconds.

bc. panako stats # print stats once panako stats 20 # print stats every 20s

h3(#configuration). Print Configuration - @panako config@

The @config@ subapplication prints the configuration currently in use.

bc. panako config

To override configuration values there are two options. The first option is to create a configuration file, by default at the following location: @~./panako/config.properties@. The configuration file is a properties text file. An commented configuration file should be included in the doc directory at @doc/config.properties@.

The second option to override configuration values is by adding them to the arguments of the command line call as follows:

bc. panako subapplication CONFIG_KEY=value

For example, if you do not want to check for duplicate files while building a fingerprint database the following can be done:

bc. panako store file.mp3 CHECK_FOR_DUPLICATES=FALSE

The configuration values provided as a command line argument have priority over the ones in the configuration file. If there is no value configured a default is used automatically. To find out which configuration options are available and their respective functions, consult the documented example configuration file @doc/config.properties@..

h3(#resolve). Resolve an identifier for a filename - @panako resolve@

This application simply returns the identifier that is used internally for a filename. The following call returns for example @54657653@:

@panako resolve test.mp3@

The internal identifiers are currently defined using integers.

h3(#same). Prints how similar two audio files are - @panako same@

This application checks the similarity of two files. The percentage returned indicates the percentage of seconds for which fingerprints match between the first and second file. So 100% that matches are found in every second. A result of 30% still means that much of the audio matches.

@panako same first.mp3 second.mp3@

Note that this operation is performed in memory. Nothing changes on disk.

h2(#test). Panako test run and output

There is a script provided to quickly test Panako with a small dataset. Note that you might need a new shell environment to use @panako@. To download the dataset you also need SoX and wget on your system.

The test first downloads a set of songs from "Jamendo":https://www.jamendo.com/, a website with creative commons licensed music. Then it selects a random part of each downloaded song to use as a query. The query is modified by running an audio compressor followed by a random speed-up between 93% and 107%. Then Panako tries to match each query to the reference items.

bc. #brew install sox wget #apt-get install sox wget panako -v #prints version panako stats #db info cd resources/scripts/create_test_dataset ruby create_dataset.rb #make sure SoX is installed panako store STRATEGY=panako dataset/reference_items/* panako query STRATEGY=panako dataset/queries/*

The output after running the store command should be similar to the following. The last column represents how many seconds of audio can be processed in a single second. On the device that ran the command about 80 seconds are processed in a single second.

bc. Index;Audiofile;Audio duration;Processing time;Audio duration/processing time 1/5;1009601.wav;00:00:32;674.00 ms;47.12 2/5;406015.wav;00:03:21;2.53 s;79.50 3/5;557132.wav;00:04:24;3.25 s;81.18 4/5;566726.wav;00:03:17;2.43 s;81.10 5/5;946523.wav;00:04:39;3.55 s;78.5

After running the query command the following should appear.

bc. Index; Total ; Query path;Query start (s);Query stop (s); Match path;Match id; Match start (s); Match stop (s); Match score; Time factor (%); Frequency factor(%); Seconds with match (%) 1 ; 5 ; queries/1009601_17_100.wav ; 1.269 ; 12.925 ; reference_items/1009601.wav ; 1009601 ; 18.269 ; 29.925 ; 193 ; 1.001 % ; 1.000 %; 1.00 4 ; 5 ; queries/566726_129_101.wav ; 0.885 ; 8.381 ; reference_items/566726.wav ; 566726 ; 129.893 ; 137.437 ; 127 ; 1.006 % ; 1.016 %; 1.00 5 ; 5 ; queries/946523_208_103.wav ; 0.941 ; 11.909 ; reference_items/946523.wav ; 946523 ; 208.949 ; 220.245 ; 70 ; 1.031 % ; 1.016 %; 0.83 3 ; 5 ; queries/557132_97_104.wav ; 0.925 ; 12.405 ; reference_items/557132.wav ; 557132 ; 97.933 ; 109.869 ; 173 ; 1.040 % ; 1.042 %; 1.00 2 ; 5 ; queries/406015_182_97.wav ; 0.917 ; 9.109 ; reference_items/406015.wav ; 406015 ; 182.917 ; 190.853 ; 101 ; 0.972 % ; 0.976 %; 1.00

To interpret the results, it helps to explain the output:

  • Index The index in the list of queries, note that the results might be out of order due to the multi-threaded nature of Panako.
  • Total The total number of queries being handled.
  • Query path The path of the query file
  • Query start (s) The matching part of the query with the reference starts at this point in the query. It is expressed in seconds.
  • Query stop (s) The matching part of the query with the reference stops at this point in the query. It is expressed in seconds.
  • Match path The path of the matching file.
  • Match id The internal identifier of the matching file. If the filename contains only digits then it equals the file name (as is the case here). Otherwise it is a hash of the filename.
  • Match start (s) The matching part of the query with the reference starts at this point in the reference. It is expressed in seconds.
  • Match stop (s) The matching part of the query with the reference stops at this point in the reference. It is expressed in seconds.
  • Match score The number of matching fingerprints between query and reference. The number of fingerprints per second can be calculated by dividing by the match duration.
  • Time factor (%) How much faster the query is with respect to the reference.
  • Frequency factor(%) How much higher in pitch the query is with respect to the reference. If the query is sped up the time factor is expected to be nearly equal to the frequency factor.
  • Seconds with match (%) The number of seconds for which matches are found. If in each matching second a match is found, then this number is 1. A low ratio might indicate a false positive.

For the test queries the start of the match is encoded into the filename. Likewise for the speed-up factor and matching identifier. For example @queries/557132_97_104.wav@ expects to match the file with identifier @557132@ starting from second 97 at a speed up of 104%. When looking at the result this closely matches what is reported.

See below for a "full evaluation of Panako":#eval. The evaluation tests retrieval rates for various factors of pitch-shifting, time-stretching, speed-up and some audio effects.

h2(#docker). Panako and Docker

Panako can also be ran in a containerized environment. A @Dockerfile@ is provided which should both work on @x86_64@ as @aarch64@. To build the container and run commands the following should get you started. Note that the database with fingerprints is located on the host at @~/.panako/docker@:

To access audio files relative paths are given with respect to the current directory on the host. Watch out with absolute (host) paths, these will not work correctly.

bc. docker build -t panako:2.1 resources/scripts/ mkdir -p $HOME/.panako/docker wget "https://filesamples.com/samples/audio/mp3/sample3.mp3" #Store and test if it is added to the database docker run -v $HOME/.panako/docker:/root/.panako/dbs -v $PWD:/root/audio panako:2.1 panako store sample3.mp3 docker run -v $HOME/.panako/docker:/root/.panako/dbs panako:2.1 panako stats #change the algorithm to panako docker run -v $HOME/.panako/docker:/root/.panako/dbs -v $PWD:/root/audio panako:2.1 panako store STRATEGY=panako sample3.mp3 docker run -v $HOME/.panako/docker:/root/.panako/dbs -v $PWD:/root/audio panako:2.1 panako query STRATEGY=panako sample3.mp3

h2(#read). Further Reading

Some relevant reading material about acoustic fingerprinting. The order gives an idea of relevance to the Panako project.

Six, Joren and Leman, Marc "Panako - A Scalable Acoustic Fingerprinting System Handling Time-Scale and Pitch Modification":http://www.terasoft.com.tw/conf/ismir2014/proceedings/T048_122_Paper.pdf (2014)

Wang, Avery L. An Industrial-Strength Audio Search Algorithm (2003)

Cano, Pedro and Batlle, Eloi and Kalker, Ton and Haitsma, Jaap A Review of Audio Fingerprinting (2005)

Six, Joren and Cornelis, Olmo A Robust Audio Fingerprinter Based on Pitch Class Histograms - Applications for Ethnic Music Archives (2012)

Arzt, Andreas and Bock, Sebastian and Widmer, Gerhard Fast Identification of Piece and Score Position via Symbolic Fingerprinting (2012)

Fenet, Sebastien and Richard, Gael and Grenier, Yves A Scalable Audio Fingerprint Method with Robustness to Pitch-Shifting (2011)

Ellis, Dan and Whitman, Brian and Porter, Alastair Echoprint - An Open Music Identification Service (2011)

Sonnleitner, Reinhard and Widmer, Gerhard Quad-based Audio Fingerprinting Robust To Time And Frequency Scaling (2014)

Sonnleitner, Reinhard and Widmer, Gerhard "Robust Quad-Based Audio Fingerprinting":http://dx.doi.org/10.1109/TASLP.2015.2509248 (2015)

S. Chang et al., "Neural Audio Fingerprint for High-Specific Audio Retrieval Based on Contrastive Learning" (2021) - ICASSP 2021.

h2(#contribute). Contribute to Panako

There are several ways to contribute to Panako. The first is to use Panako and report issues or feature requests using the "Github Issue tracker":https://github.com/JorenSix/Panako/issues. Bug reports are greatly appreciated, but keep in mind the note below on responsiveness.

Another way to contribute is to dive into the code, fork and improve Panako yourself. Merge requests with additional documentation, bug fixes or new features will be handled and end up in the main branch if correctness, maintainability and simplicity are kept in check. However, keep in mind the note below:

My time to spend on Panako is limited and goes in activity bursts. If an issue is raised it might take a couple of months before I am able to spend time on it during the next burst of activity. A period of relative silence does not mean your feedback is not greatly valued!

h2(#credits). Credits

The Panako software was developed at "IPEM, Ghent University":http://www.ipem.ugent.be/ by Joren Six.

Some parts of Panako were inspired by the "Robust Landmark-Based Audio Fingerprinting Matlab implementation by Dan Ellis":http://www.ee.columbia.edu/~dpwe/resources/matlab/fingerprint/.

Panako includes parts or depends on the following projects:

  • "A trie by Marcus McCurdy":https://github.com/vivekn/autocomplete/blob/master/java/src/main/java/com/marcusmccurdy/trie/Trie.java - BSD licensed. For autocompletion.
  • "The Gaborator":https://gaborator.com/, 'a C++ library that generates constant-Q spectrograms for visualization and analysis of audio signals' - AGPLv3 licensed but with commercial licenses available!
  • "JGaborator":https://github.com/JorenSix/JGaborator a JNI bridge to The Gaborator by Joren Six - AGPL licensed.
  • "LMDB":https://www.symas.com/lmdb (The OpenLDAP Public License) and "lmdbjava":https://github.com/lmdbjava/lmdbjava (Apache License, Version 2.0). LMDB is used as key-value-store.

If you use Panako for research purposes, please cite the following works:

bc. @inproceedings{six2014panako, author = {Joren Six and Marc Leman}, title = {{Panako - A Scalable Acoustic Fingerprinting System Handling Time-Scale and Pitch Modification}}, booktitle = {{Proceedings of the 15th ISMIR Conference (ISMIR 2014)}}, year = 2014 }

bc. @inproceedings{six2021panako, title={Panako 2.0-Updates for an acoustic fingerprinting system}, author={Six, Joren}, booktitle={Demo / late-breaking abstracts of 22st International Society for Music Information Retrieval Conference, ISMIR}, year={2021} }

h2(#eval). Evaluate Panako

The directory @resources/scripts/evaluation@ contains scripts to reproduce the result found in the "Panako ISMIR 2021 abstract":https://archives.ismir.net/ismir2021/latebreaking/000039.pdf. The script follows this procedure:

The fingerprints are extracted and stored for each file in the data set.o download an openly available dataset

Query files are created from the Jamendo data set.

Panako is queried for each query file, results are logged

Results are parsed.

To run the evaluation scripts, a UNIX like system with following additional utilities is required. Evidently the Panako software should also be installed. Please see above to install Panako. Download "Free Music Archive":https://www.kaggle.com/datasets/imsparsh/fma-free-music-archive-small-medium

  • "Ruby":https://www.ruby-lang.org/en/: The ruby programming language runtime environment.
  • "SoX":http://sox.sourceforge.net/, including support for MP3 and GSM encoding.
  • GNUPlot: for plotting the results

If all requirements are met, running the evaluation is done using @ruby evaluation.rb fma_small@

h2(#limitations). Limitations

Panako has a couple of limitations:

  • Multiple writes the key-value store does not allow writes from multiple threads. These are effectively done on a single thread. To build a reference database quickly it is advised to use the cache feature of Panako.
  • Windows support Currently Panako does not support Windows. The key-value store used has spotty Windows support. The way in which audio is decoded using ffmpeg is also problematic on Windows. However, "WSL":https://docs.microsoft.com/en-us/windows/wsl/install offers a way in which to run Panako on Windows. 'WSL enables you to use Linux tools, like Bash or Grep [...], with no need to dual-boot.' With this option in mind, pure Windows support is not a priority. Another option to run Panako on windows is to use "Docker on Windows":https://docs.docker.com/desktop/windows/install/

h2(#changelog). Changelog

Version 1.5
2016-12-08
Improvements to SyncSink. The cross correlation now behaves well also in edge cases. The Sync code has been simplified to allow maintenance. Included unit tests. Updated to TarsosDSP version 2.4.
Version 1.6
2017-03-17
This release adds a simplified version of chromaprint and an implementation of "'A Highly Robust Audio Fingerprinting System' by Haitsma and Kalker":http://www.ismir2002.ismir.net/proceedings/02-FP04-2.pdf
Version 2.0
2021-04-27
A major overhaul of Panako. The main aim of this release is to ensure the longevity and maintainability of Panako. The featureset has been reduced to focus on core functionality. The version bump is also explained by the use of lambdas and the need for a newer JRE (8+). A list of changes: * The number of dependencies has been drastically cut by removing support for multiple key-value stores. * The key-value store has been changed to a faster and simpler system (from MapDB to LMDB). * The SyncSink functionality has been moved to another project (with Panako as dependency). * The main algorithms have been replaced with simpler and better working versions: ** Olaf is a new implementation of the classic Shazam algorithm. ** The algoritm described in the Panako paper was also replaced. The core ideas are still the same. The main change is the use of a "Gabor transform":https://en.wikipedia.org/wiki/Gabor_transform to go from time domain to the spectral domain (previously a constant-q transform was used). The gabor transform is implemented by "JGaborator":https://github.com/JorenSix/JGaborator which in turn relies on "The Gaborator":https://gaborator.com/ C++ library via JNI. * Folder structure has been simplified. * The UI, which was mainly used for debugging, has been removed. * A new set of helper scripts are added in the @scripts@ directory. They help with evaluation, parsing results, checking results, building panako, creating documentation,... * Changed the default panako location to ~/.panako, so users can install and use panako more easily (withouth need for sudo rights)
Version 2.1
2022-05
Changed the build system to "Gradle":https://gradle.org/, mainly to make upgrading dependencies more straightforward. The folder structure has been changed accordingly. The default JDK target has been changed to Java SE 17 (LTS). Panako now also supports Apple M1 targets.