echoprint-codegen

actual hash rate is about half what the Ellis-Whitman-Porter-11 paper states

Open adrianomitre opened this issue 9 years ago • 1 comments

In section 2 of the paper ECHOPRINT - AN OPEN MUSIC IDENTIFICATION SERVICE, it is stated that "the overall hash rate is approximately 8 (bands) × 1 (onset per second) × 6 (hashes per onset) ≈ 48 hashes/sec". However, every song I have run echoprint-codegen on has yielded a much lower figure: always in the 23-28 hashes per second range, with an average slightly above 25. I am computing the hash rate as H/L, where H is the total number of hashes produced for a song and L is the song length in seconds (which can be estimated as the maximum hash frame divided by the number of frames per second, 11025/256 ≈ 43.07).
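The estimate described above can be written out as a short formula. The symbols $r$ (measured hash rate) and $f_{\max}$ (largest frame index among the hashes) are names introduced here for illustration only; $H$ and $L$ are as defined in the text.

```latex
\[
  r = \frac{H}{L},
  \qquad
  L \approx \frac{f_{\max}}{11025/256} \approx \frac{f_{\max}}{43.07}
  \quad\Longrightarrow\quad
  r \approx \frac{43.07\, H}{f_{\max}}
\]
```

Since $f_{\max}$ marks the last detected hash rather than the true end of the audio, $L$ is probably a slight underestimate, so this $r$ would, if anything, slightly overstate the real rate.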

  • Is anyone having similar issues? Has anyone ever measured their code rates?
  • Could the lower code rate hurt accuracy?
  • Which parameters should one tweak to increase the code rate? I would assume it is related to the onset detection, and thus to the Fingerprint::adaptiveOnsets() method...

My fork of codegen, which prints the hashes unhashed in [frame, band, delta1, delta2] JSON format, is public, and the following Ruby code computes the "hash" rate of each file passed as an argument:

#!/usr/bin/env ruby

require 'json'

def get_code(filename)
  JSON.parse(JSON.parse(File.read(filename))[0]["code"])
end

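# Mean code rate in codes per second.
#
# TimeQuantum is the number of frames per second (11025/256, about 43.07).
TimeQuantum = 11_025 / 256.0
def mean_code_rate(code)
  # Each entry is [frame, band, delta1, delta2]; the largest frame
  # index approximates the song length in frames.
  max_frame = code.map { |fr, _b, _d1, _d2| fr }.max
  code.size / (max_frame / TimeQuantum)
end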

ARGV.each do |filename|
  r = mean_code_rate(get_code(filename))
  puts "#{"%.2f" % r} ; #{filename}"
end

adrianomitre, Sep 16 '15 15:09

It is stated, in section 2 of the paper, that "the overall hash rate is approximately [...] 48 hashes/sec". Then, in section 3, it is stated that "A 30 second query has about 800 hash keys" (800/30 ≈ 26.7 hashes/sec). Only one of these statements can be correct and, according to the results detailed in the previous comment, I would say it is the second.
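The comparison can be checked with a few lines of Ruby. This is only a sketch of the arithmetic: the constant and method names are made up for illustration, and the 23-28 range is simply the one reported in the first comment, not a new measurement.

```ruby
# Figures quoted from the paper and the range measured in the first comment.
PAPER_SECTION_2_RATE = 8 * 1 * 6    # bands x onsets per second x hashes per onset
PAPER_SECTION_3_RATE = 800 / 30.0   # hash keys per 30-second query
MEASURED_RANGE = (23.0..28.0)       # hashes per second observed with echoprint-codegen

# True when a stated rate falls inside the measured range.
def consistent_with_measurement?(rate, range = MEASURED_RANGE)
  range.cover?(rate)
end

{ "section 2" => PAPER_SECTION_2_RATE,
  "section 3" => PAPER_SECTION_3_RATE }.each do |label, rate|
  verdict = consistent_with_measurement?(rate) ? "within" : "outside"
  puts format("%s: %.2f hashes/sec (%s the measured range)", label, rate, verdict)
end
```

Only the section 3 figure falls inside the measured range, and it is a little over half of the section 2 figure.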

adrianomitre, Oct 19 '15 07:10