pocketsphinx-ruby

Normalize path_score as confidence score for utterance?

Open chinshr opened this issue 10 years ago • 4 comments

How can I normalize a path_score (like in Pocketsphinx::Decoder::Hypothesis) of an utterance to a relative confidence probability?

chinshr, May 05 '15 05:05

You cannot do that. Confidence is available in the C API through the ps_get_prob call, which is not used in the bindings.

nshmyrev, May 05 '15 17:05

I've made it possible to get at this value using Pocketsphinx::Decoder::Hypothesis#posterior_prob. Does this help? Is there some additional normalization calculation that would be worth adding to the hypothesis?

@nshmyrev Does calling ps_get_prob every time I call ps_get_hyp have any negative performance implications?

watsonbox, May 12 '15 16:05

@watsonbox IMO that is a step in the right direction, but the posterior probability is returned in the log domain and needs to be converted to a linear probability to give a more meaningful confidence score, e.g. 0.81 = 81% confidence. From what I have found so far, the following approach looks worth investigating:

...
# Add inside Pocketsphinx::API::Pocketsphinx
typedef :pointer, :logmath
attach_function :ps_get_logmath, [:decoder], :logmath
attach_function :logmath_get_base, [:logmath], FFI::NativeType::FLOAT64
attach_function :logmath_exp, [:logmath, :int], FFI::NativeType::FLOAT64
...
# Pocketsphinx::Decoder
logmath = ps_api.ps_get_logmath(ps_decoder)
logbase = ps_api.logmath_get_base(logmath)
log_prob = ps_api.ps_get_prob(ps_decoder) # => e.g. -9834 (integer, log domain)
dec_prob = ps_api.logmath_exp(logmath, log_prob) # => about 0.374 with the default log base of 1.0001
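
The conversion above amounts to raising the log base to the integer score. Here is a minimal pure-Ruby sketch of that arithmetic, which needs no decoder. It assumes the default PocketSphinx log base of 1.0001; a real decoder's base should be read with logmath_get_base. `ASSUMED_LOG_BASE` and `log_to_linear` are illustrative names, not part of pocketsphinx-ruby.

```ruby
# Assumption: 1.0001 is the default log base; a configured decoder may differ,
# so prefer the value returned by logmath_get_base when one is available.
ASSUMED_LOG_BASE = 1.0001

# Convert an integer log-domain score into a linear probability.
# Scores are <= 0 for probabilities, so the result lies in (0, 1].
def log_to_linear(log_prob, base = ASSUMED_LOG_BASE)
  base**log_prob
end

puts log_to_linear(-9834).round(3)
```

A score of 0 maps to a probability of 1.0, and more negative scores map to smaller probabilities.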

Something similar needs to happen per word within an utterance:

# Inside Pocketsphinx::Decoder
def words
  ...
  acoustic_score = FFI::MemoryPointer.new(:int32, 1)
  language_score = FFI::MemoryPointer.new(:int32, 1)
  language_backoff = FFI::MemoryPointer.new(:int32, 1)

  ps_api.ps_seg_prob(seg_iter, acoustic_score, language_score, language_backoff)
  ...

Again, these scores are logarithmic and need to be converted before they are meaningful.
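
As a rough sketch of the per-word conversion, suppose each segment's log posterior has already been collected. In the C API that is the return value of ps_seg_prob, which the snippet above does not capture. The `WordScore` struct, the `word_confidences` helper, the sample words and scores, and the 1.0001 log base are all assumptions for illustration, not part of pocketsphinx-ruby.

```ruby
# Assumption: default log base; read the real one from the decoder's logmath.
WORD_LOG_BASE = 1.0001

# Hypothetical container for one word and its integer log posterior.
WordScore = Struct.new(:word, :log_posterior)

# Map each word to a linear confidence in (0, 1].
# exp(score * ln(base)) is equivalent to base ** score.
def word_confidences(word_scores, base = WORD_LOG_BASE)
  word_scores.map do |ws|
    [ws.word, Math.exp(ws.log_posterior * Math.log(base))]
  end.to_h
end

# Made-up scores, purely for demonstration.
scores = [
  WordScore.new("go", 0),
  WordScore.new("forward", -2000),
  WordScore.new("ten", -9834)
]

word_confidences(scores).each do |word, prob|
  puts format("%-8s %.3f", word, prob)
end
```

The acoustic and language scores written to the out-pointers are in the same log domain, so the same conversion would apply to them, although they are not posteriors and may not read as confidences on their own.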

chinshr, May 13 '15 04:05

Sorry for the delay! Please let me know if my commit resolves these issues.

watsonbox, Aug 10 '15 17:08