Distances.jl

Confusion with (Sq)Mahalanobis Definition

Open: btmit opened this issue 3 years ago • 1 comment

In this package, the SqMahalanobis distance is implemented and documented as $(x - y)' * Q * (x - y)$. However, the documentation refers to $Q$ as the covariance matrix. Unless I'm mistaken, this is incorrect. In this formulation, $Q$ is usually called the information matrix, precision matrix, or concentration matrix, i.e. the inverse of the covariance matrix. With $Q$ denoting the covariance matrix, the traditional squared Mahalanobis distance reads $(x - y)' * Q^{-1} * (x - y)$.
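To make the difference concrete, here is a small worked example with a hypothetical covariance $\Sigma = \mathrm{diag}(4, 1)$ and $x - y = (2, 1)'$. These values are chosen purely for illustration and do not come from the package documentation.

$$
(x - y)' \, \Sigma^{-1} \, (x - y) = \frac{2^2}{4} + \frac{1^2}{1} = 2,
\qquad
(x - y)' \, \Sigma \, (x - y) = 4 \cdot 2^2 + 1 \cdot 1^2 = 17.
$$

Passing the covariance itself as $Q$ would therefore give 17 rather than the Mahalanobis value 2. The two expressions coincide only when $Q = \Sigma^{-1}$.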

I can verify this against the implementation in the Distributions package or a hand-coded version.

using Distributions, Distances
q = rand(3,3)
Q = q'q  # symmetric positive definite (almost surely), used as the covariance

x = rand(3)
y = rand(3)

d0 = (x-y)'*inv(Q)*(x-y)  # hand-coded
d1 = sqmahal(MvNormal(x, Q), y)  # Distributions.jl
d2 = sqmahalanobis(x, y, Q)  # Distances.jl, covariance passed as Q
d3 = sqmahalanobis(x, y, inv(Q))  # Distances.jl, inverse covariance passed as Q

d0 ≈ d2  # false
d0 ≈ d1 ≈ d3  # true
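As a check that depends on neither package, here is a minimal sketch that uses only the `LinearAlgebra` standard library. The helper `sqmahal_cov` is a hypothetical name introduced for illustration and is not part of Distances.jl or Distributions.jl. It takes the covariance matrix directly and solves a linear system rather than forming the inverse. The matrix and vectors are made-up values.

```julia
using LinearAlgebra

# Hypothetical helper (not part of Distances.jl): squared Mahalanobis distance
# parameterized by the covariance matrix Σ, i.e. (x - y)' * Σ^{-1} * (x - y).
function sqmahal_cov(x::AbstractVector, y::AbstractVector, Σ::AbstractMatrix)
    z = x - y
    # Solve Σ * w = z via a Cholesky factorization instead of computing inv(Σ).
    w = cholesky(Symmetric(Σ)) \ z
    return dot(z, w)
end

# Illustrative values only.
Σ = [2.0 1.0; 1.0 2.0]
x = [1.0, 0.0]
y = [0.0, 0.0]

d_cov    = sqmahal_cov(x, y, Σ)          # covariance-based definition
d_inv    = (x - y)' * inv(Σ) * (x - y)   # same quantity, hand-coded with inv
d_direct = (x - y)' * Σ * (x - y)        # Σ used where its inverse belongs

println((d_cov, d_inv, d_direct))
```

Under these assumptions `d_cov` and `d_inv` agree while `d_direct` differs, which mirrors the `d0 ≈ d2` mismatch above.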

Am I misunderstanding the intended use or documentation of this function?

btmit, Sep 15 '22 18:09

Judging by, e.g., https://en.wikipedia.org/wiki/Mahalanobis_distance, that seems to be correct. Could you perhaps submit a PR which clarifies the terminology?

dkarrasch, Jun 10 '24 20:06