KernelFunctions.jl icon indicating copy to clipboard operation
KernelFunctions.jl copied to clipboard

DotProduct "metric"

Open willtebbutt opened this issue 4 years ago • 2 comments

The DotProduct is currently implemented as a PreMetric. This is unreasonable since it satisfies neither of the defining properties of a PreMetric.

willtebbutt avatar Apr 24 '20 22:04 willtebbutt

Well it is not even a distance, but does it really matter? If it's a problem we need to drop the dependency on Distances.jl and start using our own evaluate/pairwise functions.

theogf avatar Apr 25 '20 09:04 theogf

Well it is not even a distance, but does it really matter?

I would argue yes, because it's misleading. If a thing is called a metric, I expect it to at least satisfy how Distances.jl defines a metric. Remembering that what KernelFunctions calls a metric isn't what Distances.jl calls a metric, despite it's dependence on Distances.jl, is an inconsistency that existing contributors have to remember, and new contributors have to discover and come to terms with.

For example, it hadn't occurred to me that we might have defined a thing that subtypes PreMetric that doesn't satisfy the expected properties until I was considering implementing a more efficient thing to compute the diagonal of a kernel matrix for SimpleKernels, and realised that I couldn't rely on evaluate(d, x, x) == 0 in all cases.

If it's a problem we need to drop the dependency on Distances.jl and start using our own evaluate/pairwise functions.

We certainly don't need to drop the dependency, we could consider renaming some things though.

What we really mean by metric(k) is "give me the binary function b such that k(x, y) = kappa(k, evaluate(b, x, y)). Moreover, it needs to have an efficient implementation of pairwise(b, x_vec, y_vec). As discussed, while b is often at least a PreMetric in the Distances.jl sense, it doesn't have to be.

Proposal

  • rename the metric function to (the intentionally uninformative) binary_op -- given that there's not an obvious name, it's a good idea to make it uninformative and document it.
  • binary_op returns a function b that implements the evaluate / pairwise interface. For example, we would just make DotProduct not subtype PreMetric, but continue to implement the interface.
  • add docs stating the purpose of binary_op, and point out that it often returns a PreMetric but not always.

willtebbutt avatar Apr 25 '20 12:04 willtebbutt