Distances.jl

Consider making Minkowski distance a parametric type

Open ahwillia opened this issue 7 years ago • 5 comments

It would be nice to be able to dispatch on the p parameter in the Minkowski distance type. There is a very similar implementation here (thanks to @Evizero), which I think could be ported here if others are interested in this functionality: https://github.com/JuliaML/Losses.jl/blob/master/src/supervised/distance.jl#L10

ahwillia, Sep 06 '16 08:09

Currently it's defined as

immutable Minkowski{T <: Real} <: Metric
    p::T
end

So I think what you're after is already implemented?

ararslan, Oct 27 '16 22:10

No, I'm after something slightly different.

using LossFunctions

x = LPDistLoss(2) # same as L2DistLoss
y = LPDistLoss(1) # same as L1DistLoss

# multiple dispatch works for different norms!
foo(::LPDistLoss{1}) = ...
foo(::LPDistLoss{2}) = ...

The idea is that two Minkowski distances with different values of p are really different distance metrics, so each should have its own type.
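A minimal sketch of what this could look like, in current Julia syntax. The name `PMinkowski` and the standalone `evaluate` function are hypothetical and only for illustration; they are not the Distances.jl or LossFunctions implementation.

```julia
# Hypothetical sketch: `PMinkowski` is an illustrative name, not a type in Distances.jl.
struct PMinkowski{P} end

# Convenience constructor that lifts a runtime value into the type domain.
PMinkowski(p::Real) = PMinkowski{p}()

# Generic fallback: P is a static parameter, known when the method is compiled.
function evaluate(::PMinkowski{P}, a::AbstractVector, b::AbstractVector) where {P}
    s = zero(float(promote_type(eltype(a), eltype(b))))
    for i in eachindex(a, b)
        s += abs(a[i] - b[i])^P
    end
    return s^(1 / P)
end

# Specialized methods, selected by dispatch on the exponent.
evaluate(::PMinkowski{1}, a::AbstractVector, b::AbstractVector) = sum(abs, a - b)
evaluate(::PMinkowski{2}, a::AbstractVector, b::AbstractVector) = sqrt(sum(abs2, a - b))
```

One thing to watch with this design: `PMinkowski(1)` and `PMinkowski(1.0)` are different types, so only the integer version would hit the specialized method above.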

ahwillia, Oct 27 '16 23:10

I guess this would also fix the following issue:

using Distances, BenchmarkTools
x = rand(100); y = rand(100);
@btime evaluate($(Euclidean()), $x, $y) # 24.176 ns (0 allocations: 0 bytes)
@btime evaluate($(Minkowski(2)), $x, $y) # 1.983 μs (0 allocations: 0 bytes)

I think the reason for the huge difference (roughly 80x in the timings above) is that the Minkowski parameter doesn't get compiled into the functions, but is read at runtime. For a fixed calculation, though, p is really not meant to change, so the compiler should be able to treat it as a constant.
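To make the comparison concrete, here is a rough sketch of the two designs side by side. The names `FieldMinkowski`, `ParamMinkowski`, `getp` and `minkowski` are made up for this illustration and are not part of Distances.jl; whether the parametric version is actually faster would need to be measured.

```julia
# Hypothetical names for illustration; none of these exist in Distances.jl.
struct FieldMinkowski{T<:Real}   # exponent stored as a runtime field
    p::T
end

struct ParamMinkowski{P} end     # exponent stored in the type itself

getp(d::FieldMinkowski) = d.p
getp(::ParamMinkowski{P}) where {P} = P   # a constant for each concrete type

# Same loop for both designs; only the way p is obtained differs.
function minkowski(d, a, b)
    p = getp(d)
    s = 0.0
    for i in eachindex(a, b)
        s += abs(a[i] - b[i])^p
    end
    return s^(1 / p)
end
```

With BenchmarkTools loaded, `@btime minkowski($(FieldMinkowski(2)), $x, $y)` can be compared against `@btime minkowski($(ParamMinkowski{2}()), $x, $y)`. No timings are quoted here, since they depend on the Julia version and the hardware.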

dkarrasch, Nov 26 '18 12:11

Yes, having the exponent as a type parameter seems like a good idea.

KristofferC, Nov 26 '18 16:11

I think that along the same lines, one could have a single interface for all UnionMetrics, including the weighted/periodic ones? By including the extra info as a type parameter, we could pass the current index i to the eval_op, and inside there get the dimension parameter (weight/period) with the help of the index i. Does that sound reasonable? See #107.

Edit: No, I just realized that one cannot use a vector as a type parameter, in contrast to numbers.
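A small sketch of that restriction, using a hypothetical `PeriodicSketch` type that is not part of Distances.jl. A tuple of plain numbers is accepted as a type parameter, while a `Vector` is rejected. Note that the tuple variant would produce a new type for every distinct set of periods, so this is only meant to show the language rule, not to suggest a design.

```julia
# Hypothetical sketch: `PeriodicSketch` is an illustrative name, not part of Distances.jl.
struct PeriodicSketch{W} end

periods(::PeriodicSketch{W}) where {W} = W

# A tuple of Float64 is an isbits value, so it is allowed as a type parameter.
d = PeriodicSketch{(1.0, 2.0, 0.5)}()
second_period = periods(d)[2]   # look up the period for dimension 2

# A Vector is mutable and heap-allocated, so using it as a parameter throws a TypeError.
vector_ok = try
    PeriodicSketch{[1.0, 2.0, 0.5]}()
    true
catch err
    err isa TypeError || rethrow()
    false
end
```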

dkarrasch, Nov 26 '18 16:11