kNN.jl
kNN classifier accepts various data structures
The classifier can accept data structures from the NearestNeighbors package.
I propose two new abstract types, Classifier and Regressor, to distinguish between the majority-vote and averaging approaches to handling the results of a particular prediction.
abstract type Classifier end

struct kNNClassifier{T <: NearestNeighborTree} <: Classifier
    t::T
    y::Vector
end
Surely, Classifier must be an interface that defines a method specification for handling the results of a particular model prediction, either through a majority vote or as-is. The same approach would apply to Regressor, with averaging instead.
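For illustration, here is a minimal sketch of how such an interface could look in current Julia, using NearestNeighbors.jl for the search itself; the predict methods, the kNNRegressor type, and the field layout are assumptions made for this example, not part of the actual package.

using NearestNeighbors, Statistics, StatsBase

abstract type Classifier end
abstract type Regressor end

# Classification: store the raw labels, resolve predictions by majority vote.
struct kNNClassifier{T <: NearestNeighbors.NNTree, L} <: Classifier
    tree::T
    y::Vector{L}
    k::Int
end

# Regression: same search structure, but neighbour targets are averaged.
struct kNNRegressor{T <: NearestNeighbors.NNTree} <: Regressor
    tree::T
    y::Vector{Float64}
    k::Int
end

# Majority vote over the k nearest labels.
function predict(m::kNNClassifier, x::AbstractVector)
    idxs, _ = knn(m.tree, x, m.k)
    return mode(m.y[idxs])
end

# Average of the k nearest targets.
function predict(m::kNNRegressor, x::AbstractVector)
    idxs, _ = knn(m.tree, x, m.k)
    return mean(m.y[idxs])
end

Downstream code could then dispatch only on the abstract types, e.g. anything that merely needs predict(m::Classifier, x) would work with any concrete classifier.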
The community has spent a lot of time discussing how to define things like Regressor in the past: https://github.com/JuliaStats/Roadmap.jl/issues/4
I'm not sure it's the most fruitful thing to do at the moment.
Coming back to this, perhaps we should introduce even finer type distinctions? One might want to use majority voting with a parametric type that stores the predictions in a vector of the same type as the input labels (which could be arbitrary types). In contrast, the averaging/probability case seems to call for a very different data structure as output.
I agree with a parametric type for labels in the classification case. As for the regression case, the output would always be a numerical value, even though implementations could also return auxiliary data.
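To make that output distinction concrete, here is a rough sketch (the function names and return types are hypothetical): a vote-based prediction can return a value of the same type L as the training labels, whereas an averaging/probability-based prediction naturally returns a different structure, e.g. a mapping from label to estimated probability.

using StatsBase: mode, countmap

# Vote: the result has the same type as the labels themselves.
function predict_vote(labels::Vector{L}, neighbour_idxs::Vector{Int}) where {L}
    return mode(labels[neighbour_idxs])
end

# Probability: the result is a label => proportion mapping over the neighbours.
function predict_proba(labels::Vector{L}, neighbour_idxs::Vector{Int}) where {L}
    counts = countmap(labels[neighbour_idxs])
    k = length(neighbour_idxs)
    return Dict{L,Float64}(lab => c / k for (lab, c) in counts)
end

For regression, the averaged result would simply be a Float64, with any auxiliary data an implementation wants to expose returned separately.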