Documentation for core/normalise could specify the type of normalization

Open shark8me opened this issue 8 years ago • 1 comments

The documentation for the core/normalise method could be more explicit, since there are multiple types of normalization used (in the context of machine learning). The current docstring says "Normalises a numerical vector (scales to unit length). Returns a new normalised vector.".

However, there are multiple notions of normalization. Here are examples of normalization in other (similar) libraries:

One example is NumPy's norm function (numpy.linalg.norm), which supports several different norms.
Scikit-learn normalization provides options for L1 or L2 normalization (a subset of those provided by numpy.linalg.norm).
Feature scaling in machine learning is also called normalization.
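
To make the ambiguity concrete, here is a minimal sketch in plain Python (not core.matrix code). The function names are hypothetical and chosen only for illustration, and each function assumes a non-zero, non-constant input so that no division by zero occurs. It shows that three things commonly called "normalization" give different results for the same vector:

```python
import math

def l2_normalise(v):
    """Scale v to unit Euclidean (L2) length, as the current docstring describes."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def l1_normalise(v):
    """Scale v so that the absolute values of its entries sum to 1."""
    norm = sum(abs(x) for x in v)
    return [x / norm for x in v]

def min_max_scale(v):
    """Feature scaling: map the smallest entry to 0 and the largest to 1."""
    lo, hi = min(v), max(v)
    return [(x - lo) / (hi - lo) for x in v]

v = [3.0, 4.0]
print(l2_normalise(v))   # [0.6, 0.8]
print(l1_normalise(v))   # entries sum to 1
print(min_max_scale(v))  # [0.0, 1.0]
```

Only the first of these matches "scales to unit length" in the Euclidean sense, which is why stating the norm in the docstring would help.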

Would you consider (for the sake of a PR) feature scaling methods such as those described here as being within the scope of core.matrix?

-- Thanks

Jul 04 '17 06:07 shark8me

Would probably need to be considered separately. Here are my initial thoughts:

Computing an L1 or L2 or Lx norm is in scope. These are common vector operations and benefit from potential implementation-specific optimisation (via core.matrix protocols).
Rescaling by means / minimum / maximum sounds a bit domain specific and is easy to do with a couple of core.matrix operations. Probably not in scope.

Jul 04 '17 23:07 mikera