
Remove redundant split directions before evaluating each of them

Open tyler-tomita opened this issue 6 years ago • 2 comments

The current sampling scheme simply places -1s and +1s at random in the projection matrix, so it is possible to get redundant columns. Evaluating the same split direction multiple times is wasted computation that we can avoid.
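For illustration only, here is a minimal Python sketch of the idea, not taken from the SPORF code base. It assumes each column of the projection matrix is stored as a sparse map from feature index to +1 or -1. The helper names `sample_projection` and `drop_redundant` are hypothetical, and the sampling routine is a simplified stand-in for the real scheme.

```python
import random


def sample_projection(n_features, n_columns, nnz, rng):
    """Hypothetical sampler: drop `nnz` random +/-1 entries into the matrix.

    Each column is a dict {feature_index: +1 or -1}; missing keys are zeros.
    """
    columns = [dict() for _ in range(n_columns)]
    for _ in range(nnz):
        row = rng.randrange(n_features)
        col = rng.randrange(n_columns)
        columns[col][row] = rng.choice((-1, 1))
    return columns


def drop_redundant(columns):
    """Keep the first occurrence of each distinct non-empty column."""
    seen = set()
    unique = []
    for col in columns:
        if not col:
            continue  # an all-zero column defines no split direction
        key = tuple(sorted(col.items()))  # order-independent, hashable key
        if key not in seen:
            seen.add(key)
            unique.append(col)
    return unique


rng = random.Random(0)
candidates = sample_projection(n_features=5, n_columns=10, nnz=8, rng=rng)
to_evaluate = drop_redundant(candidates)
```

The check costs one hash lookup per column, so it is linear in the number of nonzero entries. A column and its negation also produce the same candidate splits, so a stricter variant could normalize the sign before building the key.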

tyler-tomita, Jun 12 '18 19:06

You are right about this, and I'll look again at the feasibility of changing it. My concern is that this could be a frequently called and expensive check for something that has a very small probability of occurring and is of little consequence when it does. Before making a final decision on whether to include it, we should: 1) look at creating the projection matrix in a way that makes duplicates impossible; 2) if we don't find a better way, implement the check and measure how it affects training times; 3) check how often duplications like this actually occur in real-world datasets.
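As a rough illustration of points 1 and 3, the following Python sketch is not SPORF code. It assumes a hypothetical sampler, `sample_column`, in which each entry of a column is nonzero with probability `density`. `sample_unique_columns` rejects repeats at construction time, and `duplicate_rate` estimates how often repeats occur under that assumed sampler. Measuring the rate for the real scheme would require plugging in the actual sampling code and dataset dimensions.

```python
import random


def sample_column(n_features, density, rng):
    """Hypothetical sampler: one sparse column as a tuple of (index, sign)."""
    return tuple(
        (i, rng.choice((-1, 1)))
        for i in range(n_features)
        if rng.random() < density
    )


def sample_unique_columns(n_features, n_columns, density, rng, max_tries=10_000):
    """Point 1: build the matrix so that duplicate columns cannot occur."""
    seen = set()
    columns = []
    tries = 0
    while len(columns) < n_columns and tries < max_tries:
        tries += 1
        col = sample_column(n_features, density, rng)
        if col and col not in seen:  # reject empty and repeated columns
            seen.add(col)
            columns.append(col)
    return columns  # may be shorter than n_columns if max_tries is hit


def duplicate_rate(n_features, n_columns, density, trials, rng):
    """Point 3: fraction of non-empty columns that repeat an earlier one."""
    duplicates = total = 0
    for _ in range(trials):
        seen = set()
        for _ in range(n_columns):
            col = sample_column(n_features, density, rng)
            if not col:
                continue
            total += 1
            if col in seen:
                duplicates += 1
            else:
                seen.add(col)
    return duplicates / total if total else 0.0


rng = random.Random(0)
print(duplicate_rate(n_features=100, n_columns=10, density=0.05, trials=200, rng=rng))
```

The `max_tries` cap is there because rejection sampling cannot terminate on its own when fewer than `n_columns` distinct columns exist, for example with very few features.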

jbrowne6, Jun 12 '18 20:06

This has been resolved for the feature importance calculation; however, that does not fix the issue upstream.

MrAE, Feb 06 '19 20:02