AbstractVectorThresholdMaximumGainLearner: Sanity check triggered

Open Zero3 opened this issue 10 years ago • 6 comments

I was playing around with the parameters for the Random Forest example from #6 and somehow triggered a sanity check in AbstractVectorThresholdMaximumGainLearner that probably should not be triggerable:

java.lang.RuntimeException: bestThreshold (8.30760652058587) lies outside range of values (8.30760652058587, 9.14680325466277]
    at gov.sandia.cognition.learning.algorithm.tree.AbstractVectorThresholdMaximumGainLearner.computeBestGainAndThreshold(AbstractVectorThresholdMaximumGainLearner.java:383)
    at gov.sandia.cognition.learning.algorithm.tree.AbstractVectorThresholdMaximumGainLearner.computeBestGainAndThreshold(AbstractVectorThresholdMaximumGainLearner.java:209)
    at gov.sandia.cognition.learning.algorithm.tree.AbstractVectorThresholdMaximumGainLearner.learn(AbstractVectorThresholdMaximumGainLearner.java:141)
    at gov.sandia.cognition.learning.algorithm.tree.AbstractVectorThresholdMaximumGainLearner.learn(AbstractVectorThresholdMaximumGainLearner.java:45)
    at gov.sandia.cognition.learning.algorithm.tree.RandomSubVectorThresholdLearner.learn(RandomSubVectorThresholdLearner.java:212)
    at gov.sandia.cognition.learning.algorithm.tree.RandomSubVectorThresholdLearner.learn(RandomSubVectorThresholdLearner.java:47)
    at gov.sandia.cognition.learning.algorithm.tree.CategorizationTreeLearner.learnNode(CategorizationTreeLearner.java:237)
    at gov.sandia.cognition.learning.algorithm.tree.CategorizationTreeLearner.learnNode(CategorizationTreeLearner.java:37)
    at gov.sandia.cognition.learning.algorithm.tree.AbstractDecisionTreeLearner.learnChildNodes(AbstractDecisionTreeLearner.java:129)
    at gov.sandia.cognition.learning.algorithm.tree.CategorizationTreeLearner.learnNode(CategorizationTreeLearner.java:246)
    at gov.sandia.cognition.learning.algorithm.tree.CategorizationTreeLearner.learn(CategorizationTreeLearner.java:178)
    at gov.sandia.cognition.learning.algorithm.tree.CategorizationTreeLearner.learn(CategorizationTreeLearner.java:37)
    at gov.sandia.cognition.learning.algorithm.ensemble.AbstractBaggingLearner.step(AbstractBaggingLearner.java:195)
    at gov.sandia.cognition.learning.algorithm.AbstractAnytimeBatchLearner.learn(AbstractAnytimeBatchLearner.java:147)
    ...
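
One way a threshold can land exactly on the lower bound of the range is floating-point rounding. This is a guess, not something confirmed in this thread. It assumes the threshold is taken as the midpoint between two consecutive sorted values. If those two values are adjacent doubles, the naive midpoint rounds back onto one of them. The class and method names below are hypothetical, and the sketch illustrates only the arithmetic, not Foundry's actual code:

```java
// Sketch only: illustrates floating-point midpoint rounding.
// MidpointSketch, midpoint and inRange are hypothetical names, not Foundry APIs.
final class MidpointSketch {

    /** Naive midpoint of two values, as a split threshold might be computed (assumption). */
    static double midpoint(double lo, double hi) {
        return (lo + hi) / 2.0;
    }

    /** True if x lies in the half-open range (lo, hi]. */
    static boolean inRange(double x, double lo, double hi) {
        return x > lo && x <= hi;
    }

    public static void main(String[] args) {
        double lo = 1.0;
        double hi = Math.nextUp(lo); // the next representable double above lo

        // lo + hi is not representable and rounds to 2.0, so the midpoint is lo itself.
        double mid = midpoint(lo, hi);

        System.out.println(mid == lo);            // prints "true"
        System.out.println(inRange(mid, lo, hi)); // prints "false"
    }
}
```

If that is what happens here, a check of the form "threshold > min" would fail even though the training data itself is valid.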

Zero3, Jun 03 '15 01:06

Do you have an example input for this?

jbasilico, Jun 03 '15 01:06

I can reproduce it by changing these two lines in the example:

int maxDepth = 10;
int minLeafSize = 10;

to:

int maxDepth = 5;
int minLeafSize = 5;

I use the input data I posted at https://gist.github.com/Zero3/55963dcf14c87e439668, which can be deserialized from a file with something like this:

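// Needs java.io.* and java.util.Collection, plus Foundry's InputOutputPair and Vector.
final Collection<InputOutputPair<Vector, String>> trainData;
try (ObjectInput input = new ObjectInputStream(new BufferedInputStream(new FileInputStream("algorithmfoundry-Foundry-issues-45.ser"))))
{
    // Unchecked cast: the file is expected to hold the serialized training collection.
    @SuppressWarnings("unchecked")
    Collection<InputOutputPair<Vector, String>> loaded = (Collection<InputOutputPair<Vector, String>>) input.readObject();
    trainData = loaded; // assigned outside the try scope so it can be passed to the learner
}
catch (IOException | ClassNotFoundException ex)
{
    throw new RuntimeException(ex);
}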

(Note that my OutputType is String, while the example uses Boolean.)
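
For anyone who wants to check the load step on its own, here is a minimal round-trip sketch using only the JDK. It writes a collection to a temporary file with Java serialization and reads it back the same way. It uses plain strings as a stand-in for the Foundry types, and the class and method names are hypothetical:

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: SerializationRoundTrip and roundTrip are hypothetical names.
// Strings stand in for InputOutputPair<Vector, String> so that no Foundry jar is needed.
final class SerializationRoundTrip {

    /** Serializes the list to a temporary file, reads it back, and returns the copy. */
    @SuppressWarnings("unchecked")
    static List<String> roundTrip(List<String> data) {
        try {
            Path file = Files.createTempFile("train-data", ".ser");
            try {
                try (ObjectOutputStream out = new ObjectOutputStream(
                        new BufferedOutputStream(Files.newOutputStream(file)))) {
                    out.writeObject(new ArrayList<>(data)); // ArrayList is Serializable
                }
                try (ObjectInputStream in = new ObjectInputStream(
                        new BufferedInputStream(Files.newInputStream(file)))) {
                    return (List<String>) in.readObject(); // unchecked cast, as in the snippet above
                }
            } finally {
                Files.deleteIfExists(file); // clean up the temporary file
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Arrays.asList("a", "b"))); // prints "[a, b]"
    }
}
```

Reading the real .ser file additionally requires the Foundry classes on the classpath, since deserialization has to resolve them.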

Zero3, Jun 03 '15 11:06

(Please note that the test case above uses the same wrong parameter names as the example in #6.)

Zero3, Jun 05 '15 12:06

I did some further testing with your new RandomForestFactory. I can consistently trigger the sanity check with minLeafSize = {2, 3, 4} when maxTreeDepth > 1.

Zero3, Jun 14 '15 14:06

Yes, I still need to look into this. Have you seen it happen when minLeafSize = 0?

jbasilico, Jun 15 '15 04:06

java.lang.IllegalArgumentException: minSplitSize must be positive (was 0).
    at gov.sandia.cognition.util.ArgumentChecker.assertIsPositive(ArgumentChecker.java:61)
    at gov.sandia.cognition.learning.algorithm.tree.AbstractVectorThresholdMaximumGainLearner.setMinSplitSize(AbstractVectorThresholdMaximumGainLearner.java:457)
    at gov.sandia.cognition.learning.algorithm.tree.AbstractVectorThresholdMaximumGainLearner.<init>(AbstractVectorThresholdMaximumGainLearner.java:86)
    at gov.sandia.cognition.learning.algorithm.tree.VectorThresholdInformationGainLearner.<init>(VectorThresholdInformationGainLearner.java:78)
    [...]

Zero3, Jun 16 '15 13:06