Unexpected Meka Evaluation Result
Hi,
The result of my evaluation is zero and I don't know why. My code is here:
try {
    ConverterUtils.DataSource dataSource = new ConverterUtils.DataSource(FILE_PATH); // original dataset
    Instances preparedDataSet = dataSource.getDataSet();
    preparedDataSet = filterUnsupervisedAttributes(preparedDataSet);
    preparedDataSet.setClassIndex(7);
    CRUpdateable classifier = new CRUpdateable();
    RandomForest randomForest = createRandomForest(1);
    classifier.setClassifier(randomForest);
    Instances trainingInstances = new Instances(dataSource.getStructure()); // temporary dataset for train
    trainingInstances = filterUnsupervisedAttributes(trainingInstances);
    trainingInstances.setClassIndex(7);
    Instances testInstances = new Instances(dataSource.getStructure()); // temporary dataset for test
    testInstances = filterUnsupervisedAttributes(testInstances);
    testInstances.setClassIndex(7);
    int countTestInstances = 0;
    int countTrainInstances = 0;
    boolean firstTrain = true;
    boolean benchTest = true;
    int numInst = preparedDataSet.numInstances();
    for (int row = 123; row < 5021; row++) {
        Instance trainingInstance = preparedDataSet.instance(row);
        trainingInstances.add(trainingInstance); // collect instances to use as training
        countTrainInstances++;
        if (firstTrain && countTrainInstances % 100 == 0) { // train the classifier with the first 100 instances (without any missing values)
            firstTrain = false;
            classifier.buildClassifier(trainingInstances);
        }
        if (!firstTrain) {
            benchTest = true;
            // classifier.updateClassifier(trainingInstance);
            for (int j = row + 1; j < row + 101; j++) {
                if (benchTest && countTestInstances != 100) { // add next 100 instances to testInstances
                    Instance testInstance = preparedDataSet.instance(j);
                    testInstances.add(testInstance);
                    countTestInstances++;
                    if (countTestInstances % 100 == 0) {
                        System.out.println("Evaluate CRUpdateable classifier on ");
                        String top = "PCut1";
                        String vop = "3";
                        Result result = Evaluation.evaluateModel(classifier, trainingInstances, testInstances, top, vop);
                        System.out.println("Evaluation available metrics: " + result.availableMetrics());
                        System.out.println("Evaluation Info: " + result.toString());
                        System.out.println("Levenshtein distance: " + result.getValue("Levenshtein distance"));
                        System.out.println("Type: " + result.getInfo("Type"));
                        countTestInstances = 0;
                        benchTest = false;
                        testInstances.delete();
                    }
                }
            }
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
The evaluation result is:
Evaluation Info: == Evaluation Info
Classifier meka.classifiers.multiltarget.incremental.CRUpdateable
Options [-W, weka.classifiers.trees.RandomForest, --, -P, 100, -I, 1, -num-slots, 1, -K, 0, -M, 1.0, -V, 0.001, -S, 1]
Additional Info
Dataset Missing_values_Predicted-weka.filters.unsupervised.attribute.RemoveType-Tstring
Number of labels (L) 7
Type MT
Verbosity 3
== Predictive Performance
N(test) 100
L 7
Hamming score 0
Exact match 0
Hamming loss 1
ZeroOne loss 1
Levenshtein distance 1
Label indices [ 0 1 2 3 4 5 6 ]
Accuracy (per label) [ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 ]
== Additional Measurements
Number of training instances 154
Number of test instances 100
Label cardinality (train set) 659.407
Label cardinality (test set) 676.757
Build Time 0.061
Test Time 0.006
Total Time 0.067
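A side note on the output above: with L = 7, a label cardinality of 659.407 cannot come from 0/1 labels, because the average number of active labels per instance is at most L. It suggests the seven target attributes hold larger numeric or many-valued codes, which would fit the reported type MT (multi-target). The sketch below illustrates that sanity check in plain Java. It assumes cardinality is the mean per-instance sum of the target values; the class name, method names, and data are made up for illustration and are not part of Meka.

```java
final class LabelCardinalityCheck {

    /** Mean per-instance sum of the first numLabels values (the target columns). */
    static double labelCardinality(double[][] rows, int numLabels) {
        double sum = 0.0;
        for (double[] row : rows) {
            for (int j = 0; j < numLabels; j++) {
                sum += row[j];
            }
        }
        return rows.length == 0 ? 0.0 : sum / rows.length;
    }

    /** For 0/1 labels the cardinality can never exceed the number of labels. */
    static boolean plausibleForBinaryLabels(double cardinality, int numLabels) {
        return cardinality >= 0.0 && cardinality <= numLabels;
    }

    public static void main(String[] args) {
        // Made-up rows: three target columns followed by one feature column.
        double[][] binary = { {1, 0, 1, 0.5}, {0, 0, 1, 0.7} };
        double[][] numeric = { {120, 80, 95, 0.5}, {130, 70, 90, 0.7} };

        double cBinary = labelCardinality(binary, 3);
        double cNumeric = labelCardinality(numeric, 3);

        System.out.println(cBinary + " plausible for 0/1 labels: "
                + plausibleForBinaryLabels(cBinary, 3));
        System.out.println(cNumeric + " plausible for 0/1 labels: "
                + plausibleForBinaryLabels(cNumeric, 3));
    }
}
```

If the targets really are numeric or have many distinct values, exact-match style accuracy per target can easily be zero even for a model that is close, so it is worth checking the attribute types first.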
From a quick glance, you seem to treat the data as you would in Weka. However, Meka works a bit differently. See the following examples:
- preparing the class attributes - MekaClassAttributes filter
- train/predict - MLUtils.prepareData(Instances) method
Final remark: you only seem to have a single class attribute...
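To make the first point concrete: as far as I understand Meka's convention, the target attributes sit at the front of the dataset, and the class index holds the number of targets L rather than the position of one class attribute. That would explain why setClassIndex(7) shows up as "Number of labels (L) 7" in the output. The following is a conceptual sketch in plain Java of the reordering such a preparation step performs. It is not Meka's implementation; TargetsFirst, parseIndices and reorder are hypothetical names, and it handles only comma-separated 1-based indices without duplicates, not ranges.

```java
final class TargetsFirst {

    /** Parses a comma-separated list of 1-based indices, e.g. "2,4", into 0-based indices. */
    static int[] parseIndices(String spec) {
        String[] parts = spec.split(",");
        int[] idx = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            idx[i] = Integer.parseInt(parts[i].trim()) - 1;
        }
        return idx;
    }

    /** Returns the column names with the chosen targets moved to the front, in the given order. */
    static String[] reorder(String[] names, int[] targets) {
        boolean[] isTarget = new boolean[names.length];
        String[] out = new String[names.length];
        int pos = 0;
        for (int t : targets) {
            isTarget[t] = true;
            out[pos++] = names[t];
        }
        // Remaining (feature) columns keep their original relative order.
        for (int i = 0; i < names.length; i++) {
            if (!isTarget[i]) {
                out[pos++] = names[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] names = {"f1", "y1", "f2", "y2"};
        int[] targets = parseIndices("2,4");
        String[] ordered = reorder(names, targets);
        // In this layout the number of targets plays the role of the class index.
        System.out.println(String.join(" ", ordered) + " | L = " + targets.length);
    }
}
```

With a layout like this, a "class index" of 7 means "the first seven columns are targets", which is different from Weka's "column 7 is the class".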
Thanks for your answer, you mentioned good points. I changed my code to do things the Meka way; the code is now as follows:
try {
    ConverterUtils.DataSource dataSource = new ConverterUtils.DataSource(FILE_PATH); // original dataset
    Instances preparedDataSet = dataSource.getDataSet();
    CRUpdateable classifier = new CRUpdateable();
    RandomForest randomForest = createRandomForest(1); // RandomForest is not an updateable classifier
    classifier.setClassifier(randomForest);
    Instances trainingInstances = new Instances(dataSource.getStructure());
    Instances testInstances = new Instances(dataSource.getStructure());
    int countTestInstances = 0;
    int countTrainInstances = 0;
    boolean firstTrain = true;
    boolean benchTest = true;
    for (int row = 123; row < 5021; row++) {
        Instance trainingInstance = preparedDataSet.instance(row);
        trainingInstances.add(trainingInstance); // collect instances to use as training
        countTrainInstances++;
        if (firstTrain && countTrainInstances % 100 == 0) {
            trainingInstances = PrepareClassAttributes(trainingInstances, "1,2,3,4,5,6,7");
            firstTrain = false;
            classifier.buildClassifier(trainingInstances);
        }
        if (!firstTrain) {
            benchTest = true;
            classifier.updateClassifier(trainingInstance);
            for (int j = row + 1; j < row + 101; j++) {
                if (benchTest && countTestInstances != 100) {
                    Instance testInstance = preparedDataSet.instance(j);
                    testInstances.add(testInstance);
                    countTestInstances++;
                    if (countTestInstances % 100 == 0) {
                        testInstances = PrepareClassAttributes(testInstances, "1,2,3,4,5,6,7");
                        System.out.println("Evaluate CRUpdateable classifier on ");
                        String top = "PCut1";
                        String vop = "3";
                        Result result = Evaluation.evaluateModel(classifier, trainingInstances, testInstances, top, vop);
                        System.out.println("Evaluation Info: " + result.toString());
                        countTestInstances = 0;
                        benchTest = false;
                        testInstances.delete();
                    }
                }
            }
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
But the accuracy is still zero, and the statistics look strange:
N(test) 100
L 7
Hamming score 0
Exact match 0
Hamming loss 1
ZeroOne loss 1
Levenshtein distance 1
Label indices [ 0 1 2 3 4 5 6 ]
Accuracy (per label) [ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 ]
Actually, the stats make sense given that there are 0 correct predictions. Without being familiar with your data, it is difficult to know whether this is 'strange' or not. Have you tried getting results using a simple test in the GUI first, or printing out the prediction for each instance?
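Printing the predictions needs very little code. The sketch below, in plain Java with made-up data, prints the actual and predicted target vector for each instance and then computes Hamming loss and exact match by hand. PredictionReport and its methods are hypothetical helpers, not Meka API; in the real program the predicted values would come from the classifier.

```java
final class PredictionReport {

    /** Fraction of (instance, target) pairs where the prediction differs from the truth. */
    static double hammingLoss(int[][] actual, int[][] predicted) {
        int wrong = 0;
        int total = 0;
        for (int i = 0; i < actual.length; i++) {
            for (int j = 0; j < actual[i].length; j++) {
                if (actual[i][j] != predicted[i][j]) {
                    wrong++;
                }
                total++;
            }
        }
        return total == 0 ? 0.0 : (double) wrong / total;
    }

    /** Fraction of instances whose whole target vector is predicted correctly. */
    static double exactMatch(int[][] actual, int[][] predicted) {
        int hits = 0;
        for (int i = 0; i < actual.length; i++) {
            if (java.util.Arrays.equals(actual[i], predicted[i])) {
                hits++;
            }
        }
        return actual.length == 0 ? 0.0 : (double) hits / actual.length;
    }

    public static void main(String[] args) {
        // Made-up values: every single target value is predicted wrongly.
        int[][] actual = { {3, 1}, {2, 0} };
        int[][] predicted = { {4, 0}, {1, 2} };

        for (int i = 0; i < actual.length; i++) {
            System.out.println("instance " + i
                    + " actual=" + java.util.Arrays.toString(actual[i])
                    + " predicted=" + java.util.Arrays.toString(predicted[i]));
        }
        System.out.println("Hamming loss: " + hammingLoss(actual, predicted));
        System.out.println("Exact match: " + exactMatch(actual, predicted));
    }
}
```

When no value matches, Hamming loss is 1 and exact match is 0, the same pattern as in the reported output. Seeing the raw actual and predicted vectors side by side usually shows quickly whether the predictions are nearly right, on a different scale, or aligned with the wrong attributes.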