
Unexpected Meka Evaluation Result

Open Mali-DS opened this issue 7 years ago • 3 comments

Hi, the result of my evaluation is zero and I don't know why. My code is here:

    try {
        ConverterUtils.DataSource dataSource = new ConverterUtils.DataSource(FILE_PATH); // original dataset
        Instances preparedDataSet = dataSource.getDataSet();
        preparedDataSet = filterUnsupervisedAttributes(preparedDataSet);
        preparedDataSet.setClassIndex(7);

        CRUpdateable classifier = new CRUpdateable();
        RandomForest randomForest = createRandomForest(1);
        classifier.setClassifier(randomForest);

        Instances  trainingInstances = new Instances(dataSource.getStructure()); // temporary dataset for train
        trainingInstances = filterUnsupervisedAttributes(trainingInstances);
        trainingInstances.setClassIndex(7);

        Instances testInstances = new Instances(dataSource.getStructure()); // temporary dataset for test
        testInstances = filterUnsupervisedAttributes(testInstances);
        testInstances.setClassIndex(7);
        int countTestInstances = 0;
        int countTrainInstances = 0;
        boolean firstTrain = true;
        boolean benchTest = true;
        int numInst = preparedDataSet.numInstances();
        for(int row = 123; row < 5021; row++) {
                Instance trainingInstance = preparedDataSet.instance(row);
                trainingInstances.add(trainingInstance); // collect instances to use as training
                countTrainInstances++;
                if (firstTrain && countTrainInstances%100 == 0 ) {  // train the classifier with the first 100 instances(without any missing values)
                    firstTrain = false;
                    classifier.buildClassifier(trainingInstances);
                }
                if(!firstTrain){
                    benchTest = true;

                    // classifier.updateClassifier(trainingInstance);

                    for(int j=row+1;j<row+101;j++){
                        if(benchTest && countTestInstances != 100) { // add next 100 instances to testInstance
                            Instance testInstance = preparedDataSet.instance(j);
                            testInstances.add(testInstance);
                            countTestInstances++;

                            if (countTestInstances % 100 == 0) {
                                System.out.println("Evaluate CRUpdateable classifier on ");
                                String top = "PCut1";
                                String vop = "3";
                                Result result = Evaluation.evaluateModel(classifier, trainingInstances , testInstances, top, vop);
                                System.out.println("Evaluation available metrics: " + result.availableMetrics());
                                System.out.println("Evaluation Info: " + result.toString());
                                System.out.println("Levenshtein distance: " + result.getValue("Levenshtein distance"));
                                System.out.println("Type: " + result.getInfo("Type"));
                                countTestInstances = 0;
                                benchTest = false;
                                testInstances.delete();
                            }
                        }
                    }
                }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }

The result of Evaluation is here:

Evaluation Info: == Evaluation Info

Classifier meka.classifiers.multiltarget.incremental.CRUpdateable
Options [-W, weka.classifiers.trees.RandomForest, --, -P, 100, -I, 1, -num-slots, 1, -K, 0, -M, 1.0, -V, 0.001, -S, 1]
Additional Info
Dataset Missing_values_Predicted-weka.filters.unsupervised.attribute.RemoveType-Tstring
Number of labels (L) 7
Type MT
Verbosity 3

== Predictive Performance

N(test) 100
L 7
Hamming score 0
Exact match 0
Hamming loss 1
ZeroOne loss 1
Levenshtein distance 1
Label indices [ 0 1 2 3 4 5 6 ]
Accuracy (per label) [ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 ]

== Additional Measurements

Number of training instances 154
Number of test instances 100
Label cardinality (train set) 659.407
Label cardinality (test set) 676.757
Build Time 0.061
Test Time 0.006
Total Time 0.067

Mali-DS, Oct 28 '18 23:10
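
One figure in the output above may be worth a second look: the label cardinality of about 659 with only 7 labels. Assuming label cardinality is computed as the mean, over instances, of the sum of the label values (an assumption about the metric, not something confirmed in this thread), 0/1 labels could never give a value above L = 7, so the targets here appear to hold large or many-valued entries. The plain-Java sketch below only illustrates that arithmetic; the class `Cardinality` and both toy arrays are made up for illustration.

```java
final class Cardinality {
    /** Mean over instances of the sum of the first numLabels values in each row. */
    static double labelCardinality(double[][] rows, int numLabels) {
        double sum = 0.0;
        for (double[] row : rows) {
            for (int j = 0; j < numLabels; j++) {
                sum += row[j];
            }
        }
        return sum / rows.length;
    }

    public static void main(String[] args) {
        // Hypothetical 0/1 labels: the result can never exceed the number of labels.
        double[][] binary = {{1, 0, 1}, {0, 0, 1}};
        // Hypothetical numeric targets: the result easily exceeds the number of labels.
        double[][] numeric = {{120.5, 80.0, 310.2}, {95.0, 70.5, 402.8}};
        System.out.println("binary targets:  " + labelCardinality(binary, 3));
        System.out.println("numeric targets: " + labelCardinality(numeric, 3));
    }
}
```

If the targets really are numeric or have many distinct values, an exact-match style accuracy of zero would not be surprising, so the attribute types of the first 7 columns may be worth checking.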

From a quick glance, you seem to treat the data like you would for Weka. However, Meka works a bit differently. See the following examples:

Final remark, you only seem to have a single class attribute...

fracpete, Oct 29 '18 00:10
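
To make the difference from Weka concrete: in Meka the class index is not the position of a single class attribute but the number of label attributes L, and the labels are expected to be the first L attributes of the dataset. Datasets usually carry this as a `-C` option in the relation name, which Meka's `MLUtils.prepareData` reads. The sketch below is plain Java and only illustrates that convention for the positive `-C` form; the class `LabelLayout`, its methods, and the sample relation name are hypothetical and not part of Meka.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LabelLayout {
    // Matches a positive "-C <n>" option inside a relation name.
    private static final Pattern C_OPTION = Pattern.compile("-C\\s+(\\d+)");

    /** Returns L parsed from a relation name such as "MyData: -C 7", or -1 if absent. */
    static int labelCount(String relationName) {
        Matcher m = C_OPTION.matcher(relationName);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    /** Attribute indices of the labels under the "first L attributes" convention. */
    static int[] labelIndices(int numLabels) {
        int[] idx = new int[numLabels];
        for (int j = 0; j < numLabels; j++) {
            idx[j] = j;
        }
        return idx;
    }

    public static void main(String[] args) {
        // Hypothetical relation name; the real dataset's name may differ.
        int numLabels = labelCount("Missing_values_Predicted: -C 7");
        System.out.println("L = " + numLabels);
        System.out.println("label indices = " + Arrays.toString(labelIndices(numLabels)));
    }
}
```

This is consistent with the output in the first post, where calling `setClassIndex(7)` led to "Number of labels (L) 7" and label indices 0 to 6.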

Thanks for your answer, you raised good points. I changed my code to do things the Meka way, and it now looks like this:

    try {
        ConverterUtils.DataSource dataSource = new ConverterUtils.DataSource(FILE_PATH); // original dataset
        Instances preparedDataSet = dataSource.getDataSet();

        CRUpdateable classifier = new CRUpdateable();
        RandomForest randomForest = createRandomForest(1);  // random forest is not an updateable classifier
        classifier.setClassifier(randomForest);

        Instances  trainingInstances = new Instances(dataSource.getStructure()); 
        Instances testInstances = new Instances(dataSource.getStructure());
        int countTestInstances = 0;
        int countTrainInstances = 0;
        boolean firstTrain = true;
        boolean benchTest = true;
        for(int row = 123; row < 5021; row++) {
                Instance trainingInstance = preparedDataSet.instance(row);
                trainingInstances.add(trainingInstance); // collect instances to use as training
                countTrainInstances++;
                if (firstTrain && countTrainInstances%100 == 0 ) { 
                    trainingInstances = PrepareClassAttributes(trainingInstances,"1,2,3,4,5,6,7");
                    firstTrain = false;
                    classifier.buildClassifier(trainingInstances);
                }
                if(!firstTrain){
                    benchTest = true;
                    classifier.updateClassifier(trainingInstance);
                    for(int j=row+1;j<row+101;j++){
                        if(benchTest && countTestInstances != 100) { 
                            Instance testInstance = preparedDataSet.instance(j);
                            testInstances.add(testInstance);
                            countTestInstances++;
                            if (countTestInstances % 100 == 0) {
                                testInstances = PrepareClassAttributes(testInstances,"1,2,3,4,5,6,7");
                                System.out.println("Evaluate CRUpdateable classifier on ");
                                String top = "PCut1"; 
                                String vop = "3";  
                                Result result = Evaluation.evaluateModel(classifier, trainingInstances , testInstances, top, vop);
                                System.out.println("Evaluation Info: " + result.toString());
                                countTestInstances = 0;
                                benchTest = false;
                                testInstances.delete();
                            }
                        }
                    }
                }
        }

    } catch (Exception e) {
        e.printStackTrace();
    }

But the accuracy is still zero, and the statistics look strange:

N(test) 100
L 7
Hamming score 0
Exact match 0
Hamming loss 1
ZeroOne loss 1
Levenshtein distance 1
Label indices [ 0 1 2 3 4 5 6 ]
Accuracy (per label) [ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 ]

Mali-DS, Oct 29 '18 09:10

Actually the stats results make sense given that there are 0 correct predictions. Without being familiar with your data, it is difficult to know if this is 'strange' or not. Have you tried getting results using a simple test in the GUI first? Or to print out the prediction for each instance?

jmread, Oct 30 '18 06:10
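
The relationship between the reported numbers can be checked by hand. The plain-Java sketch below computes Hamming score, Hamming loss and exact match from integer label matrices and prints each prediction next to its true value, which is the kind of per-instance check suggested in the last comment. It assumes the usual definitions (Hamming score as the fraction of individual label values predicted correctly, Hamming loss as one minus that, exact match as the fraction of instances with every label correct); the class `LabelMetrics` and the toy arrays are illustrative and come neither from Meka nor from the poster's data.

```java
import java.util.Arrays;

final class LabelMetrics {
    /** Fraction of individual label values that were predicted correctly. */
    static double hammingScore(int[][] truth, int[][] pred) {
        int correct = 0;
        int total = 0;
        for (int i = 0; i < truth.length; i++) {
            for (int j = 0; j < truth[i].length; j++) {
                if (truth[i][j] == pred[i][j]) {
                    correct++;
                }
                total++;
            }
        }
        return (double) correct / total;
    }

    /** Fraction of instances whose whole label vector was predicted correctly. */
    static double exactMatch(int[][] truth, int[][] pred) {
        int matches = 0;
        for (int i = 0; i < truth.length; i++) {
            if (Arrays.equals(truth[i], pred[i])) {
                matches++;
            }
        }
        return (double) matches / truth.length;
    }

    public static void main(String[] args) {
        // Made-up example in which no single label value is predicted correctly.
        int[][] truth = {{3, 1}, {2, 0}};
        int[][] pred = {{0, 0}, {1, 2}};
        for (int i = 0; i < truth.length; i++) {
            System.out.println("instance " + i + ": true=" + Arrays.toString(truth[i])
                    + " predicted=" + Arrays.toString(pred[i]));
        }
        double score = hammingScore(truth, pred);
        System.out.println("Hamming score = " + score);
        System.out.println("Hamming loss  = " + (1.0 - score));
        System.out.println("Exact match   = " + exactMatch(truth, pred));
    }
}
```

With zero correct label values the score is 0, the loss is 1 and exact match is 0, which matches the pattern in the reported results; the per-instance printout is what would show whether the predictions are merely wrong or are in a different value range from the true labels.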