
About misjudgment measures

keides2 opened this issue 6 years ago • 6 comments

@szaza

Hi, the model trained with darknet gives very good results (loss = 0.15, avg loss = 0.13, precision = 0.87, recall = 0.97, F1-score = 0.92, mAP = 0.96). However, after converting the weights file to a pb file and loading it into android-yolo-v2, object detection reacts too quickly: it classifies the photographed object instantly, so it appears to misjudge.

Can I change the code so that object detection starts after the same object appears for 1 second or more?

Or, I would like to make a change so that a detection is only accepted after the same result occurs three times in a row. Where should I fix the code?
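For reference, the second idea (accepting a result only after N identical consecutive detections) can be sketched as a small standalone helper, independent of the android-yolo-v2 code. The class and method names below are hypothetical, not part of the repository:

```java
// Hypothetical helper (not part of android-yolo-v2): accepts a detected
// class name only after it has been reported N times in a row.
public class DetectionDebouncer {
    private final int required;   // how many consecutive identical results are needed
    private String last = "";
    private int count = 0;

    public DetectionDebouncer(int required) {
        this.required = required;
    }

    // Feed the top detection of each frame; returns the class name once it
    // has been seen `required` times in a row, otherwise null.
    public String feed(String detected) {
        if (detected.equals(last)) {
            count++;
        } else {
            last = detected;
            count = 1;
        }
        return count >= required ? detected : null;
    }

    public static void main(String[] args) {
        DetectionDebouncer debouncer = new DetectionDebouncer(3);
        System.out.println(debouncer.feed("cow"));  // null
        System.out.println(debouncer.feed("cow"));  // null
        System.out.println(debouncer.feed("cow"));  // cow
    }
}
```

Feeding the top result of each camera frame into such a helper would suppress one-frame misjudgments while keeping the detection loop itself unchanged.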

Thank you in advance,

keides2

keides2 avatar Nov 07 '18 05:11 keides2

Hi, "Can I change the code so that object detection starts after the same object appears for 1 second or more?" - yes, you can modify the code according to this proposal; however, it sounds like a hack. I don't think this is the proper way to solve the issue, and I don't think the darknet implementation contains such a mechanism either.

"Or, I would like to make a change so that a detection is only accepted after the same result occurs three times in a row. Where should I fix the code?" - this solution also seems like a hack to me; the only difference is that it is probably easier to implement.

Please have a look at YOLOClassifier.java, put breakpoints into the classifyImage() method, and investigate the results you get. It is probably easier to debug YOLOClassifier.java with this implementation: https://github.com/szaza/tensorflow-example-java. That application also uses YOLOClassifier, but it lets you specify the input images one by one. Best wishes for your work, and please notify me about your results!

Best regards, Zoltan

szaza avatar Nov 07 '18 06:11 szaza

@szaza

Thank you for your reply. It seems that it will take time, but I'd like to investigate and try the modification.

Keides2

keides2 avatar Nov 09 '18 02:11 keides2

@szaza With the Gradle setup in Android Studio, I finally got an environment where I can debug. At the breakpoint, when the debugger exits the for loop, the title fields in priorityQueue become cow and bird. However, I cannot understand this result, so I do not know how to change the code. (screenshot attached)

keides2 avatar Nov 12 '18 07:11 keides2

Hi, the priority queue contains the detected objects whose confidence was higher than the specified confidence limit. As I can see, you have four elements in the priority queue, and the detected objects are cow and bird. For each element you can see a confidence value, which shows the chance of the detected object belonging to a given class. For example, for the first element the detector says that the chance of it being a cow is 88%. The location attribute gives you the position of the bounding box inside the image.
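The behavior described above can be reproduced in plain Java without the Android parts: detections below the confidence limit are dropped, and the queue orders the rest best-first. The `Detection` type, the sample values, and the 0.25 threshold below are illustrative, not the repository's actual code:

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class ConfidenceQueueDemo {
    // Illustrative stand-in for the repo's Recognition objects.
    public record Detection(String title, float confidence) {}

    // Keep only detections above the confidence limit, best-first.
    public static PriorityQueue<Detection> filter(Detection[] all, float limit) {
        PriorityQueue<Detection> queue = new PriorityQueue<>(
                Comparator.comparing(Detection::confidence).reversed());
        for (Detection d : all) {
            if (d.confidence() > limit) {
                queue.add(d); // drop low-confidence boxes
            }
        }
        return queue;
    }

    public static void main(String[] args) {
        Detection[] all = {
            new Detection("cow", 0.88f),
            new Detection("bird", 0.30f),
            new Detection("cow", 0.15f),   // below the limit, discarded
        };
        PriorityQueue<Detection> q = filter(all, 0.25f);
        System.out.println(q.peek().title()); // prints "cow"
    }
}
```

Polling such a queue yields detections in descending confidence order, which is why the 88% cow appears first in the debugger.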

szaza avatar Nov 12 '18 07:11 szaza

Hi @szaza,

Long time no see. This year I changed this code again.

Since the object detection judgment reacted too quickly, I made the following improvement.

[Specification] If the same object is detected four times in a row, it is determined that a new object has been detected; the name of the object is displayed in the display field, and a description of the object is spoken.

The modified code is as follows:

public class ClassifierActivity extends TextToSpeechActivity implements OnImageAvailableListener {
	...
    // keides2 added
    private String lastRecognizedClass = "";
    private String nowRecognizedClass = "";
    private String tts = "";
    private String msgInf1 = "";
    private String msgInf2 = "";
    private int matchCount = 0;
    private String lastResult = "";
    private String nowResult = "";

    @Override
    public void onPreviewSizeChosen(final Size size, final int rotation) { ... }

    @Override
    public void onImageAvailable(final ImageReader reader) {
        Image image = null;

        try {
			... 
        } catch (final Exception ex) {
			...
        }

        runInBackground(this::run);
    }

    private String makeTts(String nowRecognizedClass) {
        switch (nowRecognizedClass) {
            case "C型リモコン":	// Object-0: "C-type remote control"
                msgInf1 = "C型リモコンのケースは、13番、";	// Speak message 0-1: "The C-type remote control case goes in number 13,"
                msgInf2 = "プリント基板は19番です。";		// Speak message 0-2: "and the printed circuit board in number 19."
                break;
            case "軍手・革手":	// Object-1: "work gloves / leather gloves"
                msgInf1 = "軍手、耐カット軍手、皮手は、";		// Speak message 1-1: "Work gloves, cut-resistant gloves, and leather gloves"
                msgInf2 = "7番です。";					// Speak message 1-2: "go in number 7."
                break;
				
				...

            default:
                msgInf1 = "";
                msgInf2 = "";
                break;
        }
        return msgInf1 + msgInf2;
    }

    private void fillCroppedBitmap(final Image image) { ... }

    @Override
    protected void onDestroy() { ... }

    private void renderAdditionalInformation(final Canvas canvas) {
		...

		// keides2 added

		// makeTts() fills msgInf1/msgInf2 as a side effect;
		// the two message parts are drawn on separate lines.
		makeTts(nowRecognizedClass);

		lines.add(msgInf1);
		lines.add(msgInf2);

        borderedText.drawLines(canvas, 10, 10, lines);
    }

    private void run() {
        final long startTime = SystemClock.uptimeMillis();
        final List<Recognition> results = recognizer.recognizeImage(croppedBitmap);
        lastProcessingTimeMs = SystemClock.uptimeMillis() - startTime;
        overlayView.setResults(results);

        // keides2
        Log.d(LOGGING_TAG, "Debug: runInBackground()");

        if (!results.isEmpty()) {
            nowResult = results.get(0).getTitle();
            Log.d(LOGGING_TAG, String.format("Find: %s", nowResult));

            if (lastResult.equals(nowResult)) {
                matchCount++;
                if (matchCount >= 3) {
                    // 3 repeats after the first hit = the same title 4 times in a row
                    matchCount = 0;

                    if (!lastRecognizedClass.equals(nowResult)) {
                        nowRecognizedClass = nowResult;
                        Log.d(LOGGING_TAG, "Match 4 times: " + nowRecognizedClass);
                        lastRecognizedClass = nowRecognizedClass;

                        tts = makeTts(nowRecognizedClass);
                        Log.d(LOGGING_TAG, "makeTts(): " + tts);
                        if (!tts.isEmpty()) {
                            speak2(results, tts);
                            tts = "";
                        }
                    }
                }
            } else {
                // streak broken, start counting again
                matchCount = 0;
            }
            lastResult = nowResult;
            Log.d(LOGGING_TAG, String.format("matchCount: %d", matchCount));
        }

        requestRender();
        computing = false;
    }
}

Thank you, keides2

keides2 avatar Jul 22 '19 04:07 keides2

Hi, I'm very glad that you improved the existing code. Could you please create a new branch with the changes and create a pull request? Thanks very much for your suggestions!

szaza avatar Jul 23 '19 05:07 szaza