svm-vehicle-detector

Inference on a single image

Open rubeea opened this issue 4 years ago • 2 comments

Hi, thanks for this awesome project. How can we detect vehicles in a single image instead of a whole video file? Thanks.

rubeea, Jun 14 '20 17:06

Hey @rubeea, glad you found it useful. The actual inference takes place in the Detector.detectVideo method. If you strip away the video capture stuff, the while loop, and the summing of heatmaps from previous frames, you're left with the functionality for performing inference on a single frame/image (which is really how I should have structured things, in hindsight). If you did that, the method would look something like this:

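    # Relies on names already imported at module level: cv2, numpy as np,
    # label (from scipy.ndimage), and slidingWindow.
    def detectVideo(
        self, frame, threshold=120, min_bbox=None, draw_heatmap=True,
        draw_heatmap_size=0.2,
    ):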
        h, w = frame.shape[:2]

        # Store coordinates of all windows to be checked in the image.
        self.windows = slidingWindow(
            (w, h), init_size=self.init_size, x_overlap=self.x_overlap,
            y_step=self.y_step, x_range=self.x_range, y_range=self.y_range,
            scale=self.scale
        )

        if min_bbox is None:
            min_bbox = (int(0.02 * w), int(0.02 * h))

        # Heatmap inset size.
        inset_size = (int(draw_heatmap_size * w), int(draw_heatmap_size * h))

        heatmap = np.zeros((frame.shape[:2]), dtype=np.uint8)
        # np.int was removed in NumPy 1.24; use an explicit integer dtype.
        heatmap_labels = np.zeros_like(heatmap, dtype=np.int32)

        for (x_upper, y_upper, x_lower, y_lower) in self.classify(frame):
            heatmap[y_upper:y_lower, x_upper:x_lower] += 10

        cv2.dilate(heatmap, np.ones((7,7), dtype=np.uint8), dst=heatmap)

        if draw_heatmap:
            inset = cv2.resize(heatmap, inset_size, interpolation=cv2.INTER_AREA)
            inset = cv2.cvtColor(inset, cv2.COLOR_GRAY2BGR)
            frame[:inset_size[1], :inset_size[0], :] = inset

        # Ignore heatmap pixels below threshold.
        heatmap[heatmap <= threshold] = 0

        # Label remaining blobs with scipy.ndimage.label. Because an output
        # array is passed, only the number of objects is returned.
        num_objects = label(heatmap, output=heatmap_labels)

        # Determine the largest bounding box around each object.
        for obj in range(1, num_objects + 1):
            (Y_coords, X_coords) = np.nonzero(heatmap_labels == obj)
            x_upper, y_upper = min(X_coords), min(Y_coords)
            x_lower, y_lower = max(X_coords), max(Y_coords)

            # Only draw box if object is larger than min bbox size.
            if (
                x_lower - x_upper > min_bbox[0]
                and y_lower - y_upper > min_bbox[1]
            ):
                cv2.rectangle(
                    frame, (x_upper, y_upper), (x_lower, y_lower), (0, 255, 0), 6
                )

I just took a quick pass through the method and haven't actually checked that the above works, but hopefully it conveys what I'm trying to say. Let me know if it doesn't.
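
To make the heatmap step easier to experiment with, here is a minimal standalone sketch of the same idea: add 10 per positive window, zero everything at or below the threshold, then take one bounding box per remaining blob. It is not the repository's code. It uses only the standard library, a small 4-connected flood fill stands in for scipy's `label`, and the dilation and drawing steps are left out. The function name `boxes_from_windows` and the window coordinates are made up for illustration.

```python
from collections import deque


def boxes_from_windows(windows, size, threshold, min_bbox=(0, 0), weight=10):
    """Return (x_upper, y_upper, x_lower, y_lower) for each heatmap blob."""
    w, h = size
    heat = [[0] * w for _ in range(h)]
    # Each positive window adds `weight` to every pixel it covers.
    for x0, y0, x1, y1 in windows:
        for y in range(y0, y1):
            for x in range(x0, x1):
                heat[y][x] += weight

    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            # Pixels at or below the threshold are ignored.
            if seen[sy][sx] or heat[sy][sx] <= threshold:
                continue
            # Flood fill one 4-connected blob and record its pixels.
            seen[sy][sx] = True
            queue = deque([(sx, sy)])
            xs, ys = [sx], [sy]
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (
                        0 <= nx < w and 0 <= ny < h
                        and not seen[ny][nx] and heat[ny][nx] > threshold
                    ):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
                        xs.append(nx)
                        ys.append(ny)
            x_upper, y_upper, x_lower, y_lower = min(xs), min(ys), max(xs), max(ys)
            # Keep only blobs larger than the minimum box size.
            if x_lower - x_upper > min_bbox[0] and y_lower - y_upper > min_bbox[1]:
                boxes.append((x_upper, y_upper, x_lower, y_lower))
    return boxes


# Three overlapping hypothetical detections on a 40x30 image.
windows = [(5, 5, 20, 20), (8, 6, 22, 21), (10, 8, 24, 22)]
print(boxes_from_windows(windows, (40, 30), threshold=120))  # []
print(boxes_from_windows(windows, (40, 30), threshold=20))   # [(10, 8, 19, 19)]
```

Since each window adds 10 and pixels at or below the threshold are zeroed, a pixel needs at least 13 overlapping windows to survive `threshold=120`. Without the summing of heatmaps from previous frames, a single image may need a lower threshold, so it is worth trying a few values on your own images.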

nrsyed, Jun 16 '20 03:06

Hi, yeah, I tweaked the code in a similar manner and got it running for a single image. I have a few other questions regarding your detector:

  1. How did you come up with optimal parameters for the detection process?
  2. How can we generate negative samples for another kind of data, given that we have the positive samples?

Thanks and regards, Rabeea

rubeea, Jun 22 '20 06:06