svm-vehicle-detector
Inference on a single image
Hi, thanks for this awesome project. I wanted to know: if we only want to detect vehicles in a single image instead of a whole video file, how can we achieve that? Thanks.
Hey @rubeea, glad you found it useful. The actual inference takes place in the Detector.detectVideo method. If you strip away the video capture stuff, the while loop, and the summing of heatmaps from previous frames, you're left with the functionality for performing inference on a single frame/image (which is really how I should have structured things, in hindsight). If you did that, the method would look something like this:
# Assumes the same module-level imports as the original detector code:
# cv2, numpy as np, label from scipy.ndimage.measurements, and slidingWindow.
def detectVideo(
    self, frame, threshold=120, min_bbox=None, draw_heatmap=True,
    draw_heatmap_size=0.2,
):
    h, w = frame.shape[:2]

    # Coordinates of all sliding windows to be checked in the image.
    self.windows = slidingWindow(
        (w, h), init_size=self.init_size, x_overlap=self.x_overlap,
        y_step=self.y_step, x_range=self.x_range, y_range=self.y_range,
        scale=self.scale,
    )

    if min_bbox is None:
        min_bbox = (int(0.02 * w), int(0.02 * h))

    # Heatmap inset size.
    inset_size = (int(draw_heatmap_size * w), int(draw_heatmap_size * h))

    heatmap = np.zeros(frame.shape[:2], dtype=np.uint8)
    heatmap_labels = np.zeros_like(heatmap, dtype=int)

    # Accumulate classifier detections into the heatmap.
    for (x_upper, y_upper, x_lower, y_lower) in self.classify(frame):
        heatmap[y_upper:y_lower, x_upper:x_lower] += 10

    cv2.dilate(heatmap, np.ones((7, 7), dtype=np.uint8), dst=heatmap)

    if draw_heatmap:
        inset = cv2.resize(heatmap, inset_size, interpolation=cv2.INTER_AREA)
        inset = cv2.cvtColor(inset, cv2.COLOR_GRAY2BGR)
        frame[:inset_size[1], :inset_size[0], :] = inset

    # Ignore heatmap pixels below threshold.
    heatmap[heatmap <= threshold] = 0

    # Label remaining blobs with scipy.ndimage.measurements.label.
    num_objects = label(heatmap, output=heatmap_labels)

    # Determine the largest bounding box around each object.
    for obj in range(1, num_objects + 1):
        (Y_coords, X_coords) = np.nonzero(heatmap_labels == obj)
        x_upper, y_upper = min(X_coords), min(Y_coords)
        x_lower, y_lower = max(X_coords), max(Y_coords)

        # Only draw box if object is larger than min bbox size.
        if (
            x_lower - x_upper > min_bbox[0]
            and y_lower - y_upper > min_bbox[1]
        ):
            cv2.rectangle(
                frame, (x_upper, y_upper), (x_lower, y_lower), (0, 255, 0), 6
            )

    # Return the annotated image so the caller can display or save it.
    return frame
I just took a quick pass through the method and haven't actually checked that the above works, but hopefully it conveys what I'm trying to say. Let me know if it doesn't.
Hi, yeah, I tweaked the code in a similar manner and got it running for a single image. I have a few other questions regarding your detector:
- How did you come up with optimal parameters for the detection process?
- How can we generate negative samples for a different dataset, given that we already have the positive samples?
Thanks and regards, Rabeea