sahi
Fix for using BGR image in inference instead of RGB
Summary
I was using the "predict" CLI tool for video inference and noticed this issue after getting odd inference results. When I checked the implementation, I saw that the BGR2RGB color space conversion is not done for video inputs, while it is done for image inputs. As a result, the tool gives two different results for the same content:
- when you chop a video to frames and give the folder path of images as input
- when you give the video path as input
Here is my command:
sahi predict --slice_width 1080 --slice_height 1080 --overlap_height_ratio 0.2 --overlap_width_ratio 0.2 --model_confidence_threshold 0.25 --model_path "<my-yolo-model>.pt" --model_type yolov5 --source "20240315_145057000_iOS.mp4" --export_crop
Details:
- When you provide an image directory or image path to the tool, sahi reads the images with cv2 and converts them from BGR to RGB (since cv2 reads images in BGR order).
- The RGB image is then fed to the YOLO model for inference.
- After annotation, the image is converted back from RGB to BGR and saved.
- However, when you provide a video path, these BGR2RGB and RGB2BGR conversions are not performed. As a result, inference runs on a BGR image even if the model was trained on RGB images.
- I noticed the issue when I looked at the detection crops, which are saved after a color space conversion. Since the video frame is never converted to RGB, the conversion in the crop_object_predictions function produces an output image with the wrong color space.
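The mismatch described above can be illustrated with a minimal sketch. It is not sahi's code: swap_rb is a hypothetical helper that reverses the channel axis, which for a 3-channel image has the same effect as cv2's BGR2RGB or RGB2BGR conversion, and the single-pixel frame is made-up data.

```python
import numpy as np


def swap_rb(image: np.ndarray) -> np.ndarray:
    """Reverse the channel axis (hypothetical stand-in for BGR2RGB / RGB2BGR)."""
    return np.ascontiguousarray(image[..., ::-1])


# A 1x1 pure-red pixel as cv2 would decode it, i.e. in BGR order.
bgr_frame = np.array([[[0, 0, 255]]], dtype=np.uint8)

# Image path: BGR -> RGB before inference, then RGB -> BGR before saving.
image_path_saved = swap_rb(swap_rb(bgr_frame))

# Video path as described: no BGR -> RGB before inference,
# but the crop export still applies RGB -> BGR.
video_path_saved = swap_rb(bgr_frame)

print(image_path_saved[0, 0].tolist())  # [0, 0, 255] -> still red when read as BGR
print(video_path_saved[0, 0].tolist())  # [255, 0, 0] -> blue when read as BGR
```

In the image path the two conversions cancel out, so the saved crop matches the input. In the video path only one conversion is applied, so red and blue end up swapped in the saved crop.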
Reproduce:
You can reproduce the issue with any video and any YOLO model (add the --export_crop option to see the wrong colors in the saved crops).