How to crop the result with an OBB model
Search before asking
- [X] I have searched the YOLOv8 issues and discussions and found no similar questions.
Question
How can I crop the detected image using an OBB model? Currently I am using the standard YOLO version, and I can crop the predicted image using the following code:

```python
results = license_plate_detect(image)[0]
if results:
    # Extract the bounding box positions
    boxes = results.boxes.xyxy.tolist()
    for i, box in enumerate(boxes):
        # Get the top-left (x1, y1) and bottom-right (x2, y2) coordinates
        x1, y1, x2, y2 = box
        # Crop the license plate region to feed into PaddleOCR
        license_plate_crop = image[int(y1):int(y2), int(x1):int(x2)]
```

Now I want to switch to an OBB model. Can you guide me on how to get the cropped image from the result? Thank you.
Additional
No response
@NguyenDucQuan12 hello! 😊 Switching to an OBB (oriented bounding box) model means you'll be working with rotated bounding boxes. The `.boxes` attribute for an OBB model will contain the center coordinates, width, height, and angle in radians.
Here's how you can adjust your code to crop images based on OBB outputs:
```python
import cv2

results = license_plate_detect(image)[0]
if results:
    obbs = results.boxes.xywhr.tolist()  # Get OBBs in [x_center, y_center, width, height, angle] format
    for i, obb in enumerate(obbs):
        xc, yc, w, h, angle = obb  # Unpack the OBB
        # You'll need to perform additional steps to rotate and crop.
        # This is a simplified example, assuming `image` is your input image numpy array.
        center = (int(xc), int(yc))
        M = cv2.getRotationMatrix2D(center, angle, 1.0)  # Rotation matrix for the given angle
        # Apply the affine transformation, rotating the image
        rotated = cv2.warpAffine(image, M, image.shape[1::-1], flags=cv2.INTER_LINEAR)
        # Crop the rotated image around the center point,
        # keeping the crop coordinates within image bounds
        x1 = max(int(xc - w / 2), 0)
        y1 = max(int(yc - h / 2), 0)
        x2 = min(int(xc + w / 2), image.shape[1])
        y2 = min(int(yc + h / 2), image.shape[0])
        license_plate_crop = rotated[y1:y2, x1:x2]
```
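One caveat: `xywhr` reports the angle in radians, while `cv2.getRotationMatrix2D` expects degrees, so convert before building the rotation matrix. A minimal adjustment to the loop above:

```python
import math

angle_deg = math.degrees(angle)  # xywhr angle is in radians; OpenCV wants degrees
M = cv2.getRotationMatrix2D(center, angle_deg, 1.0)
```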
This example demonstrates the basic flow for rotating the whole image based on each OBB's angle and then cropping. Keep in mind, more complex scenarios might require additional processing for accuracy. Hope this helps! 🚗
@glenn-jocher I get images from a camera through an RTSP stream and process each frame. When I detect the license plate and try to get the xywhr value, I get this error:

```
    obbs = results.boxes.xywhr.tolist()  # Get OBBs in [x_center, y_center, width, height, angle] format
           ^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'xywhr'
```
I trained the model using Google Colab and got two .pt files; when I use them I get this error.
best-yolo-obb.zip
roboflow_dataset
Here are the two model files and the images I tested.
@NguyenDucQuan12 hello there! 🌞 It seems like the model files and images you've shared are related to an issue you're encountering. Could you please provide more details about the error messages or unexpected behaviors you're experiencing?
For example, if you're dealing with an `AttributeError` related to `xywhr`, as mentioned in your previous message, please ensure that you're using a model specifically trained for OBB (oriented bounding box) prediction. If the model isn't correctly set up for OBB, this attribute won't be available.
Also, when loading and using your model, make sure it's done correctly:
```python
from ultralytics import YOLO

# Load your custom model
model = YOLO('path/to/your/model.pt')

# Process an image (predict returns a list of Results objects)
results = model('path/to/your/image.jpg')
result = results[0]

# Now you can access OBB attributes if your model supports them
if result.obb is not None:
    obbs = result.obb.xywhr.tolist()
else:
    print("This model does not support OBB predictions.")
```
Regarding the image and files you shared, ensuring they're accessible and properly linked in your script is crucial. For the images hosted on GitHub, confirm that the URLs are correct and publicly accessible.
I'd love to help you sort this out, so any additional context on the issue would be greatly appreciated! 🛠️
@NguyenDucQuan12 did you solve the boxes=None issue? I had the same problem with OBB model inference. My model can detect objects just fine, but results[0] looks like this, which doesn't make sense:
```
masks: None
names: {0: 'Agama', 1: 'Alamat', 2: 'Berlaku', 3: 'Face', 4: 'Foto', 5: 'Goldar', 6: 'JK', 7: 'Kabupaten', 8: 'Kec', 9: 'Kel-Desa', 10: 'NAMA', 11: 'NIK', 12: 'Pekerjaan', 13: 'Provinsi', 14: 'RT-RW', 15: 'Signature', 16: 'Status', 17: 'TTL', 18: 'Warga-Negara'}
obb: ultralytics.engine.results.OBB object
orig_img: array([[[215, 215, 215], ..., [226, 226, 226]], ...], dtype=uint8)
orig_shape: (1600, 1200)
path: '/content/grayscale_image3.jpeg'
probs: None
save_dir: 'runs/obb/predict'
speed: {'preprocess': 7.385492324829102, 'inference': 3030.5862426757812, 'postprocess': 15.470743179321289}
```
Why are the probs and boxes None while my model can actually detect objects?
@kamilalfian Yeah I got the same issue😥😥
```python
from ultralytics import YOLO
import cv2
import numpy as np

# Load a custom-trained OBB model
model = YOLO('runs/obb/train/weights/best.pt')

# Run inference (returns a list of Results objects)
results = model.predict('test/image')
image = cv2.imread('test/image')

# Process results list
for result in results:
    obb = result.obb  # Oriented boxes object for OBB outputs
    point = result.obb.xywhr.tolist()
    for i, ob in enumerate(point):
        xc, yc, w, h, angle = ob  # Unpack the OBB
        # Additional steps are needed to rotate and crop.
        # This is a simplified example, assuming `image` is your input image numpy array.
        center = (int(xc), int(yc))
        M = cv2.getRotationMatrix2D(center, angle, 1.0)  # Rotation matrix for the given angle
        # Apply the affine transformation, rotating the image
        rotated = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]), flags=cv2.INTER_LINEAR)
        # Create a rectangle enclosing the rotated license plate
        rect = ((xc, yc), (h, w), angle)  # swapping w and h
        box = cv2.boxPoints(rect)
        box = box.astype(np.int32)
        # Get the rotated bounding box coordinates
        x1, y1 = np.min(box, axis=0)
        x2, y2 = np.max(box, axis=0)
        # Crop the rotated image
        license_plate_crop = rotated[y1:y2, x1:x2]
        # Save the cropped license plate
        cv2.imwrite(f'license_plate_{i}.jpg', license_plate_crop)
```
This is how I crop the detected image; you can also refer to this.
@kamilalfian I still can't solve it. The model can predict rotated objects (I use the imshow command and see that the bounding box has rotated), but cropping the image doesn't work, so I decided to return to the normal YOLO version.
@samthakur587 I will try tomorrow
@samthakur587 I tried your method of cropping the image, but the result does not match the bounding box
@glenn-jocher When I try again with the algorithm you provided, the results don't seem to be much different. How can I tweak the algorithm to make it better?
Hey there! It looks like you're still facing some challenges with the cropping algorithm. To refine the results, you might consider adjusting the rotation angle or the order of operations slightly. Here’s a quick tweak you can try:
Ensure the rotation angle used in the transformation matrix is correctly calculated as negative if needed, since the rotation direction can affect the final output. Also, double-check the coordinates used for cropping to ensure they align with the rotated image's dimensions.
Here's a small modification to the code snippet:
```python
# Correct angle for rotation direction
angle = -angle  # Ensure the rotation is in the correct direction

# Apply affine transformation
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

# Ensure coordinates are within image bounds
x1, y1 = np.clip(np.min(box, axis=0), 0, None)
x2, y2 = np.clip(np.max(box, axis=0), None, [image.shape[1], image.shape[0]])

# Crop the image
license_plate_crop = rotated[y1:y2, x1:x2]
```
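If the rotation-based crop keeps misaligning, another option is to skip rotating the whole image and instead warp the box's four corner points straight to an upright rectangle. A minimal sketch, assuming the corners are available via `obb.xyxyxyxy` and that `w` and `h` are the box dimensions already unpacked from `xywhr` (you may need to reorder the corners to match the destination order):

```python
# Four corner points of the i-th rotated box, shape (4, 2)
pts = result.obb.xyxyxyxy[i].cpu().numpy().astype(np.float32)

# Map the corners onto an upright w x h rectangle
dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]], dtype=np.float32)
M = cv2.getPerspectiveTransform(pts, dst)
license_plate_crop = cv2.warpPerspective(image, M, (int(w), int(h)))
```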
Give this a try and let us know how it goes! 🚀
@glenn-jocher The result seems to be even worse; the rotation angle does not produce the expected results.
```python
for result in results:
    obb = result.obb  # Oriented boxes object for OBB outputs
    point = result.obb.xywhr.tolist()
    for i, ob in enumerate(point):
        xc, yc, w, h, angle = ob  # Unpack the OBB
        angle = -angle  # Ensure the rotation is in the correct direction
        center = (int(xc), int(yc))
        M = cv2.getRotationMatrix2D(center, angle, 1.0)  # Rotation matrix for the given angle
        # Apply the affine transformation, rotating the image
        rotated = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))
        # Ensure coordinates are within image bounds
        rect = ((xc, yc), (h, w), angle)  # swapping w and h
        box = cv2.boxPoints(rect)
        box = box.astype(np.int32)
        x1, y1 = np.clip(np.min(box, axis=0), 0, None)
        x2, y2 = np.clip(np.max(box, axis=0), None, [image.shape[1], image.shape[0]])
        license_plate_crop = rotated[y1:y2, x1:x2]
        # Save the cropped license plate
        cv2.imwrite(f'license_plate_{i}.jpg', license_plate_crop)
```
Hey @NguyenDucQuan12! It looks like the rotation might still be off. Sometimes the angle needs to be adjusted based on how it's interpreted by the rotation function. Try converting the angle from radians to degrees, or adjust the sign again. Also, ensure that the width and height are correctly assigned when setting up the rectangle for cropping. Here's a quick tweak:
```python
angle = angle if some_condition else -angle  # Adjust based on your angle interpretation
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))
rect = ((xc, yc), (w, h), angle)  # Ensure w and h are correctly used
box = cv2.boxPoints(rect)
box = box.astype(np.intp)  # np.int0 is removed in NumPy 2.0
x1, y1 = np.min(box, axis=0)
x2, y2 = np.max(box, axis=0)
license_plate_crop = rotated[y1:y2, x1:x2]
```
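Putting the pieces together, here is a hedged end-to-end version of the rotate-and-crop approach that converts the radian angle from `xywhr` to degrees before rotating. This is a sketch, not a guaranteed fix: depending on your model's angle convention you may still need to negate the angle.

```python
import math

import cv2

def crop_obb(image, ob):
    """Rotate the image so the box is upright, then crop it.

    `ob` is one [xc, yc, w, h, angle] entry from obb.xywhr.tolist(),
    with the angle in radians.
    """
    xc, yc, w, h, angle = ob
    # OpenCV expects degrees; negate the angle here if your crops come out tilted
    M = cv2.getRotationMatrix2D((xc, yc), math.degrees(angle), 1.0)
    rotated = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))
    # After the rotation the box is axis-aligned around (xc, yc)
    x1, y1 = max(int(xc - w / 2), 0), max(int(yc - h / 2), 0)
    x2, y2 = min(int(xc + w / 2), rotated.shape[1]), min(int(yc + h / 2), rotated.shape[0])
    return rotated[y1:y2, x1:x2]
```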
Let's see if this aligns better with your expectations! 🛠️
@glenn-jocher Looks like I'll have to find a more suitable direction. Thank you for your enthusiastic help.
```python
import time

import cv2
import numpy as np
from ultralytics import YOLO

# Load a custom-trained OBB model
model = YOLO('assest/model/best-yolo-obb.pt')

# Run inference (returns a list of Results objects)
results = model.predict('assest/image/image_test/2019_12_16_17_54_37_PM423138689.jpg', show=True)
image = cv2.imread('assest/image/image_test/2019_12_16_17_54_37_PM423138689.jpg')
time.sleep(3)

# Process results list
for result in results:
    obb = result.obb  # Oriented boxes object for OBB outputs
    point = result.obb.xywhr.tolist()
    for i, ob in enumerate(point):
        xc, yc, w, h, angle = ob  # Unpack the OBB
        center = (int(xc), int(yc))
        M = cv2.getRotationMatrix2D(center, angle, 1.0)  # Rotation matrix for the given angle
        # Apply the affine transformation, rotating the image
        rotated = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]), flags=cv2.INTER_LINEAR)
        # Crop the rotated image around the center point
        rect = ((xc, yc), (w, h), angle)  # Ensure w and h are correctly used
        box = cv2.boxPoints(rect)
        box = np.intp(box)
        x1, y1 = np.min(box, axis=0)
        x2, y2 = np.max(box, axis=0)
        license_plate_crop = rotated[y1:y2, x1:x2]
        # Save the cropped license plate
        cv2.imwrite(f'license_plate_{i}.jpg', license_plate_crop)
```
@NguyenDucQuan12 you're welcome! I'm glad I could help. If you decide to explore other directions, feel free to share your progress or any new challenges you encounter. The YOLO community and Ultralytics team are always here to support you. Best of luck with your project! 🚀
If you need further assistance, don't hesitate to reach out. Happy coding!
@glenn-jocher I have another question: my software is showing signs of a memory leak. When I checked with memory_profiler, I realized that detecting objects with YOLO and reading characters with OCR increase RAM usage, but after detection is complete the memory is not released properly. How can I free that old memory?
```python
# Read the characters from the license plate cropped out of the original image
@profile
def get_license_plate(license_plate_crop):
    is_license_plate = False  # default
    result_license_plate = ocrEngine.ocr(license_plate_crop, cls=True)[0]
    # Convert the image from a numpy array to a PIL image
    license_plate_crop_cvt = Image.fromarray(license_plate_crop)
    if result_license_plate:
        # Join the characters from the two rows of the plate together,
        # e.g. 38-F7 and 390.01
        license_plate = [line[1][0] for line in result_license_plate]
        # Uppercase the letters
        license_plate = [i.upper() for i in license_plate]
        license_plate = ''.join(license_plate)  # remove the whitespace
        is_license_plate, license_plate = license_complies_format(license_plate)
    else:
        license_plate = '000000000'
    return is_license_plate, license_plate_crop_cvt, license_plate

# Default values
license_plate_error = Image.open("assest/image/img_src/error.png")  # default

# Crop the image using the rotated bounding box
@profile
def predict(image, save=True):
    is_license_plate = False  # default
    license_plate = error_no_license_plate  # default
    img_path = None  # default
    results = license_plate_detect(image, verbose=False)[0]
    for result in results:
        obb = result.obb  # Oriented boxes object for OBB outputs
        point = result.obb.xywhr.tolist()
        for i, ob in enumerate(point):
            xc, yc, w, h, angle = ob  # Unpack the OBB
            center = (int(xc), int(yc))
            M = cv2.getRotationMatrix2D(center, angle, 1.0)  # Rotation matrix for the given angle
            # Apply the affine transformation, rotating the image
            rotated = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]), flags=cv2.INTER_LINEAR)
            # Crop the rotated image around the center point,
            # keeping the crop coordinates within image bounds
            x1 = max(int(xc - w / 2), 0)
            y1 = max(int(yc - h / 2), 0)
            x2 = min(int(xc + w / 2), image.shape[1])
            y2 = min(int(yc + h / 2), image.shape[0])
            license_plate_crop = rotated[y1:y2, x1:x2]
            is_license_plate, license_plate_crop_cvt, license_plate = get_license_plate(license_plate_crop)
            # Save and display the image on screen
            if save:
                img_path = save_image_license_plate(is_license_plate, license_plate_crop)
            return license_plate, license_plate_crop_cvt, is_license_plate, img_path
    return license_plate, license_plate_error, is_license_plate, img_path
```
@glenn-jocher The first time an object was detected, memory increased by 45MB, which I think is because it needed to store the model, but on subsequent detections it increases by about 20MB each time. Is there a way to delete the leftovers (leaving only the memory reserved for the model) after each object detection and OCR pass?
Hi @NguyenDucQuan12,
Thank you for sharing the details and the memory profile. It’s indeed common for the initial memory spike to occur due to model loading. However, the subsequent increases you’re observing could be due to residual data from the detection and OCR processes.
To mitigate this, you can manually clear variables and invoke garbage collection after each detection and OCR operation. Here’s a quick example:
```python
import gc

def clear_memory():
    gc.collect()

# After detection and OCR
license_plate, license_plate_crop_cvt, is_license_plate, img_path = predict(image)
clear_memory()
```
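If you're running inference on a GPU, clearing PyTorch's CUDA cache alongside garbage collection can also help; a small optional addition, assuming the default PyTorch backend:

```python
import gc

import torch

def clear_memory():
    gc.collect()  # reclaim unreferenced Python objects
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return cached GPU memory to the driver
```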
This should help in freeing up memory that’s no longer in use. If the issue persists, you might want to profile specific parts of your code to pinpoint the exact source of the memory leak.
Feel free to reach out if you have any more questions! 😊
@glenn-jocher I am running multi-threaded object detection, which means there can be up to 4 detection and OCR threads running in parallel at the same time. When I call gc.collect() in any thread (because gc.collect() sits at the end of the OCR script), sometimes it deletes the SQL Server cursor, and sometimes it crashes the program, which exits suddenly. With multi-threading, the RAM increases very quickly and shows no signs of decreasing once processing is done, even after I wait for 5 minutes, and I also can't call gc.collect() continuously in a thread because that can cause unwanted events. How should I handle this problem to get the best results?
Hi @NguyenDucQuan12,
Managing memory in a multi-threaded environment, especially with intensive tasks like object detection and OCR, can indeed be challenging. It sounds like the use of `gc.collect()` is causing some unintended side effects in your application.
In multi-threaded applications, it's generally best to minimize shared state and ensure that resources are properly managed within each thread. Here are a few suggestions:

1. **Localize Resource Management**: Ensure that each thread cleans up its own resources once it's done with them. This includes dereferencing any large objects and ensuring that database connections are properly managed.
2. **Thread-Specific Data**: Use thread-local data wherever possible. This can help prevent interference between threads (see the sketch after this list).
3. **Profiling**: Since the memory isn't decreasing as expected, it might be helpful to use a profiling tool to identify specific areas where memory is not being released.
4. **Database Connections**: If you're using database connections in your threads, consider using a connection pool with a fixed number of connections that threads can check out and return. This can help manage database resources more efficiently and prevent issues with cursors being unexpectedly deleted (also illustrated below).
5. **Error Handling**: Implement robust error handling within each thread to manage and log exceptions effectively. This can help prevent the entire program from crashing when an issue occurs in a single thread.
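To make points 2 and 4 concrete, here is a minimal sketch; `make_ocr_engine()` and `make_connection()` are hypothetical stand-ins for your PaddleOCR and SQL Server setup:

```python
import queue
import threading

_local = threading.local()

def get_ocr_engine():
    """Return a per-thread OCR engine so threads never share one instance."""
    if not hasattr(_local, "ocr"):
        _local.ocr = make_ocr_engine()  # hypothetical: builds your PaddleOCR engine
    return _local.ocr

class ConnectionPool:
    """Fixed-size pool of DB connections that threads check out and return."""

    def __init__(self, size=4):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(make_connection())  # hypothetical: opens a SQL Server connection

    def acquire(self):
        return self._pool.get()  # blocks until a connection is free

    def release(self, conn):
        self._pool.put(conn)
```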
If these suggestions don't resolve the issue, you might need to consider restructuring parts of your application for better isolation between threads or reducing the number of concurrent threads if possible.
Hope this helps! Let us know how it goes. 😊
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
- Docs: https://docs.ultralytics.com
- HUB: https://hub.ultralytics.com
- Community: https://community.ultralytics.com
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐