Improve AI annotation performance on small objects

Open fystero opened this issue 1 month ago • 2 comments

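Small objects are usually tiny relative to the original image, yet the original backend feeds the entire image to the model. As a result, only very faint features of the target remain after the model's encoder, which in turn makes the mask produced by the decoder inaccurate.

When annotating small objects, users usually zoom in so that only the target and its surrounding area are on the canvas. Taking advantage of this, the backend can send the model only the view the user sees rather than the whole original image. The target then appears larger to the model, more of its features survive the encoder, and the output mask is more accurate.

[Image: TinyObj-on]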

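  • ✅ Added a button to turn the small-object AI annotation enhancement on or off
  • ✅ Adapted to all SAM-family models
  • ✅ Kept the encoder feature cache: as long as the view on the canvas is unchanged, annotating the next target can still reuse the features produced by the previous annotation, which speeds up inference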
  • ✅ Fixed a bug that could occur when the user moved the canvas while using multiple shapes as prompts
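
The viewport idea and the cache behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: `clamp_viewport`, `to_crop_coords`, `to_image_coords`, and `ViewportFeatureCache` are hypothetical names, and the encoder is stood in for by any callable that takes a viewport.

```python
from typing import Callable, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in original-image pixels
Point = Tuple[int, int]


def clamp_viewport(view: Box, width: int, height: int) -> Box:
    """Clip the visible canvas region to the image bounds."""
    x0, y0, x1, y1 = view
    return max(0, x0), max(0, y0), min(width, x1), min(height, y1)


def to_crop_coords(point: Point, view: Box) -> Point:
    """Shift a prompt point from image coordinates into the cropped view."""
    return point[0] - view[0], point[1] - view[1]


def to_image_coords(point: Point, view: Box) -> Point:
    """Shift a mask point from the cropped view back to image coordinates."""
    return point[0] + view[0], point[1] + view[1]


class ViewportFeatureCache:
    """Re-run the encoder only when the image or the visible view changes."""

    def __init__(self, encode: Callable[[Box], object]):
        self._encode = encode  # hypothetical encoder: viewport -> features
        self._key: Optional[Tuple[str, Box]] = None
        self._features: object = None
        self.encoder_calls = 0

    def get(self, image_id: str, view: Box) -> object:
        key = (image_id, view)
        if key != self._key:
            # The view moved or the image changed, so the old features are stale.
            self._features = self._encode(view)
            self._key = key
            self.encoder_calls += 1
        return self._features
```

Keying the cache on both the image and the viewport is one way to keep the "same view, reuse features" behaviour while still invalidating the features as soon as the user pans or zooms.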

fystero avatar Nov 11 '25 09:11 fystero

The idea behind this proposal is good, but I have a question. The enlarged local image obtained by zooming the canvas is not a lossless enlargement. Lossy enlargement inevitably loses some of the target's texture features, and the loss is more noticeable for small targets. So is the improvement from local-image inference over full-image inference actually significant?

zhixuwei avatar Nov 12 '25 00:11 zhixuwei

Before an image is fed to the model for inference, it is uniformly resized to a fixed size such as 684. With this solution, if the target to be annotated is originally smaller than 684, every detail from the original image is retained after resizing, with no loss. If it is larger than 684, as much detail as possible is still preserved after resizing.
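
To make the arithmetic concrete, here is a small sketch. It assumes the source is scaled so that its longest side equals the model input size (684 here, taken from the comment above); the function name and the example image sizes are made up for illustration, and the real preprocessing may differ.

```python
def size_on_model_input(object_px: float, source_long_side: float,
                        model_input: int = 684) -> float:
    """Approximate object size, in pixels, after the source is resized so
    that its longest side equals model_input (aspect ratio preserved)."""
    if source_long_side <= 0:
        raise ValueError("source_long_side must be positive")
    return object_px * model_input / source_long_side


# A 40 px object in a 4000 px wide photo, sent to the model as a whole image:
full_image = size_on_model_input(40, 4000)  # about 6.8 px

# The same object when only a 400 px wide view around it is sent:
zoomed_view = size_on_model_input(40, 400)  # about 68 px

print(f"full image: {full_image:.1f} px, zoomed view: {zoomed_view:.1f} px")
```

Under this assumption the full-image path shrinks the object to a handful of pixels before the encoder sees it, while the cropped view keeps every original pixel of the object.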

fystero avatar Nov 14 '25 03:11 fystero

This is a nice feature to have. SAM3 is very good at detecting small objects or persons, but the masks/bboxes are not perfect and need manual adjustment. I hope this gets included soon!

s3ni0r avatar Dec 06 '25 20:12 s3ni0r

TinyObj Mode Feature Release

We've implemented a new TinyObj mode for Segment Anything models to improve small object detection accuracy in high-resolution images.

What's New:

  • Local cropping around rectangle prompts with configurable padding
  • Automatic processing of cropped regions and mapping back to original coordinates
  • Available for Segment Anything 2 and GroundingSAM2 models
  • Configurable padding_ratio parameter (default: 20%) in model config files
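
As a rough sketch of the crop-and-map-back step listed above: the code below assumes `padding_ratio` is applied to the rectangle's width and height on each side, which is a guess about the semantics rather than something confirmed here. `padded_crop` and `polygon_to_image` are hypothetical helper names, not functions from the release.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in original-image pixels
Point = Tuple[int, int]


def padded_crop(box: Box, padding_ratio: float, width: int, height: int) -> Box:
    """Grow a rectangle prompt by padding_ratio of its size on every side,
    then clip the result to the image bounds."""
    x0, y0, x1, y1 = box
    pad_x = (x1 - x0) * padding_ratio  # assumed: padding relative to box width
    pad_y = (y1 - y0) * padding_ratio  # assumed: padding relative to box height
    return (
        max(0, round(x0 - pad_x)),
        max(0, round(y0 - pad_y)),
        min(width, round(x1 + pad_x)),
        min(height, round(y1 + pad_y)),
    )


def polygon_to_image(polygon: List[Point], crop: Box) -> List[Point]:
    """Translate polygon vertices predicted on the crop back to image coordinates."""
    off_x, off_y = crop[0], crop[1]
    return [(x + off_x, y + off_y) for x, y in polygon]


# Rectangle prompt on a 4000x3000 image, padded by the default 20%.
crop = padded_crop((100, 100, 200, 150), 0.2, 4000, 3000)
# Vertices returned by the model in crop coordinates, shifted back to the image.
mask_polygon = polygon_to_image([(5, 5), (20, 10)], crop)
```

The padding gives the model some context around the object, and because the crop is a pure translation of the original pixels, mapping results back only needs the crop's top-left offset.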

How to Use: Enable the TinyObj button when working with high-resolution images containing small objects. The feature automatically crops around rectangle prompts, processes the region, and maps results back to the original image coordinates.

Update: This feature is now available in v3.3.3. Update to the latest version to try it out.

Acknowledgments: Special thanks to @fystero for providing the original idea and approach that inspired this implementation.

For more details, please refer to the segmentation example documentation.

https://github.com/user-attachments/assets/1d4b1071-29ed-4e4f-843d-1c77772c05c4

CVHub520 avatar Dec 07 '25 08:12 CVHub520