position, colour, and background colour of text labels in draw_bounding_boxes
🚀 The feature
Text labels from torchvision.utils.draw_bounding_boxes are currently always inside the box with origin at the top left corner of the box, without a background colour, and the same colour as the bounding box itself. These are three things that would be nice to control.
Motivation, pitch
The problem with the current implementation is that the label is hard to read, particularly when the bounding box is filled (because the text has the same colour as the fill and is placed inside the box).
For example, this is the result from the current implementation:
Moving the label to outside the box already makes things better:
But by controlling those three things (placement of the label, background colour behind the label, and text colour) one could adapt the labels to whatever image they have. For what it's worth, in the original issue for this feature, the only example image had labels outside the box, text coloured differently from the box (black), and a background of the same colour as the box. See https://github.com/pytorch/vision/issues/2556#issuecomment-671344086
I'm happy to contribute this but want to know if this will be accepted and with what interface.
Thanks for opening this issue @carandraug. I think the proposal is reasonable, the current position of the label does make them difficult to read.
In terms of API / functionality, what exactly would you have in mind?
Hi, I'd have the same request!
- Option 1: How about adding a 'text_position' argument to torchvision.utils.draw_bounding_boxes, with options 'inside', 'above', 'below', 'left', 'right', and 'auto' (with 'auto' taking one of the options as default but falling back to a different one if the text would end up outside the image or on top of other text)? I guess 'auto' would be a next iteration :)
- Option 2: The same 'text_position' argument could instead be a function returning the x, y coordinates where the text should start relative to the bbox.
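To make the two options concrete, here is a minimal sketch of how they could be resolved into a text origin. The helper name `label_origin` and its signature are hypothetical and not part of torchvision; boxes are assumed to be (xmin, ymin, xmax, ymax) in pixel coordinates with y growing downwards, and 'auto' is left out.

```python
def label_origin(box, text_size, text_position="inside"):
    """Return the (x, y) top-left corner at which to draw a label.

    Hypothetical helper, not an existing torchvision function.
    box: (xmin, ymin, xmax, ymax); text_size: (width, height) of the label.
    """
    xmin, ymin, xmax, ymax = box
    text_w, text_h = text_size

    # Option 2: a callable decides the origin itself.
    if callable(text_position):
        return text_position(box, text_size)

    # Option 1: a fixed set of named positions.
    origins = {
        "inside": (xmin, ymin),           # current behaviour: top-left corner
        "above": (xmin, ymin - text_h),   # label sits on top of the box
        "below": (xmin, ymax),            # label hangs under the box
        "left": (xmin - text_w, ymin),    # label ends at the left edge
        "right": (xmax, ymin),            # label starts at the right edge
    }
    if text_position not in origins:
        raise ValueError(f"unknown text_position: {text_position!r}")
    return origins[text_position]
```

Note that 'above' and 'left' can return negative coordinates for boxes near the image border, which is the case 'auto' would have to handle.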
In terms of API / functionality, what exactly would you have in mind?
For the colours, I think something like draw_label_kwargs could then be passed to draw.text. This exposes all the flexibility that PIL provides us.
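As a rough sketch of the idea, the keyword arguments could simply be forwarded to the text call. The function `draw_label` and the `background` parameter are hypothetical; `draw` is assumed to be a `PIL.ImageDraw.ImageDraw` object, whose `textbbox`, `rectangle`, and `text` methods are used below.

```python
def draw_label(draw, xy, text, background=None, **draw_label_kwargs):
    """Draw `text` at `xy`, optionally on top of a filled rectangle.

    Hypothetical helper. `draw` is assumed to be a PIL.ImageDraw.ImageDraw.
    `draw_label_kwargs` is forwarded unchanged to draw.text, so anything
    PIL accepts there (fill=..., font=..., ...) can be controlled.
    """
    if background is not None:
        # textbbox gives (left, top, right, bottom) of the rendered text;
        # use the same font as the text call so the sizes match.
        font = draw_label_kwargs.get("font")
        draw.rectangle(draw.textbbox(xy, text, font=font), fill=background)
    draw.text(xy, text, **draw_label_kwargs)
```

With this, black text on a background matching the box colour would be something like `draw_label(draw, (x, y), "cat", background="red", fill="black")`.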
For label positioning, there are a lot of things going on. There is the position relative to the box, and then whether the label goes outside or inside the box. I think we could just copy matplotlib's syntax for placement of the legend box. Effectively, a string defines the location ("upper right" or "center left") and a tuple defines a sort of offset to that (see bbox_to_anchor). This Stack Overflow answer explains it in detail: https://stackoverflow.com/a/43439132
That SGTM, thanks for the details. Provided that the licensing terms of matplotlib allow it, I hope there exists a piece of code that we can just copy/paste from matplotlib to get the label position based on the user-defined parameter and on the current bbox position. It would be a lot more complex if we had to implement all of that logic ourselves.
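A minimal sketch of that idea, loosely modelled on matplotlib's location strings rather than copied from it: a string picks a point on the box and a tuple shifts it. The name `anchor_point` and its parameters are hypothetical, and only the two-word forms plus "center" are handled. Image coordinates are assumed, so "upper" maps to the smaller y value.

```python
def anchor_point(box, loc="upper left", offset=(0, 0)):
    """Return the (x, y) point on `box` named by `loc`, shifted by `offset`.

    Hypothetical helper. box: (xmin, ymin, xmax, ymax) in image coordinates,
    where y grows downwards. loc: "<vertical> <horizontal>" with vertical in
    {upper, center, lower} and horizontal in {left, center, right}, or the
    single word "center".
    """
    xmin, ymin, xmax, ymax = box
    words = loc.split()
    if words == ["center"]:
        words = ["center", "center"]
    if len(words) != 2:
        raise ValueError(f"unsupported loc: {loc!r}")
    vertical, horizontal = words

    ys = {"upper": ymin, "center": (ymin + ymax) / 2, "lower": ymax}
    xs = {"left": xmin, "center": (xmin + xmax) / 2, "right": xmax}
    if vertical not in ys or horizontal not in xs:
        raise ValueError(f"unsupported loc: {loc!r}")

    return (xs[horizontal] + offset[0], ys[vertical] + offset[1])
```

Whether the label then goes inside or outside the box would be decided by how the text is aligned to this point, which is not covered here.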
That SGTM, thanks for the details.
Can you confirm that both (text colour/font/etc. and label positioning) look good? I just want to make sure that you're not only referring to the label positioning. If so, I will prepare a PR for text colour/etc. first and then another for the label positioning.
Provided that the licensing terms of matplotlib allow it, I hope there exists a piece of code that we can just copy/paste from matplotlib to get the label position based on the user-defined parameter and on the current bbox position. It would be a lot more complex if we had to implement all of that logic ourselves.
I'll take a look at the matplotlib license and logic, but we probably can't use it as is. At the very least, we need to handle the case where placing the label outside the region places the label outside the image. With matplotlib, the plot has margins but we do not. I'll experiment and see what I think is the most reasonable behaviour when I start coding, but I think it will be to place the label as close as possible to the desired position while ensuring that the text stays inside the image.
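One way to read "closest to the desired position while staying inside the image" is a simple per-axis clamp of the label origin. This is only a sketch under that assumption; `clamp_label` is a hypothetical name, and a label larger than the image is pinned to the top-left edge.

```python
def clamp_label(origin, text_size, image_size):
    """Move a label's top-left `origin` the minimum amount needed so that
    the whole label lies inside the image.

    Hypothetical helper. origin: (x, y); text_size and image_size:
    (width, height). If the label is larger than the image along an axis,
    it is pinned to 0 on that axis.
    """
    x, y = origin
    text_w, text_h = text_size
    image_w, image_h = image_size

    # Clamp each axis independently to [0, image extent - text extent].
    x = max(0, min(x, image_w - text_w))
    y = max(0, min(y, image_h - text_h))
    return (x, y)
```

For a box touching the top edge, a label requested above the box would end up at y = 0, overlapping the box rather than leaving the image.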
If so, I will prepare a PR for text colour/etc. first and then another for the label positioning.
Yes, that sounds good. Happy to consider better defaults for these. Thank you!