Implemented Todo (RGB image conversion)
Hey,
When I tried out the image segmentation pipeline, I got an error saying that RawImage#rgb isn't implemented yet and is still marked as a todo: https://github.com/ankane/informers/blob/master/lib/informers/utils/image.rb#L74-L80
irb(main):001> segmenter = Informers.pipeline("image-segmentation")
irb(main):002> segmenter.("image.png")
informers/lib/informers/utils/image.rb:79:in 'Informers::Utils::RawImage#rgb': not implemented yet (Informers::Todo)
I made an attempt at implementing it and getting it to work.
A:\informers\libs>irb -r 'ruby-vips'
irb(main):001> require_relative 'informers'
irb(main):002> segmenter = Informers.pipeline("image-segmentation")
=>No model specified. Using default model: "Xenova/detr-resnet-50-panoptic".
irb(main):003> segmenter.("cat-chonk.png")
=>[{label: "snow", score: 0.9971328205343051},
{label: "LABEL_184", score: 0.9864243987171203},
{label: "cat", score: 0.9961303007760267}]
To make sure the implementation actually segments properly, I also ran it on an image similar to the one used in the test case test_image_segmentation and compared my results against that test:
def test_image_segmentation
  segmenter = Informers.pipeline("image-segmentation")
  result = segmenter.("test/support/pipeline-cat-chonk.jpeg")
  assert_equal 3, result.size
  assert_equal "snow", result[0][:label]
  assert_in_delta 0.997, result[0][:score]
  assert_equal "LABEL_184", result[1][:label]
  assert_in_delta 0.993, result[1][:score]
  assert_equal "cat", result[2][:label]
  assert_in_delta 0.998, result[2][:score]
end
The labels match and the scores are almost identical, so I'd call it a success.
I also refactored the initial check in the method, because it only verified that @channels == 3. As a result, images that have 3 channels but are not sRGB (like LAB) were returned early without being converted.
The changes do the following:
- Convert to sRGB colorspace
- If the sRGB image has an alpha channel, flatten it
- Ensure it's 3 band: if it's 1 band (like grayscale), convert it to 3 band sRGB by replicating the channel
- Lastly, if it's 3 band but still not sRGB, try to set the interpretation
- If it's still not 3 band sRGB, throw an error
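
For reviewers, the steps above can be sketched roughly as follows. This is a minimal illustration, not the exact code in this PR: the helper name to_srgb_3band, the white flatten background, and the use of ArgumentError are assumptions made for the example. It expects a ruby-vips Vips::Image and only calls methods that ruby-vips provides (colourspace, has_alpha?, flatten, bandjoin, copy, bands, interpretation).

```ruby
# Hypothetical helper for illustration; not the code from this PR.
# `image` is expected to be a Vips::Image, so the caller needs ruby-vips
# loaded (require "vips") before passing an image in.
def to_srgb_3band(image)
  # 1. Convert to the sRGB colourspace unless the image is already tagged sRGB.
  image = image.colourspace(:srgb) unless image.interpretation == :srgb

  # 2. Remove the alpha channel by flattening onto a background
  #    (white is an assumption made for this sketch).
  image = image.flatten(background: [255, 255, 255]) if image.has_alpha?

  # 3. A single band (grayscale) is replicated into three identical bands.
  image = image.bandjoin([image, image]) if image.bands == 1

  # 4. Three bands that are not tagged sRGB: only relabel the interpretation.
  if image.bands == 3 && image.interpretation != :srgb
    image = image.copy(interpretation: :srgb)
  end

  # 5. Anything that is still not 3 band sRGB cannot be used as an RGB image.
  unless image.bands == 3 && image.interpretation == :srgb
    raise ArgumentError,
          "expected 3 band sRGB, got #{image.bands} band(s), #{image.interpretation}"
  end

  image
end
```

With ruby-vips loaded, it could be called as to_srgb_3band(Vips::Image.new_from_file("cat-chonk.png")). Checking the interpretation first, rather than only the band count, is what keeps a 3 band LAB image from being returned unconverted.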