Implemented Todo (RGB image conversion)

Open Beyarz opened this issue 6 months ago • 0 comments

Hey,

When I tried out the image segmentation pipeline, it raised an error saying that RawImage#rgb isn't implemented yet (it is still a todo): https://github.com/ankane/informers/blob/master/lib/informers/utils/image.rb#L74-L80

irb(main):001> segmenter = Informers.pipeline("image-segmentation")
irb(main):002> segmenter.("image.png")
informers/lib/informers/utils/image.rb:79:in 'Informers::Utils::RawImage#rgb': not implemented yet (Informers::Todo)

I made an attempt at implementing it and got it to work.

A:\informers\libs>irb -r 'ruby-vips'
irb(main):001> require_relative 'informers'
irb(main):002> segmenter = Informers.pipeline("image-segmentation")
=>No model specified. Using default model: "Xenova/detr-resnet-50-panoptic".

irb(main):003> segmenter.("cat-chonk.png")
=>[{label: "snow", score: 0.9971328205343051},
 {label: "LABEL_184", score: 0.9864243987171203},
 {label: "cat", score: 0.9961303007760267}]

I also looked for an image similar to the one used in the test_image_segmentation test case, to make sure that the implementation actually segments properly. For comparison, here is the test case:

  def test_image_segmentation
    segmenter = Informers.pipeline("image-segmentation")
    result = segmenter.("test/support/pipeline-cat-chonk.jpeg")
    assert_equal 3, result.size

    assert_equal "snow", result[0][:label]
    assert_in_delta 0.997, result[0][:score]
    assert_equal "LABEL_184", result[1][:label]
    assert_in_delta 0.993, result[1][:score]
    assert_equal "cat", result[2][:label]
    assert_in_delta 0.998, result[2][:score]
  end

The numbers are almost identical, so I'd call it a success.
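To put a number on "almost identical", here is a small plain-Ruby sketch that compares the scores from the irb session above with the values the test expects. The constant names are made up for illustration. Keep in mind that the irb run used cat-chonk.png while the test uses test/support/pipeline-cat-chonk.jpeg, so an exact match is not necessarily expected.

```ruby
# Scores copied from the irb output above (cat-chonk.png).
OBSERVED_SCORES = {
  "snow"      => 0.9971328205343051,
  "LABEL_184" => 0.9864243987171203,
  "cat"       => 0.9961303007760267
}.freeze

# Values asserted in test_image_segmentation.
EXPECTED_SCORES = {
  "snow"      => 0.997,
  "LABEL_184" => 0.993,
  "cat"       => 0.998
}.freeze

# Absolute difference per label (illustrative constant, not part of the gem).
SCORE_DIFFS = OBSERVED_SCORES.to_h do |label, score|
  [label, (score - EXPECTED_SCORES.fetch(label)).abs]
end

SCORE_DIFFS.each do |label, diff|
  puts format("%-10s diff=%.4f", label, diff)
end
```

The largest gap is for LABEL_184 (about 0.0066). Minitest's assert_in_delta uses a default delta of 0.001, so the comparison is only meaningful once the test suite itself is run against the test image.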

I also refactored the initial check in the method, because it only verified that @channels == 3. An image with 3 channels that is not sRGB (LAB, for example) would therefore be returned prematurely, without being converted.

The changes do the following:

  • Convert to sRGB colorspace
  • If the sRGB image has an alpha channel, flatten it
  • If the image has only 1 band (like grayscale), convert it to 3-band sRGB by replicating the channel
  • If it has 3 bands but is still not sRGB, try to set the interpretation
  • If it's still not 3-band sRGB, raise an error
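To illustrate what the flattening and channel replication steps do to the pixel data, here is a minimal plain-Ruby sketch that works on a single pixel. It is not the code from this change, which works on whole images through ruby-vips (presumably via operations such as colourspace, flatten and bandjoin), and it leaves out the colorspace conversion entirely. The helper names and the white background are assumptions made for the example.

```ruby
# A pixel is an array of 0..255 integers, one entry per band.

# Composite a pixel with a trailing alpha band over an opaque background.
# The white background is an assumption for this sketch.
def flatten_alpha(pixel, background = [255, 255, 255])
  *color, alpha = pixel
  a = alpha / 255.0
  color.zip(background).map { |c, bg| (c * a + bg * (1 - a)).round }
end

# Replicate a single grayscale band into three identical bands.
def gray_to_rgb(pixel)
  [pixel[0]] * 3
end

# Hypothetical helper mirroring the order of the steps listed above:
# drop alpha first, then expand grayscale, then fail if it is not 3 bands.
def to_three_bands(pixel)
  pixel = flatten_alpha(pixel) if [2, 4].include?(pixel.size)
  pixel = gray_to_rgb(pixel) if pixel.size == 1
  raise ArgumentError, "unsupported band count: #{pixel.size}" unless pixel.size == 3
  pixel
end

p to_three_bands([10, 20, 30, 255]) # opaque RGBA keeps its colors
p to_three_bands([128])             # grayscale is replicated
```

Flattening before replicating means a grayscale image with alpha (2 bands) goes through both steps, which is why the order in the list matters.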

Beyarz, Jun 24 '25 22:06