
(google-generativeai: 0.8.1) Sending a transparent PNG, but "gemini-pro" appears to convert it to JPG.

Open Pjumpod opened this issue 1 year ago • 14 comments

Description of the bug:

My code is

> # Fragment from inside a function; pilimg is assumed to be PIL.Image.
> if imageext.upper() == ".PNG":
>     print("Make blank")
>     rez_img = rez_img.convert("RGBA")
>     print(rez_img.mode)
>     resize_img_path = os.path.join(save_path, "rez_" + os.path.basename(img_path))
>     rez_img.save(resize_img_path)
>     # Note: this re-opens the original file (img_path), not the saved copy.
>     rez_img = pilimg.open(img_path)
>     # Prints the bound method itself, because it is not called.
>     print(getattr(rez_img, "get_format_mimetype", None))
> model_use = genai.GenerativeModel(model_name=model)
> try:
>     response = model_use.generate_content([system_prompt, rez_img], safety_settings=safety_settings)
>     response_text = str(response.text)
> except Exception as e:
>     response_text = str(e)


and here is the output:

> Make blank
> RGBA
> <bound method ImageFile.get_format_mimetype of <PIL.PngImagePlugin.PngImageFile image mode=RGBA size=3375x1894 at 0x1A0F83C6E80>>

According to this link, generate_content should upload the image as a PNG with its transparency preserved, as shown in #523. But when I asked the model to "describe the image", the output contained the phrase "on black background", which means the RGBA PNG was converted to RGB.

Actual vs expected behavior:

Expected: the image is uploaded as a PNG in RGBA mode. Actual: it is still treated as RGB.
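To rule out the file itself, one can check that the PNG on disk really declares an alpha channel before it is sent. The sketch below reads the color type from the PNG IHDR header using only the standard library. The helper names `png_color_type` and `has_alpha_channel` are made up for this illustration and are not part of google-generativeai or PIL.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_color_type(data: bytes) -> int:
    """Return the IHDR color type of PNG bytes (2 = RGB, 6 = RGBA)."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # After the signature: 4-byte chunk length, 4-byte chunk type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing IHDR chunk")
    # IHDR starts with width, height, bit depth, color type.
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return color_type


def has_alpha_channel(data: bytes) -> bool:
    # Color types 4 (gray + alpha) and 6 (RGBA) carry an alpha channel.
    # Palette PNGs with a tRNS chunk are not detected by this check.
    return png_color_type(data) in (4, 6)


# Hand-built headers for a 1x1 RGBA image and a 1x1 RGB image.
# Only the header is inspected, so pixel data and CRCs are omitted here.
prefix = PNG_SIGNATURE + struct.pack(">I4s", 13, b"IHDR")
rgba_header = prefix + struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)
rgb_header = prefix + struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)

print(has_alpha_channel(rgba_header))
print(has_alpha_channel(rgb_header))
```

For a real file, the first 26 bytes read in binary mode are enough for `has_alpha_channel`.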


Any other information you'd like to share?

google-generativeai 0.8.1

BytesIO may not be able to decode the alpha channel of an image. I attached this in the code review, and here is a reference from Stack Overflow.

Pjumpod avatar Sep 23 '24 15:09 Pjumpod

Hi @Pjumpod

I've tested the code and images you mentioned, and it works correctly, producing a light blue background. You can check it out in this gist. I don't believe the issue lies with BytesIO. The format parameter used when saving the PIL image ensures that transparency is preserved.

Thanks

manojssmk avatar Sep 24 '24 09:09 manojssmk
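The point about BytesIO and the `format` parameter can be checked locally without calling the API. This is a minimal sketch, assuming Pillow is installed; the tiny test image is made up for the illustration and is not one of the pictures from this issue.

```python
import io

from PIL import Image

# A 2x2 fully transparent RGBA image (red color values, alpha 0).
original = Image.new("RGBA", (2, 2), (255, 0, 0, 0))

# Serialize to PNG in memory; the PNG format supports an alpha channel.
buffer = io.BytesIO()
original.save(buffer, format="PNG")

# Decode the same bytes again and inspect the result.
buffer.seek(0)
decoded = Image.open(buffer)
decoded.load()

print(decoded.mode)              # mode of the decoded image
print(decoded.getpixel((0, 0)))  # RGBA value of the top-left pixel
```

If the decoded mode is still RGBA and the alpha value is still 0, the in-memory PNG round trip is not where the transparency is lost.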

@manojssmk the light blue shown in this issue is the background color of my app, not part of the actual picture.

The picture should not have any background (it is transparent).

Pjumpod avatar Sep 24 '24 10:09 Pjumpod

@manojssmk you might try your gist again with my test set of pictures: btc, A, F, R, T.

Pjumpod avatar Sep 24 '24 11:09 Pjumpod

@manojssmk you have to use a picture in RGBA mode with a blank background (the background should not have any color).

If you got a light blue background, that shows your answer is still wrong.

Pjumpod avatar Sep 25 '24 04:09 Pjumpod

Hi @Pjumpod

Yes, you're correct. The image with a blank background that was passed to the model is producing an incorrect output, showing the background as black. You can find the code in this gist.

Thanks

manojssmk avatar Sep 25 '24 06:09 manojssmk

> Hi @Pjumpod
>
> Yes, you're correct. The image with a blank background that was passed to the model is producing an incorrect output, showing the background as black. You can find the code in this gist.
>
> Thanks

@manojssmk @MarkDaoust this is great, now we are in sync. I think this could be fixed on the server side by handling the picture in RGBA mode according to its MIME type. Or can anything be fixed in the API on the client side? I am not sure.

Pjumpod avatar Sep 25 '24 07:09 Pjumpod

I haven't looked into this. But the behavior will be affected by: https://github.com/google-gemini/generative-ai-python/pull/570, that PR ensures that we don't process the images before sending them.

Try installing from main:

pip install git+https://github.com/google-gemini/generative-ai-python

But it's possible that the PR doesn't change anything: the API may handle the alpha channel by showing the model the picture over a black background. If the API isn't passing an actual alpha channel, there's not much I can do in the SDK.

MarkDaoust avatar Sep 25 '24 13:09 MarkDaoust

Testing a bit, I'm just not convinced that the model uses the alpha channel at all.

  • If I make an image totally transparent the model still describes it.
  • If I ask it why I can't see anything in the image pro says "I can see it, maybe there's something wrong with your display"
  • If I set different colors of transparent sections, the model reports the "correct" background color.

b/369593779

MarkDaoust avatar Sep 25 '24 14:09 MarkDaoust
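The third observation above, that the model reports the color stored under transparent pixels, matches what happens when the alpha channel is simply dropped. The sketch below shows this locally with Pillow. It is an illustration only, and whether the API backend does the same thing is an assumption, not something confirmed in this thread.

```python
from PIL import Image

# Fully transparent image whose hidden color values are pure blue.
transparent = Image.new("RGBA", (4, 4), (0, 0, 255, 0))

# Dropping the alpha channel exposes the stored color values unchanged.
no_alpha = transparent.convert("RGB")

print(no_alpha.mode)
print(no_alpha.getpixel((0, 0)))
```

If the transparent areas of a picture store black color values, dropping alpha this way would produce the "on black background" description reported in this issue.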

> Testing a bit, I'm just not convinced that the model uses the alpha channel at all.
>
>   • If I make an image totally transparent the model still describes it.
>   • If I ask it why I can't see anything in the image pro says "I can see it, maybe there's something wrong with your display"
>   • If I set different colors of transparent sections, the model reports the "correct" background color.

Do you have any ideas, or can Google help?

Pjumpod avatar Sep 25 '24 15:09 Pjumpod

I think this is happening in the API backend. I think there's nothing we can do from out here.

MarkDaoust avatar Sep 25 '24 16:09 MarkDaoust
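If the backend does ignore alpha, one possible client-side workaround is to flatten the picture onto a background color of your own choosing before sending it, so the model sees a known background instead of whatever is stored under the transparent pixels. This is a sketch under that assumption, not a fix confirmed in this thread. `flatten_on_background` is a hypothetical helper name, and white is an arbitrary default.

```python
from PIL import Image


def flatten_on_background(img, color=(255, 255, 255)):
    """Composite an image with transparency onto a solid background color."""
    rgba = img.convert("RGBA")
    background = Image.new("RGBA", rgba.size, color + (255,))
    # alpha_composite blends rgba over background using rgba's alpha channel.
    return Image.alpha_composite(background, rgba).convert("RGB")


# Example: left pixel is opaque red, right pixel is fully transparent.
sample = Image.new("RGBA", (2, 1), (0, 0, 0, 0))
sample.putpixel((0, 0), (255, 0, 0, 255))

flat = flatten_on_background(sample)
print(flat.mode, flat.getpixel((0, 0)), flat.getpixel((1, 0)))
```

The flattened image could then be passed in place of `rez_img` in the `generate_content` call from the original report. The transparency information itself is lost, so this only controls which background the model describes.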

Do you know how to report this bug to the backend team?

Pjumpod avatar Sep 26 '24 00:09 Pjumpod

@Pjumpod , I did this morning. The b/369593779 in my previous message was an internal bug reference.

MarkDaoust avatar Sep 26 '24 00:09 MarkDaoust

@MarkDaoust long time no see. Do you have any update from the API backend?

Pjumpod avatar Oct 16 '24 03:10 Pjumpod

@MarkDaoust, I have bumped into a similar situation. Is this issue fixed or is there any workaround?

CharanTeja1005 avatar Mar 06 '25 04:03 CharanTeja1005