Added Gemini Multimodal (Text+Image) Tutorial - Issue #549

Overview

This PR adds a comprehensive tutorial notebook demonstrating how to build a multimodal chatbot using Google's Gemini API. The notebook provides step-by-step guidance on building a solution that can process both text and images simultaneously.

Features Covered

Natural language text processing
Image analysis capabilities
Combined text+image input handling (true multimodal interaction)
Conversation history management with visual context
Response formatting with markdown
Real-time response streaming
Parameter customization for different use cases
Basic error handling and rate limiting

Technical Requirements

Google API key for Gemini
Python 3.9+
Libraries: google-generativeai, pillow, IPython, pandas, matplotlib

Why This Matters

This tutorial fills a documentation gap: it shows how to combine text and vision capabilities in a single application, manage conversation history that includes images, and tune generation parameters for different response types.

Testing Done

Verified all code samples execute properly
Tested with various image types and prompt combinations
Ensured compatibility with current API version

Colab Link

Run in Google Colab

Mar 11 '25 07:03 Bhavesh2k4

View / edit / reply to this conversation on ReviewNB

Giom-V commented on 2025-03-24T15:22:43Z ----------------------------------------------------------------

Line #16.                    'Input Types': model.input_token_limit if hasattr(model, 'input_token_limit') else 'Unknown',

I think "Input Types" should be "Input limit", since the value shown is the model's input token limit.
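
A minimal sketch of the relabelled row, assuming the notebook builds one dictionary per model. `describe_model` and the `SimpleNamespace` stand-ins are hypothetical and exist only so the snippet runs without an API key; `input_token_limit` is the attribute from the quoted line.

```python
from types import SimpleNamespace


def describe_model(model):
    """Return a summary row for one model object.

    getattr with a default behaves like the hasattr/else pattern
    in the quoted line, just shorter.
    """
    return {
        "Name": getattr(model, "name", "Unknown"),
        # The value is a token count, so the label says "limit", not "types".
        "Input limit": getattr(model, "input_token_limit", "Unknown"),
    }


# Hypothetical stand-ins for the model objects returned by the API.
with_limit = SimpleNamespace(name="models/example-a", input_token_limit=1000)
without_limit = SimpleNamespace(name="models/example-b")

print(describe_model(with_limit))
print(describe_model(without_limit))
```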

Mar 24 '25 15:03 review-notebook-app[bot]

Giom-V commented on 2025-03-24T15:22:44Z ----------------------------------------------------------------

Line #2.    MODEL_NAME = "gemini-1.5-flash"

Can you use a selector like this?

MODEL_NAME="gemini-2.0-flash" # @param ["gemini-2.0-flash-lite","gemini-2.0-flash","gemini-2.0-pro-exp-02-05"] {"allow-input":true, isTemplate: true}

As it would make the notebook easier to maintain in the future.

Maybe also rename it to "DEFAULT_MODEL_NAME", as I saw that you can override it when initializing GeminiMultimodalChatBot.
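
A rough sketch of what the rename implies, under the assumption that the class accepts an optional model name. `ChatBotConfig` is a hypothetical stand-in, not the notebook's GeminiMultimodalChatBot, whose real constructor is not shown in this thread.

```python
# In Colab, the @param selector above would set this value.
DEFAULT_MODEL_NAME = "gemini-2.0-flash"


class ChatBotConfig:
    """Hypothetical stand-in for the notebook's chatbot class."""

    def __init__(self, model_name=None):
        # Fall back to the module-level default only when no name is given.
        self.model_name = model_name or DEFAULT_MODEL_NAME


default_bot = ChatBotConfig()
custom_bot = ChatBotConfig(model_name="gemini-2.0-flash-lite")
print(default_bot.model_name, custom_bot.model_name)
```

The "DEFAULT_" prefix signals that the constant is a fallback rather than the one model the class always uses.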

Mar 24 '25 15:03 review-notebook-app[bot]

Giom-V commented on 2025-03-24T15:22:45Z ----------------------------------------------------------------

Line #17.            temperature: float = 0.7,

It might be interesting to explain why you chose those values. I think most are the default ones, but I don't think it's the case for the temperature at least.

Mar 24 '25 15:03 review-notebook-app[bot]

Giom-V commented on 2025-03-24T15:22:46Z ----------------------------------------------------------------

Line #63.                    {"role": "user", "parts": ["I need you to follow these instructions: " + system_prompt]},

Any reason why you are not using the system instruction field from the config? Here's an example for chat.
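
The difference between the two approaches can be sketched without calling the API. This is an illustration under assumptions, not the notebook's code: both helper functions are hypothetical, and the `system_instruction` key mirrors the argument that `GenerativeModel` accepts in the google-generativeai SDK. The exact field name depends on the SDK in use.

```python
def history_with_injected_prompt(system_prompt):
    """Pattern from the quoted line: the instructions become a fake user turn."""
    return [
        {
            "role": "user",
            "parts": ["I need you to follow these instructions: " + system_prompt],
        },
    ]


def model_kwargs_with_system_instruction(model_name, system_prompt):
    """Suggested pattern: the instructions live in the model configuration.

    Assumption: these keyword arguments would be passed to the SDK's model
    constructor, leaving the chat history free of synthetic turns.
    """
    return {"model_name": model_name, "system_instruction": system_prompt}


prompt = "You are a helpful, friendly assistant."
injected = history_with_injected_prompt(prompt)
kwargs = model_kwargs_with_system_instruction("gemini-2.0-flash", prompt)

print(len(injected))  # the history holds a turn before the user has said anything
print(sorted(kwargs))
```

With the second pattern, the stored conversation contains only real turns, which keeps history display and trimming simpler.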

Mar 24 '25 15:03 review-notebook-app[bot]

Giom-V commented on 2025-03-24T15:22:47Z ----------------------------------------------------------------

Line #3.    You are a helpful, friendly assistant. When responding to questions:

For readability, can you add some indentation to the system prompt? Like this:

system_prompt = """
  You are a helpful, friendly assistant. When responding to questions:
  - If you're unsure, be honest about your limitations
  - Provide detailed and accurate information
  - For image analysis, describe what you see in detail
  - Use markdown formatting to make responses easy to read
  - When discussing code, include well-commented examples
"""
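
One caveat with indenting a triple-quoted string is that the leading spaces become part of the prompt text. A small sketch using the standard library's `textwrap.dedent` shows one way to keep the source readable while sending unindented text. Whether the extra whitespace matters to the model is not something this thread establishes.

```python
import textwrap

# The prompt is indented in the source for readability only;
# dedent removes the common leading whitespace and strip drops
# the blank first and last lines.
system_prompt = textwrap.dedent("""
    You are a helpful, friendly assistant. When responding to questions:
    - If you're unsure, be honest about your limitations
    - Provide detailed and accurate information
""").strip()

print(system_prompt)
```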

Mar 24 '25 15:03 review-notebook-app[bot]

Giom-V commented on 2025-03-24T15:22:47Z ----------------------------------------------------------------

Can you add slightly more text to the API ref, related examples, and "continue your discovery" sections? At the very least, add bullet points.

Mar 24 '25 15:03 review-notebook-app[bot]

@Bhavesh2k4 Thanks for the great submission. I am finally back from my sick leave and found the time to review it.

I just added a couple of minor comments to make the notebook easier to understand and to maintain.

Can you also check the lint and format failures and fix them? The format failure is likely because you haven't run the formatting script, and the lint failure is because a "we" needs to be changed into a "you". You likely also have to update the README.md to add a link to your new notebook.

Thanks again!

Mar 24 '25 15:03 Giom-V

One last thing, I think it should be moved to the examples/ folder.

Mar 24 '25 15:03 Giom-V

Giom-V commented on 2025-03-24T16:01:28Z ----------------------------------------------------------------

I think you need to move that button down to replace the "run in colab" one as I did in #512 (make sure you use the same size).

Mar 24 '25 16:03 review-notebook-app[bot]

Hi @Giom-V ,

Thank you so much for reviewing my notebook, especially after returning from sick leave. I appreciate your detailed feedback and the effort you've put into helping me improve the submission.

I apologize for the delayed response. I was in the middle of my university internals, which account for 25% of my grade points, so I couldn't address the comments immediately.

I'll work on making the changes you suggested:

Fixing the formatting
Making the changes in the notebook file
Updating the README.md with the new link

I noticed the linting failed primarily due to a URL issue. I'm a bit confused about the notebook loading error. Since I forked the original repo, I'm wondering if I need to modify the GitHub link to point to my forked repository. Even when I tried changing the link, the linting still failed.

This is my first contribution to the Gemini notebook files, so I'm eager to get it right. Could you provide some guidance on resolving the URL/loading issue?

Thanks again for your help and patience!

Mar 27 '25 02:03 Bhavesh2k4

@Bhavesh2k4 Don't worry about the link (especially since you need to move the notebook to the examples folder, which will change its URL). Worst case, I'll fix it myself.

Mar 31 '25 09:03 Giom-V

Added text image multimodal bot - ISSUE 549
