
feat: gemini tool calling support

Open ssonal opened this issue 1 year ago • 16 comments

  • added GEMINI_TOOLS mode for data extraction through function calling
  • added compatibility for streaming, partials, iterables
  • updated tests
  • updated docs

:rocket: This description was created by Ellipsis for commit a6c95aae65f36fac9b8ccc283e275120bae7a247

Summary:

Added GEMINI_TOOLS mode to the instructor library for structured data extraction, updated documentation and tests, and added the jsonref dependency.

Key points:

  • Introduced GEMINI_TOOLS mode for structured data extraction via function calling
  • Updated README.md and docs/concepts/patching.md for new mode
  • Modified Python modules to support GEMINI_TOOLS mode
  • Updated tests to include new mode
  • Added jsonref dependency for GEMINI_TOOLS mode
  • Ensured compatibility with streaming, partials, and iterables

Generated with :heart: by ellipsis.dev

ssonal avatar May 31 '24 18:05 ssonal

Is GEMINI_TOOLS mode similar to the new v1.5.3 that was released today with the response_schema specification?

tungalbert99 avatar May 31 '24 20:05 tungalbert99

fuck, I think I merged the Vertex AI PR and fucked up this one. Give me some time and I'll try to resolve it.

jxnl avatar May 31 '24 22:05 jxnl

Hey, any updates on this? We are getting a lot of JSON validation errors with Gemini, which is preventing us from going into production with it, so we would absolutely love to get tools support. Would you be able to give an approximate date for this? @jxnl

gokturkDev avatar Jun 23 '24 18:06 gokturkDev

@ivanleomk can you take ownership of this? Test it locally and make sure everything works?

jxnl avatar Jun 30 '24 18:06 jxnl

@jxnl yep I can do it.

ivanleomk avatar Jun 30 '24 23:06 ivanleomk

@ssonal I can't seem to run this code when I check out your PR.

import instructor
import google.generativeai as genai
from pydantic import BaseModel, field_validator

client = instructor.from_gemini(
    genai.GenerativeModel(), mode=instructor.Mode.GEMINI_TOOLS
)


class UserExtractValidated(BaseModel):
    name: str
    age: int

    @field_validator("name")
    @classmethod
    def validate_name(cls, v: str) -> str:
        if v.upper() != v:
            raise ValueError(
                "Name should be uppercase, make sure to use the `uppercase` version of the name"
            )
        return v

model = client.chat.completions.create(
    response_model=UserExtractValidated,
    strict=False,
    messages=[
        {"role": "user", "content": "Extract jason is 25 years old"},
    ],
)

print(model.model_dump_json(indent=2))

I get the following error

  File "/Users/ivanleo/Documents/coding/instructor/test.py", line 28, in <module>
    model = client.chat.completions.create(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ivanleo/Documents/coding/instructor/instructor/client.py", line 91, in create
    return self.create_fn(
           ^^^^^^^^^^^^^^^
  File "/Users/ivanleo/Documents/coding/instructor/instructor/patch.py", line 140, in new_create_sync
    response_model, new_kwargs = handle_response_model(
                                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ivanleo/Documents/coding/instructor/instructor/process_response.py", line 443, in handle_response_model
    new_kwargs = update_gemini_kwargs(new_kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ivanleo/Documents/coding/instructor/instructor/utils.py", line 307, in update_gemini_kwargs
    val = kwargs["generation_config"].pop(k, None)
          ~~~~~~^^^^^^^^^^^^^^^^^^^^^
KeyError: 'generation_config'

Any idea what's up with this? Also, have you tried the native tool calling from Gemini (https://ai.google.dev/gemini-api/docs/function-calling)? Wondering how it stacks up against your specific implementation here.
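For context on the traceback above: `update_gemini_kwargs` pops keys out of `generation_config` without first checking that the key exists. A minimal defensive sketch (hypothetical, not instructor's actual code; the `max_tokens` to `max_output_tokens` rename is an assumed example of one remapped key) could look like:

```python
# Hypothetical sketch, NOT instructor's real implementation: illustrates
# guarding against a missing "generation_config" key before remapping
# OpenAI-style parameter names to Gemini-style ones.
OPENAI_TO_GEMINI = {
    "max_tokens": "max_output_tokens",  # assumed example of a renamed key
}


def update_gemini_kwargs(kwargs: dict) -> dict:
    # setdefault avoids the KeyError when the caller passed no generation_config
    config = kwargs.setdefault("generation_config", {})
    for openai_key, gemini_key in OPENAI_TO_GEMINI.items():
        if openai_key in config:
            config[gemini_key] = config.pop(openai_key)
    return kwargs
```

With a guard like this, a call that omits generation_config simply gets an empty config instead of raising.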

ivanleomk avatar Jul 09 '24 14:07 ivanleomk

@ivanleomk my bad, looks like I missed a case. Updated & should work now.

ssonal avatar Jul 09 '24 14:07 ssonal

Not super familiar with Gemini; is it possible to show an example of how to set up GenerationConfig or how it's typically used?

Want to make sure it works in both cases.

ivanleomk avatar Jul 09 '24 14:07 ivanleomk

Not super familiar with Gemini; is it possible to show an example of how to set up GenerationConfig or how it's typically used?

Want to make sure it works in both cases.

It essentially lets the user set parameters like temperature; see https://ai.google.dev/api/python/google/generativeai/types/GenerationConfig.

Here's an example:

from typing import Literal

import google.generativeai as genai
import instructor
from pydantic import BaseModel


class SinglePrediction(BaseModel):
    """Correct class label for the given text"""

    class_label: Literal["spam", "not_spam"]


data = ("send us money", "spam")

client = instructor.from_gemini(
    genai.GenerativeModel(model_name="models/gemini-1.5-pro-latest"),
    mode=instructor.Mode.GEMINI_TOOLS,
)

input, expected = data
resp = client.chat.completions.create(
    response_model=SinglePrediction,
    strict=False,
    messages=[
        {
            "role": "user",
            "content": f"Classify the following text: {input}",
        },
    ],
    generation_config={
        "temperature": 0,
        "max_tokens": 200,
    },
)
assert resp.class_label == expected

ssonal avatar Jul 09 '24 14:07 ssonal

Any idea what's up with this? Also, have you tried the native tool calling from Gemini (https://ai.google.dev/gemini-api/docs/function-calling)? Wondering how it stacks up against your specific implementation here.

@ivanleomk this is the native Gemini tool calling implementation; see https://github.com/jxnl/instructor/blob/2a34d08314f902a765a21b95284404e4dc0d2636/instructor/process_response.py#L437-L443. The rest of this PR is config and response handling.

ssonal avatar Jul 09 '24 17:07 ssonal

@ssonal how much work would it take to implement function calling using this instead?

https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-gemini-using-openai-library

Wondering if it might be easier to keep the libs consistent, especially since we are now doing this weird prompt injection in the PR:

message = dedent(
    """
    As a genius expert, your task is to understand the content and provide arguments to the functions provided. Make sure to provide the right function name and an openAPI compatible response!
    """
)
# check that the first message is a system message
# if it is not, add a system message to the beginning
if new_kwargs["messages"][0]["role"] != "system":
    new_kwargs["messages"].insert(
        0,
        {
            "role": "system",
            "content": message,
        },
    )

ivanleomk avatar Jul 13 '24 14:07 ivanleomk

@ssonal how much work would it take to implement function calling using this instead?

https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-gemini-using-openai-library

Wondering if it might be easier to keep the libs consistent, especially since we are now doing this weird prompt injection in the PR.


@ivanleomk unfortunately, OpenAI library compatibility is only available for Google Vertex AI, not for the Gemini API, as of today. As for the system prompt, I didn't want to overwrite a system prompt the user may have included, but maybe this snippet could be more robust. Should I approach this differently?

ssonal avatar Jul 13 '24 17:07 ssonal

@ssonal Hmm have you tried not including a system prompt itself? I think that might not be a bad experiment to try.

Also for some reason when I run this snippet

import instructor
import google.generativeai as genai
from pydantic import BaseModel, field_validator
import logfire

logfire.configure()

client = instructor.from_gemini(
    genai.GenerativeModel(), mode=instructor.Mode.GEMINI_TOOLS
)


class UserExtractValidated(BaseModel):
    name: str
    age: int

    @field_validator("name")
    @classmethod
    def validate_name(cls, v: str) -> str:
        if v.upper() != v:
            raise ValueError(
                "Name should be uppercase, make sure to use the `uppercase` version of the name"
            )
        return v


model = client.chat.completions.create(
    response_model=UserExtractValidated,
    strict=False,
    messages=[
        {"role": "user", "content": "Extract jason is 25 years old"},
    ],
)

print(model.model_dump_json(indent=2))

I get an error that the tool name does not match. Gemini seems to be returning a function call named `run` for some reason.

ivanleomk avatar Jul 14 '24 03:07 ivanleomk

@ssonal Hmm have you tried not including a system prompt itself? I think that might not be a bad experiment to try.

@ivanleomk Sure, let me make this change and report back.

I get an error that the tool name does not match. Gemini seems to be returning a function call named `run` for some reason.

Try the 1.5-pro model; 1.5-flash struggles with validations and retries. Also, the default model is gemini-pro, which is much further behind in terms of performance.

client = instructor.from_gemini(
    client=genai.GenerativeModel(
        model_name="models/gemini-1.5-pro",
    ),
    mode=instructor.Mode.GEMINI_TOOLS,
)

ssonal avatar Jul 14 '24 05:07 ssonal

@ssonal Hmm have you tried not including a system prompt itself? I think that might not be a bad experiment to try.

@ivanleomk updated with this change. All good, tests pass.

ssonal avatar Jul 17 '24 17:07 ssonal

What's left for this PR to be merged? I see that the last activity was 3 weeks ago.

krish-bell avatar Aug 06 '24 18:08 krish-bell

this likely needs a rebase and some more tests

jxnl avatar Aug 25 '24 22:08 jxnl

Will look at this PR later in the week and update the code.

ivanleomk avatar Aug 26 '24 02:08 ivanleomk

@ssonal do you know of a way to extract the value of an int from a protobuf without it becoming a float? I'm a bit worried about turning off strict parsing for all our tests because of this one change.

Also, in terms of tool_config, I think it makes sense to force a function call by passing in the allowed_function_names parameter:

new_kwargs["tool_config"] = {
    "function_calling_config": {
        "mode": "ANY",
        "allowed_function_names": [response_model.__name__],
    },
}
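On the int-vs-float question above: protobuf's Struct stores every number as a double, so an integer like 25 comes back as 25.0. One workaround is a small post-processing helper (a hand-rolled sketch; `coerce_integral_floats` is a hypothetical name, not part of instructor) that converts whole-number floats back to ints before validation:

```python
# Hypothetical helper: protobuf Struct represents all numbers as doubles, so
# an int like 25 decodes as 25.0. This walks the decoded value and converts
# floats that are whole numbers back to ints before Pydantic validation.
def coerce_integral_floats(value):
    if isinstance(value, float) and value.is_integer():
        return int(value)
    if isinstance(value, dict):
        return {k: coerce_integral_floats(v) for k, v in value.items()}
    if isinstance(value, list):
        return [coerce_integral_floats(v) for v in value]
    return value


print(coerce_integral_floats({"name": "JASON", "age": 25.0, "score": 0.5}))
# {'name': 'JASON', 'age': 25, 'score': 0.5}
```

The trade-off: a field genuinely typed as float but holding a whole value (e.g. 2.0) would also be converted, which is why relaxing strict parsing and letting Pydantic coerce 25.0 into an int field may still be the simpler route.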

ivanleomk avatar Aug 27 '24 07:08 ivanleomk

I ran the tests for the new GEMINI_TOOLS mode with pro and flash locally and this is the result:

tests/llm/test_gemini/test_modes.py ........                                                    [  7%]
tests/llm/test_gemini/test_patch.py ........                                                    [ 14%]
tests/llm/test_gemini/test_retries.py ........                                                  [ 22%]
tests/llm/test_gemini/test_simple_types.py ...                                                  [ 25%]
tests/llm/test_gemini/test_stream.py ............                                               [ 36%]
tests/llm/test_gemini/evals/test_classification_enums.py ....................                   [ 55%]
tests/llm/test_gemini/evals/test_classification_literals.py ....................                [ 73%]
tests/llm/test_gemini/evals/test_entities.py ....                                               [ 77%]
tests/llm/test_gemini/evals/test_extract_users.py ............                                  [ 88%]
tests/llm/test_gemini/evals/test_sentiment_analysis.py ............                             [100%]

ivanleomk avatar Aug 31 '24 04:08 ivanleomk

Ran the tests once more:

VertexAI

tests/llm/test_vertexai/test_message_parser.py ....                                             [ 16%]
tests/llm/test_vertexai/test_modes.py ......                                                    [ 41%]
tests/llm/test_vertexai/test_retries.py ....                                                    [ 58%]
tests/llm/test_vertexai/test_simple_types.py ......                                             [ 83%]
tests/llm/test_vertexai/test_stream.py ....                                                     [100%]

Gemini

tests/llm/test_gemini/test_modes.py ........                                                    [  7%]
tests/llm/test_gemini/test_patch.py ........                                                    [ 14%]
tests/llm/test_gemini/test_retries.py ........                                                  [ 22%]
tests/llm/test_gemini/test_simple_types.py ...                                                  [ 25%]
tests/llm/test_gemini/test_stream.py ............                                               [ 36%]
tests/llm/test_gemini/evals/test_classification_enums.py ....................                   [ 55%]
tests/llm/test_gemini/evals/test_classification_literals.py ....................                [ 73%]
tests/llm/test_gemini/evals/test_entities.py ....                                               [ 77%]
tests/llm/test_gemini/evals/test_extract_users.py ............                                  [ 88%]
tests/llm/test_gemini/evals/test_sentiment_analysis.py ............                             [100%]

ivanleomk avatar Aug 31 '24 06:08 ivanleomk