feat: gemini tool calling support
- added `GEMINI_TOOLS` mode for data extraction through function calling
- added compatibility for streaming, partials, iterables
- updated tests
- updated docs
:rocket: This description was created by Ellipsis for commit a6c95aae65f36fac9b8ccc283e275120bae7a247
Summary:
Added `GEMINI_TOOLS` mode to the instructor library for structured data extraction, updated documentation and tests, and added the `jsonref` dependency.
Key points:
- Introduced `GEMINI_TOOLS` mode for structured data extraction via function calling
- Updated `README.md` and `docs/concepts/patching.md` for the new mode
- Modified Python modules to support `GEMINI_TOOLS` mode
- Updated tests to include the new mode
- Added `jsonref` dependency for `GEMINI_TOOLS` mode
- Ensured compatibility with streaming, partials, and iterables (see the sketch below)
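A minimal sketch of the streaming/partials usage this enables (the model name and prompt are illustrative; assumes instructor's existing `create_partial` API):

```python
import instructor
import google.generativeai as genai
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_gemini(
    genai.GenerativeModel(model_name="models/gemini-1.5-pro-latest"),
    mode=instructor.Mode.GEMINI_TOOLS,
)

# Stream partially-populated User objects as fields arrive.
for partial_user in client.chat.completions.create_partial(
    response_model=User,
    messages=[{"role": "user", "content": "Extract: jason is 25 years old"}],
):
    print(partial_user)
```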
Generated with :heart: by ellipsis.dev
Is GEMINI_TOOLS mode similar to the new v1.5.3 that was released today with the response_schema specification?
fuck, I think I merged the Vertex AI PR and messed this one up. Give me some time and I'll try to resolve it.
Hey, any updates on this? We are getting a lot of JSON validation errors with Gemini, which is preventing us from going into production with it, so we would absolutely love to get tools support. Would you be able to give an approximate date for this? @jxnl
@ivanleomk can you take ownership of this? Test it locally and make sure everything works?
@jxnl yep I can do it.
@ssonal I can't seem to run this code when I checkout your PR.
```python
import instructor
import google.generativeai as genai
from pydantic import BaseModel, field_validator

client = instructor.from_gemini(
    genai.GenerativeModel(), mode=instructor.Mode.GEMINI_TOOLS
)

class UserExtractValidated(BaseModel):
    name: str
    age: int

    @field_validator("name")
    @classmethod
    def validate_name(cls, v: str) -> str:
        if v.upper() != v:
            raise ValueError(
                "Name should be uppercase, make sure to use the `uppercase` version of the name"
            )
        return v

model = client.chat.completions.create(
    response_model=UserExtractValidated,
    strict=False,
    messages=[
        {"role": "user", "content": "Extract jason is 25 years old"},
    ],
)

print(model.model_dump_json(indent=2))
```
I get the following error
File "/Users/ivanleo/Documents/coding/instructor/test.py", line 28, in <module>
model = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ivanleo/Documents/coding/instructor/instructor/client.py", line 91, in create
return self.create_fn(
^^^^^^^^^^^^^^^
File "/Users/ivanleo/Documents/coding/instructor/instructor/patch.py", line 140, in new_create_sync
response_model, new_kwargs = handle_response_model(
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ivanleo/Documents/coding/instructor/instructor/process_response.py", line 443, in handle_response_model
new_kwargs = update_gemini_kwargs(new_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ivanleo/Documents/coding/instructor/instructor/utils.py", line 307, in update_gemini_kwargs
val = kwargs["generation_config"].pop(k, None)
~~~~~~^^^^^^^^^^^^^^^^^^^^^
KeyError: 'generation_config'
Any idea what's up with this? Also, have you tried the native tool calling from Gemini (https://ai.google.dev/gemini-api/docs/function-calling)? Wondering how it stacks up against your specific implementation here.
@ivanleomk my bad, looks like I missed a case. Updated & should work now.
Not super familiar with Gemini; possible to show an example of how to set up `GenerationConfig` or how it's typically used?
Want to make sure it works in both cases.
> Not super familiar with Gemini; possible to show an example of how to set up `GenerationConfig` or how it's typically used?
> Want to make sure it works in both cases.
It essentially allows the user to set parameters like temperature, etc.: https://ai.google.dev/api/python/google/generativeai/types/GenerationConfig
Here's an example:

```python
import instructor
import google.generativeai as genai
from typing import Literal
from pydantic import BaseModel

class SinglePrediction(BaseModel):
    """
    Correct class label for the given text
    """

    class_label: Literal["spam", "not_spam"]

data = ("send us money", "spam")

client = instructor.from_gemini(
    genai.GenerativeModel(model_name="models/gemini-1.5-pro-latest"),
    mode=instructor.Mode.GEMINI_TOOLS,
)

input, expected = data
resp = client.chat.completions.create(
    response_model=SinglePrediction,
    strict=False,
    messages=[
        {
            "role": "user",
            "content": f"Classify the following text: {input}",
        },
    ],
    # OpenAI-style kwargs; instructor remaps these for Gemini
    generation_config={
        "temperature": 0,
        "max_tokens": 200,
    },
)

assert resp.class_label == expected
```
> Any idea what's up with this? Also, have you tried the native tool calling from Gemini (https://ai.google.dev/gemini-api/docs/function-calling)? Wondering how it stacks up against your specific implementation here.
@ivanleomk this is the native Gemini tool calling implementation. See https://github.com/jxnl/instructor/blob/2a34d08314f902a765a21b95284404e4dc0d2636/instructor/process_response.py#L437-L443. The rest of this PR is config and response handling related.
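For reference, raw-SDK function calling (which this PR wraps) looks roughly like this; the model name and function are illustrative, not from this PR:

```python
import google.generativeai as genai


def extract_user(name: str, age: int) -> dict:
    """Record a user's name and age."""
    return {"name": name, "age": age}


# The SDK builds a function declaration from the callable's signature.
model = genai.GenerativeModel(
    model_name="models/gemini-1.5-pro-latest",
    tools=[extract_user],
)
response = model.generate_content("Extract: jason is 25 years old")

# The reply contains a function-call part rather than plain text.
part = response.candidates[0].content.parts[0]
print(part.function_call.name, dict(part.function_call.args))
```

The PR builds on this same mechanism and then validates the returned arguments against the pydantic response model.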
@ssonal how much work would it take to implement function calling using this instead?
https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-gemini-using-openai-library
Wondering if it might be easier to keep the libraries consistent, especially since we're now doing this weird prompt injection in the PR:
```python
message = dedent(
    f"""
    As a genius expert, your task is to understand the content and provide arguments to the functions provided. Make sure to provide the right function name and an openAPI compatible response!
    """
)

# check that the first message is a system message
# if it is not, add a system message to the beginning
if new_kwargs["messages"][0]["role"] != "system":
    new_kwargs["messages"].insert(
        0,
        {
            "role": "system",
            "content": message,
        },
    )
```
> @ssonal how much work would it take to implement function calling using this instead?
> https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-gemini-using-openai-library
> Wondering if it might be easier to keep the libraries consistent, especially since we're now doing this weird prompt injection in the PR: […]
@ivanleomk unfortunately, OpenAI library compatibility is only available for Google Vertex AI, not for the Google Cloud AI platform, as of today. With regards to the system prompt, I didn't want to overwrite system prompts that the user may have included, but maybe this snippet could be more robust. Should I approach this differently?
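A possibly more robust variant (a hypothetical sketch building on the snippet quoted above, with `new_kwargs` and `message` as in the PR) would append to an existing system prompt instead of only inserting when none exists:

```python
# If the user already supplied a system message, extend it rather than
# overwrite it; otherwise insert the instruction as a new system message.
if new_kwargs["messages"][0]["role"] == "system":
    new_kwargs["messages"][0]["content"] += "\n\n" + message
else:
    new_kwargs["messages"].insert(
        0,
        {"role": "system", "content": message},
    )
```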
@ssonal Hmm have you tried not including a system prompt itself? I think that might not be a bad experiment to try.
Also, for some reason, when I run this snippet:
```python
import instructor
import google.generativeai as genai
from pydantic import BaseModel, field_validator
import logfire

logfire.configure()

client = instructor.from_gemini(
    genai.GenerativeModel(), mode=instructor.Mode.GEMINI_TOOLS
)

class UserExtractValidated(BaseModel):
    name: str
    age: int

    @field_validator("name")
    @classmethod
    def validate_name(cls, v: str) -> str:
        if v.upper() != v:
            raise ValueError(
                "Name should be uppercase, make sure to use the `uppercase` version of the name"
            )
        return v

model = client.chat.completions.create(
    response_model=UserExtractValidated,
    strict=False,
    messages=[
        {"role": "user", "content": "Extract jason is 25 years old"},
    ],
)

print(model.model_dump_json(indent=2))
```
I get the error that the Tool name does not match. Gemini seems to be returning a function call called `run` for some reason.
> @ssonal Hmm have you tried not including a system prompt itself? I think that might not be a bad experiment to try.
@ivanleomk Sure, let me make this change and report back.
> I get the error that the Tool name does not match. Gemini seems to be returning a function call called `run` for some reason.
Try with the 1.5-pro model; 1.5-flash struggles with validations and retries. Also, the default model is gemini-pro, which is even further behind in terms of performance.
```python
client = instructor.from_gemini(
    client=genai.GenerativeModel(
        model_name="models/gemini-1.5-pro",
    ),
    mode=instructor.Mode.GEMINI_TOOLS,
)
```
> @ssonal Hmm have you tried not including a system prompt itself? I think that might not be a bad experiment to try.
@ivanleomk updated with this change. All good - tests pass.
What's left for this PR to be merged? I see that the last activity was 3 weeks ago.
this likely needs a rebase and some more tests
Will look at this PR later in the week and update the code.
@ssonal do you know of a way to extract the value of an int from a protobuf without it becoming a float? I'm a bit worried about turning off strict parsing for all our tests because of this one change.
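For context, protobuf's Struct/Value types encode every number as a double, so ints always decode as floats. One hypothetical workaround (illustrative, not in this PR) is to coerce integral floats back to int on the decoded arguments before validation:

```python
from typing import Any


def coerce_integral_floats(obj: Any) -> Any:
    """Recursively turn values like 25.0 back into 25, leaving other types alone."""
    if isinstance(obj, float) and obj.is_integer():
        return int(obj)
    if isinstance(obj, dict):
        return {k: coerce_integral_floats(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [coerce_integral_floats(v) for v in obj]
    return obj


# A decoded function_call.args dict: the int 25 came back as 25.0.
print(coerce_integral_floats({"name": "JASON", "age": 25.0}))
# {'name': 'JASON', 'age': 25}
```

Genuine float fields should still validate afterwards, since pydantic generally accepts ints for float-typed fields.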
Also, in terms of the `tool_config`, I think it makes sense to force a function call by passing in the `allowed_function_names` parameter:
```python
new_kwargs["tool_config"] = {
    "function_calling_config": {
        "mode": "ANY",
        "allowed_function_names": [response_model.__name__],
    },
}
```
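Forcing `ANY` mode with a single allowed name should also rule out the stray `run` tool name seen earlier, since the model can then only pick from `allowed_function_names`.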
I ran tests on the new `GEMINI_TOOLS` mode with pro and flash locally and this is the result:
```
tests/llm/test_gemini/test_modes.py ........ [ 7%]
tests/llm/test_gemini/test_patch.py ........ [ 14%]
tests/llm/test_gemini/test_retries.py ........ [ 22%]
tests/llm/test_gemini/test_simple_types.py ... [ 25%]
tests/llm/test_gemini/test_stream.py ............ [ 36%]
tests/llm/test_gemini/evals/test_classification_enums.py .................... [ 55%]
tests/llm/test_gemini/evals/test_classification_literals.py .................... [ 73%]
tests/llm/test_gemini/evals/test_entities.py .... [ 77%]
tests/llm/test_gemini/evals/test_extract_users.py ............ [ 88%]
tests/llm/test_gemini/evals/test_sentiment_analysis.py ............ [100%]
```
Ran tests once more
VertexAI

```
tests/llm/test_vertexai/test_message_parser.py .... [ 16%]
tests/llm/test_vertexai/test_modes.py ...... [ 41%]
tests/llm/test_vertexai/test_retries.py .... [ 58%]
tests/llm/test_vertexai/test_simple_types.py ...... [ 83%]
tests/llm/test_vertexai/test_stream.py .... [100%]
```
Gemini

```
tests/llm/test_gemini/test_modes.py ........ [ 7%]
tests/llm/test_gemini/test_patch.py ........ [ 14%]
tests/llm/test_gemini/test_retries.py ........ [ 22%]
tests/llm/test_gemini/test_simple_types.py ... [ 25%]
tests/llm/test_gemini/test_stream.py ............ [ 36%]
tests/llm/test_gemini/evals/test_classification_enums.py .................... [ 55%]
tests/llm/test_gemini/evals/test_classification_literals.py .................... [ 73%]
tests/llm/test_gemini/evals/test_entities.py .... [ 77%]
tests/llm/test_gemini/evals/test_extract_users.py ............ [ 88%]
tests/llm/test_gemini/evals/test_sentiment_analysis.py ............ [100%]
```