
response_schema parameter is not followed unless system_instruction also details the response_schema, for the gemini-1.5-pro family of models. Is this intended behavior?

Bikatr7 opened this issue 9 months ago · 0 comments

Description of the bug:

The response_schema parameter is not followed unless the system_instruction also details the schema, at least for the gemini-1.5-pro family of models.

This could be intended behavior, but if the schema also has to be spelled out in the system instruction, the schema text is effectively sent twice, which is a significant waste of tokens for more complicated schemas.

Actual vs expected behavior:

I'd expect the response_schema to be respected and followed regardless of whether the system_instruction also describes it. If this is an incorrect assumption, please let me know.
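
For illustration, with the schema used in the reproduction code below, a conforming response to "Hello, world!" should parse into roughly the following (the exact translation may of course vary):

{"input": "Hello, world!", "output": "Hallo, Welt!"}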

Any other information you'd like to share?

Code to reproduce:

## built-in libraries
import typing

## third party libraries
from google.generativeai import GenerationConfig
from google.generativeai.types import GenerateContentResponse, AsyncGenerateContentResponse
import google.generativeai as genai

## Dummy values from production code
_default_translation_instructions: str = "Translate this to German. Format the response as JSON parseable string."
_default_model: str = "gemini-1.5-pro-latest"

_system_message = _default_translation_instructions

_model: str = _default_model
_temperature: float = 0.5
_top_p: float = 0.9
_top_k: int = 40
_candidate_count: int = 1
_stream: bool = False
_stop_sequences: typing.List[str] | None = None
_max_output_tokens: int | None = None

_client: genai.GenerativeModel
_generation_config: GenerationConfig

_decorator_to_use: typing.Union[typing.Callable, None] = None

_safety_settings = [
    {
        "category": "HARM_CATEGORY_DANGEROUS",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_NONE",
    },
]

## with open("gemini.txt", "r", encoding="utf-8") as f:
##      api_key = f.read().strip()
api_key = "YOUR_API_KEY"
genai.configure(api_key=api_key)

## Instructing the model to translate the input to German as JSON, without detailed schema
non_specific_client = genai.GenerativeModel(
    model_name=_model,
    safety_settings=_safety_settings,
    system_instruction="Translate this to German. Format the response as JSON parseable string."
)

## Instructing the model to translate the input to German as JSON, with detailed schema
_client = genai.GenerativeModel(
    model_name=_model,
    safety_settings=_safety_settings,
    system_instruction="Translate this to German. Format the response as JSON parseable string. It must have 2 keys, one for input titled input, and one called output, which is the translation."
)

_generation_config = GenerationConfig(
    candidate_count=_candidate_count,
    stop_sequences=_stop_sequences,
    max_output_tokens=_max_output_tokens,
    temperature=_temperature,
    top_p=_top_p,
    top_k=_top_k,
    response_mime_type="application/json",
    response_schema={
        "type": "object",
        "properties": {
            "input": {
                "type": "string",
                "description": "The original text that was translated."
            },
            "output": {
                "type": "string",
                "description": "The translated text."
            }
        },
        "required": ["input", "output"],
    }
)

## Inconsistent results, schema is not being followed
try:
    response = non_specific_client.generate_content(
        "Hello, world!", generation_config=_generation_config
    )
    print(response.text)
except Exception as e:
    print(f"Error with non-specific client: {e}")

## Consistent results, schema is being followed
try:
    response = _client.generate_content(
        "Hello, world!", generation_config=_generation_config
    )
    print(response.text)
except Exception as e:
    print(f"Error with specific client: {e}")

Clarification question: is it intended behavior that the system instruction has to detail the schema? If so, what is the point of the response_schema parameter in the GenerationConfig class? It seems like a waste of tokens.
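
As an aside, newer releases of the SDK also appear to accept a Python type (e.g. a TypedDict) for response_schema instead of a raw dict, which would make duplicating the schema in the system instruction even more redundant. A minimal sketch, assuming a release where typed schemas are supported (the Translation class name is just for illustration):

## Hypothetical typed-schema variant of the GenerationConfig above
from typing_extensions import TypedDict

class Translation(TypedDict):
    input: str
    output: str

typed_config = GenerationConfig(
    response_mime_type="application/json",
    response_schema=Translation,
)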

Bikatr7 · May 15, 2024