[Bug]: Log gemini system message to langfuse
What happened?
The LiteLLM proxy server does not seem to pass the system message when making /chat/completions calls for gemini-1.5-pro-latest (or possibly any Gemini model). I found this bug while checking through the Langfuse logging.
Postman: (screenshot)
Langfuse: (screenshot)
Here is the test JSON I used:
{
  "model": "gemini-1.5-pro-latest",
  "messages": [
    {
      "role": "system",
      "content": "You are provided a scanned copy of my resume (or portion of it), can you check if there is any grammar or vocabulary issue? \n\nMake sure to keep the same length for each sentences.\n\nMake sure to explain what you have modified in bullet points."
    },
    {
      "role": "user",
      "content": "Easy Cucumber\t\t\t\t\t\t\t\t May 2022 – Apr 2023\n\tDeveloped a BDD testing library in Java, support executing Gherkin Language in any testing environments.\no\tRedesigned a cleaner architecture to provide better performance than the official Cucumber library.\n\tConducted research and implemented support for concurrent execution of multiple scenarios, to support testing complex scenarios that Cucumber could not handle."
    },
    {
      "role": "assistant",
      "content": "Here's the revised version with corrections and explanations:\n\nEasy Cucumber May 2022 – Apr 2023\n• Developed a Behavior-Driven Development (BDD) testing library in Java, supporting the execution of Gherkin language in various testing environments.\n• Redesigned a cleaner architecture to provide better performance compared to the official Cucumber library.\n• Conducted research and implemented support for concurrent execution of multiple scenarios, enabling testing of complex scenarios that Cucumber could not handle.\n\nModifications:\n- Replaced \"a BDD testing library\" with \"a Behavior-Driven Development (BDD) testing library\" for clarity.\n- Changed \"support executing Gherkin Language\" to \"supporting the execution of Gherkin language\" for better grammar.\n- Replaced \"any testing environments\" with \"various testing environments\" for better phrasing.\n- Changed \"than the official Cucumber library\" to \"compared to the official Cucumber library\" for better grammar.\n- Replaced \"to support testing complex scenarios that Cucumber could not handle\" with \"enabling testing of complex scenarios that Cucumber could not handle\" for better phrasing and grammar."
    },
    {
      "role": "user",
      "content": "I end up with following, is it fine?\n\nEasy Cucumber\t\t\t\t\t\t\t\t May 2022 – Apr 2023\n\tDeveloped a BDD test library in Java, support executing Gherkin Language in various test frameworks.\no\tRedesigned a cleaner architecture to provide better performance compared to the Cucumber library.\n\tConducted research and implemented support for concurrent execution of multiple scenarios, enabling testing of complex scenarios that Cucumber could not handle."
    }
  ]
}
The corresponding LiteLLM proxy YAML file:
model_list:
  - model_name: gemini-1.5-pro-latest
    litellm_params:
      model: gemini/gemini-1.5-pro-latest
      api_key: os.environ/GEMINI_API_KEY
Relevant log output
No response
Twitter / LinkedIn details
@CXwudi / https://www.linkedin.com/in/charles-chen-cc98/
Hi @CXwudi, I believe the call is working as expected.
Gemini accepts the system prompt as a separate arg - https://github.com/BerriAI/litellm/blob/6e934cb842f762830949312bc37760eb2d950b9e/litellm/llms/gemini.py#L146
You should be able to confirm this by running the proxy with --detailed_debug and inspecting the request we make.
Since it's passed separately, I believe it's missed from the Langfuse logging object. Will add it.
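For reference, this is roughly what "passed as a separate arg" looks like against the google-generativeai SDK directly - a minimal sketch, assuming google-generativeai >= 0.5.0, where GenerativeModel accepts a system_instruction parameter (the API key and prompts below are placeholders):

import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # placeholder key

# The system message is not part of the chat history; it is supplied
# separately via system_instruction when constructing the model.
model = genai.GenerativeModel(
    "gemini-1.5-pro-latest",
    system_instruction="You are a helpful proofreading assistant.",
)

# Only the user/assistant turns are passed to generate_content.
response = model.generate_content("Check this sentence for grammar issues.")
print(response.text)

Since the system prompt never appears in the message list, a logger that only reads the messages will miss it.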
Hi @krrishdholakia,
I just tested through both Postman and Google AI Studio, and I am pretty sure the system message is missing.
This time I am using a simpler test JSON:
{
  "model": "gemini-1.5-pro-latest",
  "messages": [
    {
      "role": "system",
      "content": "If you are asked about who is your best girl, answer \"Hatsune Miku\" please."
    },
    {
      "role": "user",
      "content": "What is your best girl?"
    }
  ]
}
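For completeness, the same request can also be sent programmatically instead of via Postman - a minimal sketch using the OpenAI Python client pointed at the proxy (the base URL and key are placeholders for my local setup):

from openai import OpenAI

# The LiteLLM proxy exposes an OpenAI-compatible endpoint, so the
# standard OpenAI client works against it unchanged.
client = OpenAI(base_url="http://localhost:6001/v1", api_key="sk-placeholder")

response = client.chat.completions.create(
    model="gemini-1.5-pro-latest",
    messages=[
        {"role": "system", "content": 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'},
        {"role": "user", "content": "What is your best girl?"},
    ],
)
print(response.choices[0].message.content)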
Testing it on Google AI Studio gives the correct result: (screenshot)
However, the same JSON sent from Postman returns:
{
  "id": "chatcmpl-2bb1ff75-b5e2-4e7a-be97-9a19ed814f90",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 1,
      "message": {
        "content": "As an AI language model, I am not capable of having personal opinions or beliefs. Therefore, I do not have a \"best girl\" or any preferences of that nature. \n\nIs there anything else I can assist you with? \n",
        "role": "assistant"
      }
    }
  ],
  "created": 1712887478,
  "model": "gemini/gemini-1.5-pro-latest",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 47,
    "total_tokens": 53
  }
}
Here is the --detailed_debug output from the Postman request:
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: 597b81f17c45e0b3e24b9f7a8edf4a55795147cc69aead77174c71f267857bb9; local_only: False
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: token='597b81f17c45e0b3e24b9f7a8edf4a55795147cc69aead77174c71f267857bb9' key_name=None key_alias=None spend=0.0009649000000000001 max_budget=None expires=None models=[] aliases={} config={} user_id='default_user_id' team_id=None max_parallel_requests=None metadata={} tpm_limit=None rpm_limit=None budget_duration=None budget_reset_at=None allowed_cache_controls=[] permissions={} model_spend={} model_max_budget={} soft_budget_cooldown=False litellm_budget_table=None user_id_rate_limits=None team_id_rate_limits=None team_spend=None team_tpm_limit=None team_rpm_limit=None team_max_budget=None team_models=[] team_blocked=False soft_budget=None team_model_aliases=None api_key='sk-ultimate-mikuchat' user_role='proxy_admin'
2024-04-11 22:04:35 02:04:35 - LiteLLM Proxy:DEBUG: proxy_server.py:3394 - Request Headers: Headers({'accept': 'application/json', 'content-type': 'application/json', 'authorization': 'Bearer sk-ultimate-mikuchat', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'})
2024-04-11 22:04:35 02:04:35 - LiteLLM Proxy:DEBUG: proxy_server.py:3400 - receiving data: {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'system', 'content': 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'}, {'role': 'user', 'content': 'What is your best girl?'}], 'proxy_server_request': {'url': 'http://localhost:6001/v1/chat/completions', 'method': 'POST', 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'authorization': 'Bearer sk-ultimate-mikuchat', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'body': {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'system', 'content': 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'}, {'role': 'user', 'content': 'What is your best girl?'}]}}, 'ttl': None}
2024-04-11 22:04:35 02:04:35 - LiteLLM Proxy:DEBUG: utils.py:36 - Inside Proxy Logging Pre-call hook!
2024-04-11 22:04:35 02:04:35 - LiteLLM Proxy:DEBUG: parallel_request_limiter.py:21 - Inside Max Parallel Request Pre-Call Hook
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: sk-ultimate-mikuchat::2024-04-12-02-04::request_count; local_only: False
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:35 02:04:35 - LiteLLM Proxy:DEBUG: parallel_request_limiter.py:21 - current: None
2024-04-11 22:04:35 02:04:35 - LiteLLM Proxy:DEBUG: tpm_rpm_limiter.py:33 - Inside Max TPM/RPM Limiter Pre-Call Hook - token='597b81f17c45e0b3e24b9f7a8edf4a55795147cc69aead77174c71f267857bb9' key_name=None key_alias=None spend=0.0009649000000000001 max_budget=None expires=None models=[] aliases={} config={} user_id='default_user_id' team_id=None max_parallel_requests=None metadata={} tpm_limit=None rpm_limit=None budget_duration=None budget_reset_at=None allowed_cache_controls=[] permissions={} model_spend={} model_max_budget={} soft_budget_cooldown=False litellm_budget_table=None user_id_rate_limits=None team_id_rate_limits=None team_spend=None team_tpm_limit=None team_rpm_limit=None team_max_budget=None team_models=[] team_blocked=False soft_budget=None team_model_aliases=None api_key='sk-ultimate-mikuchat' user_role='proxy_admin'
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: sk-ultimate-mikuchat; local_only: False
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: default_user_id; local_only: False
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: {'user_id': 'default_user_id', 'max_budget': None, 'spend': 0.0009649000000000001, 'model_max_budget': {}, 'model_spend': {}, 'user_email': None, 'models': [], 'tpm_limit': None, 'rpm_limit': None}
2024-04-11 22:04:35 02:04:35 - LiteLLM Proxy:DEBUG: tpm_rpm_limiter.py:33 - _set_limits: False
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: default_user_id_user_api_key_user_id; local_only: False
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:35 02:04:35 - LiteLLM Proxy:DEBUG: utils.py:36 - final data being sent to completion call: {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'system', 'content': 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'}, {'role': 'user', 'content': 'What is your best girl?'}], 'proxy_server_request': {'url': 'http://localhost:6001/v1/chat/completions', 'method': 'POST', 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'authorization': 'Bearer sk-ultimate-mikuchat', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'body': {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'system', 'content': 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'}, {'role': 'user', 'content': 'What is your best girl?'}]}}, 'ttl': None, 'user': 'default_user_id', 'metadata': {'user_api_key': 'sk-ultimate-mikuchat', 'user_api_key_alias': None, 'user_api_key_user_id': 'default_user_id', 'user_api_key_team_id': None, 'user_api_key_metadata': {}, 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'endpoint': 'http://localhost:6001/v1/chat/completions'}, 'request_timeout': 600}
2024-04-11 22:04:35 02:04:35 - LiteLLM Router:DEBUG: router.py:1232 - Inside async function with retries: args - (); kwargs - {'proxy_server_request': {'url': 'http://localhost:6001/v1/chat/completions', 'method': 'POST', 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'authorization': 'Bearer sk-ultimate-mikuchat', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'body': {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'system', 'content': 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'}, {'role': 'user', 'content': 'What is your best girl?'}]}}, 'ttl': None, 'user': 'default_user_id', 'metadata': {'user_api_key': 'sk-ultimate-mikuchat', 'user_api_key_alias': None, 'user_api_key_user_id': 'default_user_id', 'user_api_key_team_id': None, 'user_api_key_metadata': {}, 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'endpoint': 'http://localhost:6001/v1/chat/completions', 'model_group': 'gemini-1.5-pro-latest'}, 'request_timeout': 600, 'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'system', 'content': 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'}, {'role': 'user', 'content': 'What is your best girl?'}], 'original_function': <bound method Router._acompletion of <litellm.router.Router object at 0x7f513b511d90>>, 'num_retries': 0}
2024-04-11 22:04:35 02:04:35 - LiteLLM Router:DEBUG: router.py:1240 - async function w/ retries: original_function - <bound method Router._acompletion of <litellm.router.Router object at 0x7f513b511d90>>
2024-04-11 22:04:35 02:04:35 - LiteLLM Router:DEBUG: router.py:414 - Inside _acompletion()- model: gemini-1.5-pro-latest; kwargs: {'proxy_server_request': {'url': 'http://localhost:6001/v1/chat/completions', 'method': 'POST', 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'authorization': 'Bearer sk-ultimate-mikuchat', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'body': {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'system', 'content': 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'}, {'role': 'user', 'content': 'What is your best girl?'}]}}, 'ttl': None, 'user': 'default_user_id', 'metadata': {'user_api_key': 'sk-ultimate-mikuchat', 'user_api_key_alias': None, 'user_api_key_user_id': 'default_user_id', 'user_api_key_team_id': None, 'user_api_key_metadata': {}, 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'endpoint': 'http://localhost:6001/v1/chat/completions', 'model_group': 'gemini-1.5-pro-latest'}, 'request_timeout': 600}
2024-04-11 22:04:35 02:04:35 - LiteLLM Router:DEBUG: router.py:2475 - initial list of deployments: [{'model_name': 'gemini-1.5-pro-latest', 'litellm_params': {'model': 'gemini/gemini-1.5-pro-latest', 'api_key': '', 'max_retries': 2}, 'model_info': {'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576}}]
2024-04-11 22:04:35 02:04:35 - LiteLLM Router:DEBUG: router.py:2479 - healthy deployments: length 1 [{'model_name': 'gemini-1.5-pro-latest', 'litellm_params': {'model': 'gemini/gemini-1.5-pro-latest', 'api_key': '', 'max_retries': 2}, 'model_info': {'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576}}]
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: 02-04:cooldown_models; local_only: False
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:35 02:04:35 - LiteLLM Router:DEBUG: router.py:1678 - retrieve cooldown models: []
2024-04-11 22:04:35 02:04:35 - LiteLLM Router:DEBUG: router.py:2595 - cooldown deployments: []
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: 0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90; local_only: True
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - set cache: key: 0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90; value: 1
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - InMemoryCache: set_cache
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: 0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90_async_client; local_only: True
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: 0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90_async_client; local_only: True
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 -
2024-04-11 22:04:35
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 - Request to litellm:
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 - litellm.acompletion(model='gemini/gemini-1.5-pro-latest', api_key='', max_retries=0, messages=[{'role': 'system', 'content': 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'}, {'role': 'user', 'content': 'What is your best girl?'}], caching=False, client=None, timeout=6000, proxy_server_request={'url': 'http://localhost:6001/v1/chat/completions', 'method': 'POST', 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'authorization': 'Bearer sk-ultimate-mikuchat', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'body': {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'system', 'content': 'If you are asked about who is your best girl, answer "Hatsune Miku" please.'}, {'role': 'user', 'content': 'What is your best girl?'}]}}, ttl=None, user='default_user_id', metadata={'user_api_key': 'sk-ultimate-mikuchat', 'user_api_key_alias': None, 'user_api_key_user_id': 'default_user_id', 'user_api_key_team_id': None, 'user_api_key_metadata': {}, 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'endpoint': 'http://localhost:6001/v1/chat/completions', 'model_group': 'gemini-1.5-pro-latest', 'deployment': 'gemini/gemini-1.5-pro-latest', 'model_info': {'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576}, 'caching_groups': None}, request_timeout=600, model_info={'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576})
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 -
2024-04-11 22:04:35
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 - Initialized litellm callbacks, Async Success Callbacks: [<litellm.proxy.hooks.parallel_request_limiter._PROXY_MaxParallelRequestsHandler object at 0x7f513d613690>, <litellm.proxy.hooks.tpm_rpm_limiter._PROXY_MaxTPMRPMLimiter object at 0x7f513d613710>, <litellm.proxy.hooks.max_budget_limiter._PROXY_MaxBudgetLimiter object at 0x7f513f4481d0>, <litellm.proxy.hooks.cache_control_check._PROXY_CacheControlCheck object at 0x7f513d613750>, <function _PROXY_track_cost_callback at 0x7f513d671f80>, <bound method ProxyLogging.response_taking_too_long_callback of <litellm.proxy.utils.ProxyLogging object at 0x7f513eda2f10>>]
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 - self.optional_params: {}
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 - litellm.cache: None
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 - kwargs[caching]: False; litellm.cache: None
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:4543 -
2024-04-11 22:04:35 LiteLLM completion() model= gemini-1.5-pro-latest; provider = gemini
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:4546 -
2024-04-11 22:04:35 LiteLLM: Params passed to completion() {'functions': None, 'function_call': None, 'temperature': None, 'top_p': None, 'n': None, 'stream': None, 'stop': None, 'max_tokens': None, 'presence_penalty': None, 'frequency_penalty': None, 'logit_bias': None, 'user': 'default_user_id', 'model': 'gemini-1.5-pro-latest', 'custom_llm_provider': 'gemini', 'response_format': None, 'seed': None, 'tools': None, 'tool_choice': None, 'max_retries': 0, 'logprobs': None, 'top_logprobs': None, 'extra_headers': None}
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:4549 -
2024-04-11 22:04:35 LiteLLM: Non-Default params passed to completion() {'user': 'default_user_id', 'max_retries': 0}
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 - Final returned optional params: {}
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:936 - self.optional_params: {}
2024-04-11 22:04:35 02:04:35 - LiteLLM:DEBUG: utils.py:1091 - PRE-API-CALL ADDITIONAL ARGS: {'complete_input_dict': {'inference_params': {}}}
2024-04-11 22:04:35 02:04:35 - LiteLLM:INFO: utils.py:1112 - {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'user', 'content': 'What is your best girl?'}], 'optional_params': {}, 'litellm_params': {'acompletion': True, 'api_key': '', 'force_timeout': 600, 'logger_fn': None, 'verbose': False, 'custom_llm_provider': 'gemini', 'api_base': '', 'litellm_call_id': '361f8ad5-f82a-4a3a-a956-61f134ee904a', 'model_alias_map': {}, 'completion_call_id': None, 'metadata': {'user_api_key': 'sk-ultimate-mikuchat', 'user_api_key_alias': None, 'user_api_key_user_id': 'default_user_id', 'user_api_key_team_id': None, 'user_api_key_metadata': {}, 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'endpoint': 'http://localhost:6001/v1/chat/completions', 'model_group': 'gemini-1.5-pro-latest', 'deployment': 'gemini/gemini-1.5-pro-latest', 'model_info': {'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576}, 'caching_groups': None}, 'model_info': {'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576}, 'proxy_server_request': {'url': 'http://localhost:6001/v1/chat/completions', 'method': 'POST', 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'authorization': 'Bearer sk-ultimate-mikuchat', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'body': {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'user', 'content': 'What is your best girl?'}]}}, 'preset_cache_key': None, 'no-log': False, 'stream_response': {}}, 'start_time': datetime.datetime(2024, 4, 12, 2, 4, 35, 912780), 'stream': False, 'user': 'default_user_id', 'call_type': 'acompletion', 'litellm_call_id': '361f8ad5-f82a-4a3a-a956-61f134ee904a', 'completion_start_time': None, 'input': ['What is your best girl?'], 'api_key': '', 'additional_args': {'complete_input_dict': {'inference_params': {}}}, 'log_event_type': 'pre_api_call'}
2024-04-11 22:04:35
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - RAW RESPONSE:
2024-04-11 22:04:38 <google.generativeai.types.generation_types.AsyncGenerateContentResponse object at 0x7f5138384e10>
2024-04-11 22:04:38
2024-04-11 22:04:38
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: main.py:3835 - raw model_response: <google.generativeai.types.generation_types.AsyncGenerateContentResponse object at 0x7f5138384e10>
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Async Wrapper: Completed Call, calling async_success_handler: <bound method Logging.async_success_handler of <litellm.utils.Logging object at 0x7f51382598d0>>
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Logging Details LiteLLM-Success Call: None
2024-04-11 22:04:38 02:04:38 - LiteLLM Router:INFO: router.py:479 - litellm.acompletion(model=gemini/gemini-1.5-pro-latest) 200 OK
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:1288 - Model=gemini-1.5-pro-latest;
2024-04-11 22:04:38 02:04:38 - LiteLLM Router:DEBUG: router.py:1151 - Async Response: ModelResponse(id='chatcmpl-2bb1ff75-b5e2-4e7a-be97-9a19ed814f90', choices=[Choices(finish_reason='stop', index=1, message=Message(content='As an AI language model, I am not capable of having personal opinions or beliefs. Therefore, I do not have a "best girl" or any preferences of that nature. \n\nIs there anything else I can assist you with? \n', role='assistant'))], created=1712887478, model='gemini/gemini-1.5-pro-latest', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=6, completion_tokens=47, total_tokens=53))
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:4001 - completion_response response ms: 2919.216
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Logging Details LiteLLM-Async Success Call: None
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Looking up model=gemini/gemini-1.5-pro-latest in model_cost_map
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:1288 - Model=gemini-1.5-pro-latest;
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:1333 - Model=gemini-1.5-pro-latest not found in completion cost map.
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:4001 - completion_response response ms: 2919.216
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - success callbacks: ['langfuse', <litellm.proxy.hooks.parallel_request_limiter._PROXY_MaxParallelRequestsHandler object at 0x7f513d613690>, <litellm.proxy.hooks.tpm_rpm_limiter._PROXY_MaxTPMRPMLimiter object at 0x7f513d613710>, <litellm.proxy.hooks.max_budget_limiter._PROXY_MaxBudgetLimiter object at 0x7f513f4481d0>, <litellm.proxy.hooks.cache_control_check._PROXY_CacheControlCheck object at 0x7f513d613750>]
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Looking up model=gemini/gemini-1.5-pro-latest in model_cost_map
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:1576 - reaches langfuse for success logging!
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:1333 - Model=gemini-1.5-pro-latest not found in completion cost map.
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Instantiates langfuse client
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Async success callbacks: [<litellm.proxy.hooks.parallel_request_limiter._PROXY_MaxParallelRequestsHandler object at 0x7f513d613690>, <litellm.proxy.hooks.tpm_rpm_limiter._PROXY_MaxTPMRPMLimiter object at 0x7f513d613710>, <litellm.proxy.hooks.max_budget_limiter._PROXY_MaxBudgetLimiter object at 0x7f513f4481d0>, <litellm.proxy.hooks.cache_control_check._PROXY_CacheControlCheck object at 0x7f513d613750>, <function _PROXY_track_cost_callback at 0x7f513d671f80>, <bound method ProxyLogging.response_taking_too_long_callback of <litellm.proxy.utils.ProxyLogging object at 0x7f513eda2f10>>]
2024-04-11 22:04:38 02:04:38 - LiteLLM Proxy:DEBUG: parallel_request_limiter.py:21 - INSIDE parallel request limiter ASYNC SUCCESS LOGGING
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: sk-ultimate-mikuchat::2024-04-12-02-04::request_count; local_only: False
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:38 02:04:38 - LiteLLM Proxy:DEBUG: parallel_request_limiter.py:21 - updated_value in success call: {'current_requests': 0, 'current_tpm': 106, 'current_rpm': 2}, precise_minute: 2024-04-12-02-04
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - set cache: key: sk-ultimate-mikuchat::2024-04-12-02-04::request_count; value: {'current_requests': 0, 'current_tpm': 106, 'current_rpm': 2}
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - InMemoryCache: set_cache
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: default_user_id::2024-04-12-02-04::request_count; local_only: False
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:38 02:04:38 - LiteLLM Proxy:DEBUG: parallel_request_limiter.py:21 - updated_value in success call: {'current_requests': 0, 'current_tpm': 106, 'current_rpm': 2}, precise_minute: 2024-04-12-02-04
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - set cache: key: default_user_id::2024-04-12-02-04::request_count; value: {'current_requests': 0, 'current_tpm': 106, 'current_rpm': 2}
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - InMemoryCache: set_cache
2024-04-11 22:04:38 02:04:38 - LiteLLM Proxy:DEBUG: tpm_rpm_limiter.py:33 - INSIDE TPM RPM Limiter ASYNC SUCCESS LOGGING
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: sk-ultimate-mikuchat; local_only: False
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: None
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - get cache: cache key: default_user_id; local_only: False
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: caching.py:21 - get cache: cache result: {'user_id': 'default_user_id', 'max_budget': None, 'spend': 0.0009649000000000001, 'model_max_budget': {}, 'model_spend': {}, 'user_email': None, 'models': [], 'tpm_limit': None, 'rpm_limit': None}
2024-04-11 22:04:38 02:04:38 - LiteLLM Proxy:DEBUG: proxy_server.py:1209 - INSIDE _PROXY_track_cost_callback
2024-04-11 22:04:38 02:04:38 - LiteLLM Proxy:DEBUG: proxy_server.py:1213 - Proxy: In track_cost_callback for: {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'user', 'content': 'What is your best girl?'}], 'optional_params': {}, 'litellm_params': {'acompletion': True, 'api_key': '', 'force_timeout': 600, 'logger_fn': None, 'verbose': False, 'custom_llm_provider': 'gemini', 'api_base': '', 'litellm_call_id': '361f8ad5-f82a-4a3a-a956-61f134ee904a', 'model_alias_map': {}, 'completion_call_id': None, 'metadata': {'user_api_key': 'sk-ultimate-mikuchat', 'user_api_key_alias': None, 'user_api_key_user_id': 'default_user_id', 'user_api_key_team_id': None, 'user_api_key_metadata': {}, 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'endpoint': 'http://localhost:6001/v1/chat/completions', 'model_group': 'gemini-1.5-pro-latest', 'deployment': 'gemini/gemini-1.5-pro-latest', 'model_info': {'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576}, 'caching_groups': None}, 'model_info': {'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576}, 'proxy_server_request': {'url': 'http://localhost:6001/v1/chat/completions', 'method': 'POST', 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'authorization': 'Bearer sk-ultimate-mikuchat', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'body': {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'user', 'content': 'What is your best girl?'}]}}, 'preset_cache_key': None, 'no-log': False, 'stream_response': {}}, 'start_time': datetime.datetime(2024, 4, 12, 2, 4, 35, 912780), 'stream': False, 'user': 'default_user_id', 'call_type': 'acompletion', 'litellm_call_id': '361f8ad5-f82a-4a3a-a956-61f134ee904a', 'completion_start_time': datetime.datetime(2024, 4, 12, 2, 4, 38, 831996), 'input': ['What is your best girl?'], 'api_key': '', 'additional_args': {'complete_input_dict': {}}, 'log_event_type': 'post_api_call', 'original_response': <google.generativeai.types.generation_types.AsyncGenerateContentResponse object at 0x7f5138384e10>, 'end_time': datetime.datetime(2024, 4, 12, 2, 4, 38, 831996), 'cache_hit': None, 'response_cost': None}
2024-04-11 22:04:38 02:04:38 - LiteLLM Proxy:DEBUG: proxy_server.py:1214 - kwargs stream: False + complete streaming response: None
2024-04-11 22:04:38 02:04:38 - LiteLLM Proxy:DEBUG: proxy_server.py:1283 - error in tracking cost callback - Model not in litellm model cost map. Add custom pricing - https://docs.litellm.ai/docs/proxy/custom_pricing
2024-04-11 22:04:38 response_obj: ModelResponse(id='chatcmpl-241f9dff-2e01-4a88-87ec-f1806eade18e', choices=[Choices(finish_reason='stop', index=1, message=Message(content='As an AI language model, I don\'t have personal preferences like having a "best girl." I can, however, provide you with information on various fictional female characters or help you explore different character archetypes if you\'d like! \n\nIs there anything specific you\'re interested in learning about? \n', role='assistant'))], created=1712887339, model='gemini/gemini-1.5-pro-latest', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=6, completion_tokens=59, total_tokens=65))
2024-04-11 22:04:38 getting usage, cost=None
2024-04-11 22:04:38 constructed usage - {'prompt_tokens': 6, 'completion_tokens': 59, 'total_cost': None}
2024-04-11 22:04:38 INFO: 172.25.0.1:55464 - "POST /v1/chat/completions HTTP/1.1" 200 OK
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Langfuse Logging - Enters logging function for model {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'user', 'content': 'What is your best girl?'}], 'optional_params': {}, 'litellm_params': {'acompletion': True, 'api_key': '', 'force_timeout': 600, 'logger_fn': None, 'verbose': False, 'custom_llm_provider': 'gemini', 'api_base': '', 'litellm_call_id': '361f8ad5-f82a-4a3a-a956-61f134ee904a', 'model_alias_map': {}, 'completion_call_id': None, 'metadata': {'user_api_key': 'sk-ultimate-mikuchat', 'user_api_key_alias': None, 'user_api_key_user_id': 'default_user_id', 'user_api_key_team_id': None, 'user_api_key_metadata': {}, 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'endpoint': 'http://localhost:6001/v1/chat/completions', 'model_group': 'gemini-1.5-pro-latest', 'deployment': 'gemini/gemini-1.5-pro-latest', 'model_info': {'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576}, 'caching_groups': None}, 'model_info': {'id': '0cdd3aba7daa6215828fe92b268271942f92828eacd1378d53793435f1eddc90', 'description': 'gemini-1.5-pro-latest from Google Gemini Official. Mid-size multimodal model that supports up to 1 million tokens', 'max_tokens': 1048576}, 'proxy_server_request': {'url': 'http://localhost:6001/v1/chat/completions', 'method': 'POST', 'headers': {'accept': 'application/json', 'content-type': 'application/json', 'authorization': 'Bearer sk-ultimate-mikuchat', 'user-agent': 'PostmanRuntime/7.37.0', 'cache-control': 'no-cache', 'postman-token': '7bde7985-b26d-44dd-9908-a363902f6c1a', 'host': 'localhost:6001', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'content-length': '328'}, 'body': {'model': 'gemini-1.5-pro-latest', 'messages': [{'role': 'user', 'content': 'What is your best girl?'}]}}, 'preset_cache_key': None, 'no-log': False, 'stream_response': {}}, 'start_time': datetime.datetime(2024, 4, 12, 2, 4, 35, 912780), 'stream': False, 'user': 'default_user_id', 'call_type': 'acompletion', 'litellm_call_id': '361f8ad5-f82a-4a3a-a956-61f134ee904a', 'completion_start_time': datetime.datetime(2024, 4, 12, 2, 4, 38, 831996), 'input': ['What is your best girl?'], 'api_key': '', 'additional_args': {'complete_input_dict': {}}, 'log_event_type': 'successful_api_call', 'end_time': datetime.datetime(2024, 4, 12, 2, 4, 38, 831996), 'cache_hit': None, 'response_cost': None}
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - OUTPUT IN LANGFUSE: {'content': 'As an AI language model, I am not capable of having personal opinions or beliefs. Therefore, I do not have a "best girl" or any preferences of that nature. \n\nIs there anything else I can assist you with? \n', 'role': 'assistant'}; original: ModelResponse(id='chatcmpl-2bb1ff75-b5e2-4e7a-be97-9a19ed814f90', choices=[Choices(finish_reason='stop', index=1, message=Message(content='As an AI language model, I am not capable of having personal opinions or beliefs. Therefore, I do not have a "best girl" or any preferences of that nature. \n\nIs there anything else I can assist you with? \n', role='assistant'))], created=1712887478, model='gemini/gemini-1.5-pro-latest', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=6, completion_tokens=47, total_tokens=53))
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Langfuse Layer Logging - logging to langfuse v2
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - trace: None
2024-04-11 22:04:38 02:04:38 - LiteLLM:DEBUG: utils.py:936 - Langfuse Layer Logging - final response object: ModelResponse(id='chatcmpl-2bb1ff75-b5e2-4e7a-be97-9a19ed814f90', choices=[Choices(finish_reason='stop', index=1, message=Message(content='As an AI language model, I am not capable of having personal opinions or beliefs. Therefore, I do not have a "best girl" or any preferences of that nature. \n\nIs there anything else I can assist you with? \n', role='assistant'))], created=1712887478, model='gemini/gemini-1.5-pro-latest', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=6, completion_tokens=47, total_tokens=53))
2024-04-11 22:04:38 02:04:38 - LiteLLM:INFO: langfuse.py:161 - Langfuse Layer Logging - logging success
2024-04-11 22:04:39 02:04:39 - LiteLLM Proxy:DEBUG: utils.py:2104 - Team Spend transactions: 0
2024-04-11 22:04:39 02:04:39 - LiteLLM Proxy:DEBUG: utils.py:2154 - Spend Logs transactions: 0
So could we revert the issue title back?
Hi @CXwudi, just ran this locally. Here's the attached image of the system prompt being sent: (screenshot)
We weren't logging the system prompt. This is now fixed - https://github.com/BerriAI/litellm/commit/7a3821e0f6700b3ccb5baa5d688ab48dde60c349 (should be live soon in v1.35.2).
Here's the code: https://github.com/BerriAI/litellm/blob/c480b5a008b9cde5ca4c6fd8ce2c299d0f423478/litellm/llms/gemini.py#L186
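Conceptually, the fix re-attaches the separately passed system instruction when building the Langfuse input. A minimal sketch of the idea (a hypothetical helper, not the actual LiteLLM code):

from typing import Any, Dict, List, Optional

def build_langfuse_input(
    messages: List[Dict[str, Any]],
    system_instruction: Optional[str] = None,
) -> List[Dict[str, Any]]:
    # Gemini receives the system prompt outside of `messages`, so the
    # logging payload must prepend it back for the trace to be complete.
    if system_instruction:
        return [{"role": "system", "content": system_instruction}, *messages]
    return messages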
Huh, that is weird, I am gonna investigate more into why I couldn't get my test JSON to work..
Okie, I see the problem: we are currently pinning google-generativeai==0.3.2 in requirements.txt, and bumping the version to 0.5.0 makes it work.
I will wait until that dependency is updated.
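For anyone hitting this before the dependency bump ships, a quick sanity check of which SDK version the proxy environment actually has (a minimal sketch; the pinned 0.3.2 lacks the system_instruction support that 0.5.0 has):

import google.generativeai as genai

# The pinned 0.3.2 predates system_instruction support on
# GenerativeModel, so system messages are silently dropped.
print(genai.__version__)  # expect "0.5.0" or newer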
Thanks for the catch.
Fix pushed @CXwudi: https://github.com/BerriAI/litellm/commit/b0770cf8e20e9a814924b619e9ff872d024898e4
Is this the same issue as #3241?
@Manouchehri not really, but they can be similar. Mine is about the system message: originally my issue was the system message being dropped due to the outdated dependency, but the system message is also missing from the log.
Hey @CXwudi, unable to repro this - I set up vertex_ai/gemini-1.5-pro-preview-0409 on our staging env (v1.35.29) and just ran a test query. I can see the system prompt being logged to Langfuse: (screenshot)
Closing as unable to repro. @CXwudi please bump me, with a way to repro this, if you're still seeing it.
Attaching a curl with a key for testing on our staging env (1 RPM, will expire in 24 hrs):
curl --location 'https://staging.litellm.ai/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-Yx4jQsYBUlTkU51L5iltEA' \
--data '{
  "model": "gemini-1.5-pro-latest",
  "messages": [
    { "role": "system", "content": "Be a good bot" },
    { "role": "user", "content": "What'\''s the weather today?" }
  ]
}'
Hi @krrishdholakia, mine is reproducible with Google AI Studio. Like I mentioned in the issue description, the config I used is:
model_list:
  - model_name: gemini-1.5-pro-latest
    litellm_params:
      model: gemini/gemini-1.5-pro-latest
      api_key: os.environ/GEMINI_API_KEY
I just tested and it is still missing: (screenshot)
Same issue here, reproduced with v1.40.22; the system message was not passed.
- litellm_params:
    api_key: *****
    model: gemini/gemini-1.5-pro-latest
  model_name: gemini-1.5-pro
@antmanler thanks, able to repro.