fix(vertex_ai): Resolve JSONDecodeError in Gemini streaming
## Relevant issues
Fixes #16562
## Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- [x] I have added testing in the `tests/litellm/` directory. Adding at least 1 test is a hard requirement - see details
- [x] I have added a screenshot of my new test passing locally
- [x] My PR passes all unit tests on `make test-unit`
- [x] My PR's scope is as isolated as possible, it only solves 1 specific problem
## Type
🐛 Bug Fix ✅ Test
## Changes
### Problem
The streaming parser for Vertex AI Gemini models (`ModelResponseIterator`) would crash with a `JSONDecodeError` if a partial (fragmented) JSON chunk was received from the stream after the first complete chunk had already been processed. This caused intermittent but critical failures in production environments.
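For illustration only, the fragmented case looks roughly like the payloads below (hypothetical strings, not captured from a real Gemini response); the second item is an incomplete JSON object that `json.loads()` cannot parse on its own:

```python
# Hypothetical chunk payloads, for illustration only (not real Gemini output).
complete_chunk = '{"candidates": [{"content": {"parts": [{"text": "Hello"}]}}]}'
fragment_1 = '{"candidates": [{"content": {"parts": [{"te'  # json.loads() raises JSONDecodeError here
fragment_2 = 'xt": " world"}]}}]}'                          # the remainder arrives in a later read
```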
### Root Cause
The error handling logic in the `handle_valid_json_chunk` method contained a guard condition (`if self.sent_first_chunk is False:`). This condition only allowed the JSON accumulation/buffering logic to be triggered for the very first chunk in the stream. If any subsequent chunk was fragmented, the condition would be false, and the `JSONDecodeError` would be re-raised instead of handled.
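A simplified paraphrase of the old control flow (a sketch for review context, not the exact LiteLLM source; the `accumulate` callback stands in for `handle_accumulated_json_chunk`):

```python
import json

def parse_chunk_old(chunk: str, sent_first_chunk: bool, accumulate):
    """Sketch of the previous behavior: the fallback was gated on stream position."""
    try:
        return json.loads(chunk)
    except json.JSONDecodeError:
        if sent_first_chunk is False:
            return accumulate(chunk)  # buffer the fragment and wait for the rest
        raise  # any fragment after the first chunk crashed the whole stream
```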
### Solution
The fix removes the `self.sent_first_chunk is False` guard. Now any `JSONDecodeError` correctly triggers the switch to JSON accumulation mode (`handle_accumulated_json_chunk`). This makes the stream parser robust: it can buffer and assemble fragmented JSON objects at any point during the stream, not just at the beginning.
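And a matching sketch of the fixed flow (same caveats as above): every `JSONDecodeError` now routes the fragment into the accumulation buffer, regardless of where it occurs in the stream:

```python
import json

def parse_chunk_fixed(chunk: str, accumulate):
    """Sketch of the fixed behavior: always fall back to accumulation."""
    try:
        return json.loads(chunk)
    except json.JSONDecodeError:
        return accumulate(chunk)  # assemble fragments until they form valid JSON
```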
### Testing
- Added two new unit tests to `tests/test_litellm/llms/vertex_ai/gemini/test_vertex_and_google_ai_studio_gemini.py`:
  - `test_streaming_iterator_handles_partial_json_after_first_chunk_sync`: verifies the fix for synchronous streams.
  - `test_streaming_iterator_handles_partial_json_after_first_chunk_async`: verifies the fix for asynchronous streams.
- These tests simulate a stream where a complete JSON chunk is followed by a fragmented one. They failed with `RuntimeError: Error parsing chunk...` before the fix and now pass, confirming the bug is resolved. A condensed sketch of the sync scenario is shown below.
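The following is only a condensed sketch of what the sync test exercises: the payloads are simplified and the iterator construction is an assumption; the actual test handles the real chunk framing and any required mocks.

```python
def test_streaming_iterator_handles_partial_json_after_first_chunk_sync():
    complete = '{"candidates": [{"content": {"parts": [{"text": "Hello"}]}}]}'
    fragment_a = '{"candidates": [{"content": {"parts": [{"text": " wor'
    fragment_b = 'ld"}]}}]}'

    # Assumed constructor arguments, shown for illustration only.
    iterator = ModelResponseIterator(
        streaming_response=iter([complete, fragment_a, fragment_b]),
        sync_stream=True,
    )

    # Before the fix this raised "RuntimeError: Error parsing chunk..." on the
    # fragment; after the fix the fragments are buffered and reassembled.
    chunks = list(iterator)
    assert len(chunks) >= 2
```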
@krrishdholakia please review