[vertexai] Inconsistent behavior when using responseSchema with Gemini Flash models, malformed or repetitive JSON output unless schema is removed
We’ve encountered a reproducible issue when using the Java Vertex AI client (com.google.cloud:google-cloud-vertexai:1.18.0) with Gemini Flash models for structured text generation.
When a responseSchema is attached to the GenerationConfig, the model intermittently produces malformed or repetitive JSON outputs, often looping text fragments or inserting stray newline escape sequences until the max output token limit is reached.
Removing the schema entirely eliminates the issue, and the same prompt setup works correctly in the Python Vertex AI SDK, suggesting this may be SDK-specific or related to how the Java client serializes the schema.
Environment details
| Key | Value |
|---|---|
| API | Vertex AI Generative AI (Java) |
| Library | com.google.cloud:google-cloud-vertexai:1.18.0 |
| Java version | 21 |
| OS | Windows 11 |
| Models tested | Gemini 2.0 Flash, Gemini 2.5 Flash Lite, Gemini 2.5 Flash |
| Behavior | Issue occurs with 2.0 Flash and 2.5 Flash Lite; 2.5 Flash mitigates it partially |
Steps to reproduce

1. Configure a `GenerativeModel` with deterministic decoding:
   - `temperature = 0.0f`, `topP = 0.0f`, `topK = 1`, `candidateCount = 1`, `seed = 42`
   - `responseMimeType = "application/json"`
2. Attach a complex `responseSchema` describing nested arrays and objects (see example below).
3. Send a document-extraction prompt requesting structured JSON per the schema.
4. Observe that:
   - The model often ignores the schema’s structure.
   - Output becomes recursive or repetitive (`"Company Company Company..."`).
   - Output terminates abruptly at the token limit with unclosed quotes or brackets.
5. Remove the schema (keep all other settings identical).
6. Observe that the output is now clean and well-formed JSON.
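The repetitive failure mode from step 4 can be caught programmatically before any parse attempt. As a sketch (an illustrative stdlib-only helper, not part of either SDK), a check that flags a whitespace-delimited token repeated more than a few times in a row:

```java
public class RepetitionCheck {
    /**
     * Returns true if any whitespace-separated token appears more than
     * maxRun times consecutively -- the "Company Company Company..."
     * failure mode described above.
     */
    static boolean looksRepetitive(String text, int maxRun) {
        String[] tokens = text.trim().split("\\s+");
        int run = 1;
        for (int i = 1; i < tokens.length; i++) {
            run = tokens[i].equals(tokens[i - 1]) ? run + 1 : 1;
            if (run > maxRun) {
                return true;
            }
        }
        return false;
    }
}
```

A guard like this can decide whether a response should be retried or logged, independent of which SDK produced it.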
Code snippet (simplified)
```java
GenerationConfig cfg = GenerationConfig.newBuilder()
    .setTemperature(0.0f)
    .setTopP(0.0f)
    .setTopK(1)
    .setCandidateCount(1)
    .setSeed(42)
    .setResponseMimeType("application/json")
    .setResponseSchema(ResponseSchemaFactory.getExtractionSchema()) // When set, issue occurs
    .build();

GenerativeModel model = baseModel
    .withSystemInstruction(ContentMaker.fromString(systemPrompt))
    .withGenerationConfig(cfg);

GenerateContentResponse response = model.generateContent(promptText);
String jsonOutput = response.getText(); // often malformed
```
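One possible interim mitigation is to attempt the schema-constrained call first and fall back to a schema-free config when the result fails validation. A minimal stdlib-only sketch of that control flow, with `Supplier`s standing in for the two `model.generateContent` configurations (names illustrative, not SDK API):

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

public class SchemaFallback {
    /**
     * Returns the schema-constrained result if it passes validation,
     * otherwise falls back to the schema-free call.
     */
    static String generateWithFallback(Supplier<String> withSchema,
                                       Supplier<String> withoutSchema,
                                       Predicate<String> isValid) {
        String first = withSchema.get();
        return isValid.test(first) ? first : withoutSchema.get();
    }
}
```

The validator slot is where a JSON parse attempt (e.g. Jackson) or a cheaper structural check would go.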
Example schema shape:
```java
Schema workExperience = Schema.newBuilder()
    .setType(Type.OBJECT)
    .putProperties("company", Schema.newBuilder().setType(Type.STRING).build())
    .putProperties("tenure", Schema.newBuilder().setType(Type.STRING).build())
    .putProperties("skills", Schema.newBuilder()
        .setType(Type.ARRAY)
        .setItems(Schema.newBuilder().setType(Type.STRING).build())
        .build())
    .build();
```
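For comparing against the Python SDK request, this is approximately what the schema above should serialize to in the request body (uppercase type names per the Vertex AI REST `Schema` type; an assumed rendering for debugging the serialization hypothesis, not captured wire traffic):

```json
{
  "type": "OBJECT",
  "properties": {
    "company": { "type": "STRING" },
    "tenure": { "type": "STRING" },
    "skills": {
      "type": "ARRAY",
      "items": { "type": "STRING" }
    }
  }
}
```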
Observed output (excerpt, simulated)
```json
{
  "experience": [
    {
      "company": "TechCorp TechCorp TechCorp TechCorp TechCorp ...",
      "tenure": "2 yrs",
      "skills": ["Java", "Spring Boot"]
    }
  ],
  "summary": "\n\n {\n.\n.\n\\n\\n\\n\\n\\n\\n\\n\\n\n"
}
```
Occasionally, the output fails JSON parsing due to missing closing quotes or brackets:
```
com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input:
 was expecting closing quote for a string value
 at [Source: (String)"{ "experience": [ { "company": "ABC
   "tenure": "3 yrs"...]; line: 1, column: 4211]
```
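Until the root cause is identified, this truncation can be detected cheaply before handing the string to Jackson. A stdlib-only sketch that checks for an unterminated string or unbalanced brackets (class and method names are illustrative, not part of either SDK):

```java
public class TruncationCheck {
    /** Returns true if the JSON text ends inside a string or with unclosed brackets. */
    static boolean looksTruncated(String json) {
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        for (char c : json.toCharArray()) {
            if (escaped) { escaped = false; continue; }
            if (inString) {
                if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
            } else {
                if (c == '"') inString = true;
                else if (c == '{' || c == '[') depth++;
                else if (c == '}' || c == ']') depth--;
            }
        }
        return inString || depth != 0;
    }
}
```

This is deliberately not a full JSON validator; it only flags the specific truncated-at-token-limit symptom shown in the exception above.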
Expected behavior
When `responseSchema` is provided, the model should consistently honor the schema and produce syntactically valid JSON following the defined structure.
Additional context
- Removing the schema entirely fixes the problem.
- Using identical prompts and schema definitions in the Python Vertex AI SDK does not reproduce the issue.
- Switching to Gemini 2.5 Flash improves output stability, possibly due to an increased reasoning or token budget.
- This suggests the issue may lie in schema serialization or in how the Java SDK encodes the request payload.
Would appreciate guidance on whether this is:

- A known limitation or bug in the Java Vertex AI client,
- A misalignment between the Java SDK’s schema format and backend expectations,
- Or a potential model-side behavior that needs handling guidance.
Additional context
We’re using a URL-based input for the PDF rather than passing raw file bytes, which might be influencing the behavior — not fully certain, but worth mentioning. Our current invocation looks like this (simplified):
```java
@Override
public ResumeExtraction extractResumeData(String resumeUrl) {
    if (resumeUrl == null || resumeUrl.trim().isEmpty()) {
        throw new IllegalArgumentException("Resume URL cannot be null or empty");
    }
    // simplified: prompt construction and downstream logic omitted
    return invokeModelAndDeserialize(
        ContentMaker.fromMultiModalData(
            PartMaker.fromMimeTypeAndData("application/pdf", resumeUrl),
            promptText
        ),
        ResumeExtraction.class
    );
}
```
In our Python setup, we instead read the PDF as bytes and send it directly to the model. Not 100% sure if this URI-based handling difference in the Java SDK could be affecting how the schema or payload is processed, but flagging it in case it’s relevant.
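If the URI-based input does turn out to matter, the Java path can mirror the Python setup by sending raw bytes instead. A hedged sketch: the stdlib read is straightforward, and `PartMaker.fromMimeTypeAndData` is assumed here to also accept a `byte[]` payload (worth verifying against the SDK reference before relying on it):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PdfBytes {
    /** Reads a local PDF into memory, mirroring the Python setup's raw-byte input. */
    static byte[] readPdf(Path pdfPath) throws IOException {
        byte[] data = Files.readAllBytes(pdfPath);
        // Then pass the bytes instead of the URL string (assumption: the byte[]
        // overload behaves like the URI form apart from payload encoding):
        // PartMaker.fromMimeTypeAndData("application/pdf", data)
        return data;
    }
}
```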
Hi @Programmer-RD-AI, thanks for reporting this issue. It does look like a service-level limitation, given that the newer models are more stable. I'll leave this to @jaycee-li for a final say.
Please note that you can ensure fast response times if you have a support plan in our Support Hub.
Hi @Programmer-RD-AI, this package doesn't officially support Gemini 2.0+ models and is now deprecated. If you're developing a new project, please use the Google GenAI Java SDK instead. The new SDK contains all the latest Gemini models and features.
Examples for response schema: `GenerateContentWithResponseSchema.java`, `GenerateContentWithResponseJsonSchema.java`
Please let me know if you still get the error in the GenAI Java SDK.
And FYI, the Python Vertex AI SDK is deprecated as well; the new one is https://github.com/googleapis/python-genai