[vertexai] Inconsistent behavior when using responseSchema with Gemini Flash models, malformed or repetitive JSON output unless schema is removed
We’ve encountered a reproducible issue when using the Java Vertex AI client (com.google.cloud:google-cloud-vertexai:1.18.0) with Gemini Flash models for structured text generation.
When a responseSchema is attached to the GenerationConfig, the model intermittently produces malformed or repetitive JSON outputs, often looping text fragments or inserting stray newline escape sequences until the max output token limit is reached.
Removing the schema entirely eliminates the issue, and the same prompt setup works correctly in the Python Vertex AI SDK, suggesting this may be SDK-specific or related to how the Java client serializes the schema.
Environment details
| Key | Value |
|---|---|
| API | Vertex AI Generative AI (Java) |
| Library | com.google.cloud:google-cloud-vertexai:1.18.0 |
| Java version | 21 |
| OS | Windows 11 |
| Models tested | Gemini 2.0 Flash, Gemini 2.5 Flash Lite, Gemini 2.5 Flash |
| Behavior | Issue occurs with 2.0 Flash and 2.5 Flash Lite; 2.5 Flash mitigates it partially |
Steps to reproduce

1. Configure a `GenerativeModel` with deterministic decoding:
   - `temperature = 0.0f`, `topP = 0.0f`, `topK = 1`, `candidateCount = 1`, `seed = 42`
   - `responseMimeType = "application/json"`
2. Attach a complex `responseSchema` describing nested arrays and objects (see example below).
3. Send a document-extraction prompt requesting structured JSON per the schema.
4. Observe that:
   - The model often ignores the schema’s structure.
   - Output becomes recursive or repetitive (`"Company Company Company..."`).
   - Output terminates abruptly at the token limit with unclosed quotes or brackets.
5. Remove the schema (keep all other settings identical).
6. Observe that the output is now clean and well-formed JSON.
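The repetitive failure mode from step 4 can be caught programmatically before any parse attempt. As a sketch (an illustrative stdlib-only helper, not part of either SDK), a check that flags a whitespace-delimited token repeated more than a few times in a row:

```java
public class RepetitionCheck {
    /**
     * Returns true if any whitespace-separated token appears more than
     * maxRun times consecutively -- the "Company Company Company..."
     * failure mode described above.
     */
    static boolean looksRepetitive(String text, int maxRun) {
        String[] tokens = text.trim().split("\\s+");
        int run = 1;
        for (int i = 1; i < tokens.length; i++) {
            run = tokens[i].equals(tokens[i - 1]) ? run + 1 : 1;
            if (run > maxRun) {
                return true;
            }
        }
        return false;
    }
}
```

A guard like this can decide whether a response should be retried or logged, independent of which SDK produced it.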
Code snippet (simplified)
```java
GenerationConfig cfg = GenerationConfig.newBuilder()
    .setTemperature(0.0f)
    .setTopP(0.0f)
    .setTopK(1)
    .setCandidateCount(1)
    .setSeed(42)
    .setResponseMimeType("application/json")
    .setResponseSchema(ResponseSchemaFactory.getExtractionSchema()) // When set, issue occurs
    .build();

GenerativeModel model = baseModel
    .withSystemInstruction(ContentMaker.fromString(systemPrompt))
    .withGenerationConfig(cfg);

GenerateContentResponse response = model.generateContent(promptText);
String jsonOutput = response.getText(); // often malformed
```
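One possible interim mitigation is to attempt the schema-constrained call first and fall back to a schema-free config when the result fails validation. A minimal stdlib-only sketch of that control flow, with `Supplier`s standing in for the two `model.generateContent` configurations (names illustrative, not SDK API):

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

public class SchemaFallback {
    /**
     * Returns the schema-constrained result if it passes validation,
     * otherwise falls back to the schema-free call.
     */
    static String generateWithFallback(Supplier<String> withSchema,
                                       Supplier<String> withoutSchema,
                                       Predicate<String> isValid) {
        String first = withSchema.get();
        return isValid.test(first) ? first : withoutSchema.get();
    }
}
```

The validator slot is where a JSON parse attempt (e.g. Jackson) or a cheaper structural check would go.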
Example schema shape:
```java
Schema workExperience = Schema.newBuilder()
    .setType(Type.OBJECT)
    .putProperties("company", Schema.newBuilder().setType(Type.STRING).build())
    .putProperties("tenure", Schema.newBuilder().setType(Type.STRING).build())
    .putProperties("skills", Schema.newBuilder()
        .setType(Type.ARRAY)
        .setItems(Schema.newBuilder().setType(Type.STRING).build())
        .build())
    .build();
```
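For comparing against the Python SDK request, this is approximately what the schema above should serialize to in the request body (uppercase type names per the Vertex AI REST `Schema` type; an assumed rendering for debugging the serialization hypothesis, not captured wire traffic):

```json
{
  "type": "OBJECT",
  "properties": {
    "company": { "type": "STRING" },
    "tenure": { "type": "STRING" },
    "skills": {
      "type": "ARRAY",
      "items": { "type": "STRING" }
    }
  }
}
```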
Observed output (excerpt, simulated)
```json
{
  "experience": [
    {
      "company": "TechCorp TechCorp TechCorp TechCorp TechCorp ...",
      "tenure": "2 yrs",
      "skills": ["Java", "Spring Boot"]
    }
  ],
  "summary": "\n\n {\n.\n.\n\\n\\n\\n\\n\\n\\n\\n\\n\n"
}
```
Occasionally, the output fails JSON parsing due to missing closing quotes or brackets:
```
com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input:
 was expecting closing quote for a string value
 at [Source: (String)"{ "experience": [ { "company": "ABC
   "tenure": "3 yrs"...]; line: 1, column: 4211]
```
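Until the root cause is identified, this truncation can be detected cheaply before handing the string to Jackson. A stdlib-only sketch that checks for an unterminated string or unbalanced brackets (class and method names are illustrative, not part of either SDK):

```java
public class TruncationCheck {
    /** Returns true if the JSON text ends inside a string or with unclosed brackets. */
    static boolean looksTruncated(String json) {
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        for (char c : json.toCharArray()) {
            if (escaped) { escaped = false; continue; }
            if (inString) {
                if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
            } else {
                if (c == '"') inString = true;
                else if (c == '{' || c == '[') depth++;
                else if (c == '}' || c == ']') depth--;
            }
        }
        return inString || depth != 0;
    }
}
```

This is deliberately not a full JSON validator; it only flags the specific truncated-at-token-limit symptom shown in the exception above.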
Expected behavior
When `responseSchema` is provided, the model should consistently honor the schema and produce syntactically valid JSON following the defined structure.
Additional context
- Removing the schema entirely fixes the problem.
- Using identical prompts and schema definitions in the Python Vertex AI SDK does not reproduce the issue.
- Switching to Gemini 2.5 Flash improves output stability, possibly due to an increased reasoning or token budget.
- This suggests the issue may lie in schema serialization or in how the Java SDK encodes the request payload.
Would appreciate guidance on whether this is:

- A known limitation or bug in the Java Vertex AI client,
- A misalignment between the Java SDK’s schema format and backend expectations,
- Or a potential model-side behavior that needs handling guidance.
Additional context
We’re using a URL-based input for the PDF rather than passing raw file bytes, which might be influencing the behavior — not fully certain, but worth mentioning. Our current invocation looks like this (simplified):
```java
@Override
public ResumeExtraction extractResumeData(String resumeUrl) {
    if (resumeUrl == null || resumeUrl.trim().isEmpty()) {
        throw new IllegalArgumentException("Resume URL cannot be null or empty");
    }
    // simplified: prompt construction and downstream logic omitted
    return invokeModelAndDeserialize(
        ContentMaker.fromMultiModalData(
            PartMaker.fromMimeTypeAndData("application/pdf", resumeUrl),
            promptText
        ),
        ResumeExtraction.class
    );
}
```
In our Python setup, we instead read the PDF as bytes and send it directly to the model. Not 100% sure if this URI-based handling difference in the Java SDK could be affecting how the schema or payload is processed, but flagging it in case it’s relevant.
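If the URI-based input does turn out to matter, the Java path can mirror the Python setup by sending raw bytes instead. A hedged sketch: the stdlib read is straightforward, and `PartMaker.fromMimeTypeAndData` is assumed here to also accept a `byte[]` payload (worth verifying against the SDK reference before relying on it):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PdfBytes {
    /** Reads a local PDF into memory, mirroring the Python setup's raw-byte input. */
    static byte[] readPdf(Path pdfPath) throws IOException {
        byte[] data = Files.readAllBytes(pdfPath);
        // Then pass the bytes instead of the URL string (assumption: the byte[]
        // overload behaves like the URI form apart from payload encoding):
        // PartMaker.fromMimeTypeAndData("application/pdf", data)
        return data;
    }
}
```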
Hi @Programmer-RD-AI, thanks for reporting this issue. It does look like a service-level limitation, given that the newer models are more stable. I'll leave this to @jaycee-li for a final say.
Please note that you can ensure fast response times if you have a support plan in our Support Hub.
Hi @Programmer-RD-AI, this package doesn't officially support Gemini 2.0+ models and is now deprecated. If you're developing a new project, please use the Google GenAI Java SDK instead. The new SDK contains all the latest Gemini models and features.
Examples for response schema: `GenerateContentWithResponseSchema.java`, `GenerateContentWithResponseJsonSchema.java`
Please let me know if you still get the error in the GenAI Java SDK.
And FYI, the Python Vertex AI SDK is deprecated as well; the new one is https://github.com/googleapis/python-genai