bug: Document Content Dropped by Some Providers When Using UserContent::Document
- [X] I have looked for existing issues (including closed) about this
Bug Report
There's an issue with how document content is handled across different LLM providers. Currently, when using UserContent::Document for document content, the behavior varies by provider:
- Ollama: Completely drops document content from messages sent to the model
- Anthropic: Always treats the Document as a PDF, so non-PDF content errors out
- OpenAI: Correctly converts it to text content and works as expected
This bug was introduced in commit 2d45ad52f61dc8e21ab1e6fc08fe0096f3167ebf
Reproduction
- Run the rag_ollama.rs example
- Observe that document content is missing from the chat payload sent to the model
Expected behavior
The LLM should receive the document content as context so it can properly answer questions based on that context. Currently, with Ollama, this doesn't happen - the documents are processed but never included in the payload.
Proposed fix
Ensure all providers properly handle UserContent::Document by updating their TryFrom implementations to check for document content and convert it appropriately based on the DocumentMediaType.
For example, in Ollama's implementation, document content is currently dropped here:
```rust
match uc {
    crate::message::UserContent::Text(t) => texts.push(t.text),
    crate::message::UserContent::Image(img) => images.push(img.data),
    _ => {} // Document content is dropped here
}
```
And Anthropic always processes the Document as a PDF regardless of the actual content type:
```rust
message::UserContent::Document(message::Document { data, format, .. }) => {
    let source = DocumentSource {
        data,
        media_type: DocumentFormat::PDF,
        r#type: match format {
            Some(format) => format.try_into()?,
            None => SourceType::BASE64,
        },
    };
    Ok(Content::Document { source })
}
```
Temporary workaround: Our current fix is to use UserContent::Text instead of UserContent::Document in the normalized_documents() method, but the right approach would be for all providers to handle document content correctly.
Hi @hollygrimm, thanks for opening this PR!
re: ollama - do you have any examples of the text you're sending through (or the types of documents you're sending)? The reason we disabled documents for Ollama is that its chat format doesn't support them as a native content type, so some pre-processing would need to be done.
re: Anthropic - yes, this looks totally wrong. Will be getting on top of this.
Amendment: it would seem that Anthropic's API only supports four image types (JPEG, PNG, GIF and WebP), plus PDF for documents. I'll make some changes to enforce this so that it's obvious which types are and aren't supported by Claude for now. WRT Ollama, my previous point still stands.
The ollama provider also drops ToolResults. This is pretty obvious if you look at the code, but wasn't mentioned in the issue description.
See #477 - hoping to get this merged before next release as it's a pretty big usability bug that was missed
I'm using rig-core with Ollama and spent a good long while trying to figure out why the rag_ollama.rs code, copied verbatim, wasn't working. Debugging shows the embeddings being processed, but the actual prompt just doesn't contain the RAG text from the documents.
I'm still newish to Rust and LLMs. @hollygrimm, can you share a code snippet for how I might get the rag_ollama.rs example to work, apart from injecting the content into my prompt manually? It sounds like you might have found a workaround.