
Is it possible to provide a PDF in a prompt?

Open barapa opened this issue 1 year ago • 1 comments

I have tried supplying a PDF via the UserImageMessage type, but that fails with the following error from OpenAI:

openai.BadRequestError: Error code: 400 - {'error': {'message': "You uploaded an unsupported image. Please make sure your image is below 20 MB in size and is of one the following formats: ['png', 'jpeg', 'gif', 'webp'].", 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_image_format'}}

Is there a way to do this with magentic?

barapa avatar Aug 19 '24 14:08 barapa

@barapa I think this is out of scope for magentic. You will have to use https://github.com/Belval/pdf2image or another library to convert the PDF to one of those supported formats.
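
A minimal sketch of that conversion, assuming pdf2image (and the poppler binaries it depends on) is installed. The helper names `sniff_format`, `pdf_to_png_pages`, and `to_image_bytes` are hypothetical and not part of magentic or pdf2image. The idea is to check the file's leading bytes, pass already-supported images through unchanged, and render each PDF page to PNG bytes.

```python
from __future__ import annotations

import io

# Leading "magic" bytes for three of the formats named in the OpenAI error message.
_IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_format(data: bytes) -> str | None:
    """Return 'png', 'jpeg', 'gif', 'webp', 'pdf', or None if unrecognised."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    # WebP is a RIFF container with 'WEBP' at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    for signature, name in _IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def pdf_to_png_pages(pdf_bytes: bytes, dpi: int = 150) -> list[bytes]:
    """Render each PDF page to PNG bytes using pdf2image (requires poppler)."""
    # Imported lazily so sniff_format works without pdf2image installed.
    from pdf2image import convert_from_bytes

    pages = []
    for page in convert_from_bytes(pdf_bytes, dpi=dpi):
        buffer = io.BytesIO()
        page.save(buffer, format="PNG")  # page is a PIL Image
        pages.append(buffer.getvalue())
    return pages


def to_image_bytes(data: bytes) -> list[bytes]:
    """Return a list of image byte strings, converting PDFs page by page."""
    kind = sniff_format(data)
    if kind == "pdf":
        return pdf_to_png_pages(data)
    if kind in {"png", "jpeg", "gif", "webp"}:
        return [data]
    raise ValueError("unsupported or unrecognised file format")
```

Each element of the returned list could then be supplied as the image bytes for a UserImageMessage, one page per message. The 150 dpi default is an arbitrary choice; a higher value gives sharper text at the cost of larger images.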

jackmpcollins avatar Aug 19 '24 19:08 jackmpcollins

Anthropic now allows sending PDF document bytes, and magentic v0.35.0 (https://github.com/jackmpcollins/magentic/releases/tag/v0.35.0) supports this using the DocumentBytes object. See https://magentic.dev/vision/#documentbytes


DocumentBytes is used to provide a document as bytes to the LLM. This is currently only supported by some Anthropic models.

from pathlib import Path

from magentic import chatprompt, DocumentBytes, Placeholder, UserMessage
from magentic.chat_model.anthropic_chat_model import AnthropicChatModel


@chatprompt(
    UserMessage(
        [
            "Repeat the contents of this document.",
            Placeholder(DocumentBytes, "document_bytes"),
        ]
    ),
    model=AnthropicChatModel("claude-3-5-sonnet-20241022"),
)
def read_document(document_bytes: bytes) -> str: ...


document_bytes = Path("...").read_bytes()
read_document(document_bytes)
# 'This is a test PDF.'

jackmpcollins avatar Jan 06 '25 03:01 jackmpcollins

Closing this issue, as doing the PDF -> image conversion is out of scope for magentic. Hopefully OpenAI adds PDF support too. This would likely be mentioned in their Vision docs here: https://platform.openai.com/docs/guides/vision

jackmpcollins avatar Jan 06 '25 03:01 jackmpcollins