Is it possible to provide a PDF in a prompt?
I have tried supplying a PDF via the UserImageMessage type, but that fails with the following error from openai:
openai.BadRequestError: Error code: 400 - {'error': {'message': "You uploaded an unsupported image. Please make sure your image is below 20 MB in size and is of one the following formats: ['png', 'jpeg', 'gif', 'webp'].", 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_image_format'}}
Is there a way to do this with magentic?
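For context, the 400 above comes from the API rejecting anything that is not one of the four listed image formats. A minimal sketch of a local pre-check, using only the standard library, is shown below; the names `sniff_format` and `is_supported_image` are hypothetical helpers for illustration and are not part of magentic or openai.

```python
from typing import Optional

# Formats named in the openai error message above.
SUPPORTED_IMAGE_FORMATS = {"png", "jpeg", "gif", "webp"}


def sniff_format(data: bytes) -> Optional[str]:
    """Guess a file format from its leading magic bytes (hypothetical helper)."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    if data.startswith(b"%PDF-"):
        return "pdf"
    return None


def is_supported_image(data: bytes) -> bool:
    """Return True if the bytes look like one of the accepted image formats."""
    return sniff_format(data) in SUPPORTED_IMAGE_FORMATS
```

Running a check like this before building the message would surface the problem locally instead of as a BadRequestError from the API.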
@barapa I think this is out of scope for magentic. You will have to use https://github.com/Belval/pdf2image or another library to convert the PDF to one of those supported formats.
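A rough sketch of that conversion step, assuming pdf2image and its poppler dependency are installed. The function name `pdf_to_png_pages` and the `dpi` default are illustrative choices, not anything magentic provides.

```python
from io import BytesIO
from typing import List


def pdf_to_png_pages(pdf_bytes: bytes, dpi: int = 150) -> List[bytes]:
    """Render each page of a PDF to PNG bytes (hypothetical helper)."""
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("input does not look like a PDF")

    # Imported lazily so the module loads even when pdf2image is not installed.
    from pdf2image import convert_from_bytes

    # convert_from_bytes returns one PIL image per page.
    pages = convert_from_bytes(pdf_bytes, dpi=dpi)

    png_pages = []
    for page in pages:
        buffer = BytesIO()
        page.save(buffer, format="PNG")
        png_pages.append(buffer.getvalue())
    return png_pages
```

Each element of the returned list is PNG data, which is one of the formats the error message accepts, so it should be usable wherever image bytes are expected (for example with UserImageMessage).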
Anthropic now allows sending PDF document bytes, and magentic v0.35.0 (https://github.com/jackmpcollins/magentic/releases/tag/v0.35.0) supports this using the DocumentBytes object. See https://magentic.dev/vision/#documentbytes
DocumentBytes is used to provide a document as bytes to the LLM. This is currently only supported by some Anthropic models.
from pathlib import Path

from magentic import chatprompt, DocumentBytes, Placeholder, UserMessage
from magentic.chat_model.anthropic_chat_model import AnthropicChatModel


@chatprompt(
    UserMessage(
        [
            "Repeat the contents of this document.",
            Placeholder(DocumentBytes, "document_bytes"),
        ]
    ),
    model=AnthropicChatModel("claude-3-5-sonnet-20241022"),
)
def read_document(document_bytes: bytes) -> str: ...


document_bytes = Path("...").read_bytes()
read_document(document_bytes)
# 'This is a test PDF.'
Closing this issue, as doing the PDF -> image conversion is out of scope for magentic. Hopefully OpenAI adds PDF support too. If it does, this would likely be mentioned in their Vision docs here: https://platform.openai.com/docs/guides/vision