supermemory Improve PDF handling: increase size limit and extract images from PDFs

@MaheshtheDev I've been testing supermemory and found two issues with PDF uploads:

Issue 1: 10MB limit is too small

Most research papers with images hit this limit
Users have to split files which breaks context
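
As a rough illustration of what a configurable limit could look like (a sketch only: `MAX_PDF_BYTES`, `check_pdf_size`, and the 50 MB value are hypothetical and not taken from the supermemory codebase):

```python
# Hypothetical default; the real threshold would be decided by the maintainers.
MAX_PDF_BYTES = 50 * 1024 * 1024


def check_pdf_size(size_bytes: int, limit_bytes: int = MAX_PDF_BYTES) -> None:
    """Raise ValueError if an uploaded PDF exceeds the configured limit."""
    if size_bytes > limit_bytes:
        limit_mb = limit_bytes / (1024 * 1024)
        actual_mb = size_bytes / (1024 * 1024)
        raise ValueError(
            f"PDF is {actual_mb:.1f} MB, which exceeds the {limit_mb:.0f} MB limit"
        )
```

Keeping the limit as a parameter rather than a hard-coded constant would let it be raised without touching the validation logic.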

Issue 2: Images in PDFs are not processed

Charts, diagrams, and figures are ignored
Only text gets extracted
Important visual information is lost
Can't search for content that's in images

Example: I tried uploading a research paper with images - the images were completely ignored.

What I want to fix

I want to work on both of these issues:

Increase PDF limit
Extract and process images from PDFs

Why I'm the right person

I've identified exactly where the issues are
I understand the codebase structure
I have a clear solution in mind
Ready to write tests and docs

I want to take ownership of this issue and submit a PR.

@MaheshtheDev Let me know if this sounds good!

Nov 03 '25 17:11 karamvirsingh1998

ENG-365

Nov 03 '25 17:11 linear[bot]

@karamvirsingh1998 how are you planning on extracting and processing the images in the PDF?

Nov 03 '25 18:11 MaheshtheDev

Hey @MaheshtheDev, I can work on Issue 1. If there's a threshold size you have in mind, let me know. Thanks

Nov 04 '25 13:11 AntonVishal

Sure @MaheshtheDev. Just to outline the high-level design: the current implementation relies purely on OCR-based extraction, which, while functional, will fail to capture layout semantics and visual hierarchy. The improved architecture should follow a multi-stage hierarchical pipeline:

Layout Detection & Structural Parsing: Use layout analysis to identify key regions
Visual Context Encoding via Visual LLMs: Each detected region is passed through a Visual Language Model
Text Semantic Chunking: The extracted textual regions are then semantically grouped
Contextual Reconstruction Layer: Finally, merge both visual and textual embeddings to form a context-aware document representation

It will retain page-level context and hierarchy.
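
A minimal sketch of how the four stages could be wired together, assuming stub implementations throughout. Every name here (`Region`, `Chunk`, `detect_layout`, `describe_figures`, `chunk_text`, `reconstruct`, `process_document`) is hypothetical. Layout detection is replaced by pre-labelled regions, the visual model by a caller-supplied function, semantic grouping by a simple per-page size cap, and the final merge works on text chunks rather than embeddings.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Region:
    page: int
    kind: str     # "text" or "figure"
    content: str  # raw text, or an identifier for an image crop


@dataclass
class Chunk:
    page: int
    kind: str
    text: str


def detect_layout(pages: List[List[Region]]) -> List[Region]:
    """Stage 1 (stub): flatten pre-labelled regions in reading order."""
    return [region for page in pages for region in page]


def describe_figures(
    regions: List[Region], describe: Callable[[Region], str]
) -> List[Chunk]:
    """Stage 2: turn each figure region into text via a caller-supplied
    describer (a visual language model in a real system)."""
    return [
        Chunk(r.page, "figure", describe(r)) for r in regions if r.kind == "figure"
    ]


def chunk_text(regions: List[Region], max_chars: int = 200) -> List[Chunk]:
    """Stage 3 (stub): group text regions on the same page up to max_chars.
    A real system would group by meaning, not by length."""
    chunks: List[Chunk] = []
    for r in regions:
        if r.kind != "text":
            continue
        last = chunks[-1] if chunks else None
        if last and last.page == r.page and len(last.text) + 1 + len(r.content) <= max_chars:
            last.text += " " + r.content
        else:
            chunks.append(Chunk(r.page, "text", r.content))
    return chunks


def reconstruct(text_chunks: List[Chunk], figure_chunks: List[Chunk]) -> List[Chunk]:
    """Stage 4: merge both streams, ordered by page (stable sort), so that
    page-level context is kept."""
    return sorted(text_chunks + figure_chunks, key=lambda c: c.page)


def process_document(
    pages: List[List[Region]], describe: Callable[[Region], str]
) -> List[Chunk]:
    regions = detect_layout(pages)
    return reconstruct(chunk_text(regions), describe_figures(regions, describe))
```

For example, calling `process_document(pages, lambda r: f"description of {r.content}")` on a document whose first page holds two text regions around one figure yields one merged text chunk and one figure chunk for that page, followed by the chunks of later pages.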

Nov 04 '25 15:11 karamvirsingh1998

Doing all of this on the client side or consumer app side doesn't seem ideal to me. Most likely this has to work within the supermemory API itself. I'll talk to the team and get back to you soon.

Nov 05 '25 02:11 MaheshtheDev

Yes, not everything will be on the client side. Let me know the next steps, @MaheshtheDev. Thanks

Nov 05 '25 16:11 karamvirsingh1998

Hey @MaheshtheDev, I’ve explored similar PDF image extraction problems while experimenting with document processing pipelines. I’d love to take up this issue and work on enabling image processing for PDFs. Could you please assign this to me?

Nov 12 '25 14:11 ParagGhatage

Thanks for exploring. However, this issue deals with supermemory API changes for image processing within the PDF.

Nov 14 '25 20:11 MaheshtheDev

Thanks for the clarification, @MaheshtheDev. I’m comfortable working on the supermemory API side as well and would still like to take this issue. Let me know the constraints or direction, and I’ll proceed.

Nov 14 '25 20:11 ParagGhatage

Hey @MaheshtheDev, I would love to work on image processing within the PDF. Can I propose my solution?

Dec 03 '25 20:12 Elon7069