pdfium-lib

[feature] New way to load pages

Open paulocoutinhox opened this issue 1 year ago • 9 comments

Is your feature request related to a problem? Please describe.

A new way to load pages.

We need to load pages without loading the full PDF file.

Describe the solution you'd like

We need to use FPDF.LoadCustomDocument = Module.cwrap('FPDF_LoadCustomDocument', 'number', ['number', 'string']); to load the document.
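
A minimal sketch of how the wrapped function might be called, assuming an initialized Emscripten `Module` that exports `FPDF_LoadCustomDocument` and a pointer to an already populated `FPDF_FILEACCESS` struct in WASM memory. The names `openCustomDocument` and `fileAccessPtr` are hypothetical and used only for illustration; they are not part of pdfium-lib.

```javascript
// Hypothetical helper: opens a document through FPDF_LoadCustomDocument.
// `Module` is assumed to be an initialized Emscripten module exposing cwrap.
// `fileAccessPtr` is assumed to point at a populated FPDF_FILEACCESS struct.
function openCustomDocument(Module, fileAccessPtr, password = '') {
  const loadCustomDocument = Module.cwrap(
    'FPDF_LoadCustomDocument',
    'number',
    ['number', 'string']
  );

  // PDFium returns a null handle (0) when the document cannot be opened.
  const documentHandle = loadCustomDocument(fileAccessPtr, password);
  if (documentHandle === 0) {
    throw new Error('FPDF_LoadCustomDocument returned a null document handle');
  }
  return documentHandle;
}
```

Note that the memory behind `fileAccessPtr` has to stay valid until the document is closed, because PDFium reads blocks from it lazily.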

A pull request is here, but it is not working yet: https://github.com/paulocoutinhox/pdfium-lib/pull/129

Describe alternatives you've considered

n/a

Additional context

When we load the full PDF file into memory, it sometimes crashes, for example on mobile devices.

paulocoutinhox avatar Sep 10 '24 19:09 paulocoutinhox

@CetinSert

Can you help me here?

paulocoutinhox avatar Sep 10 '24 19:09 paulocoutinhox

This will greatly improve performance on mobile!

lowkahonn avatar Sep 12 '24 06:09 lowkahonn

Any updates, @paulocoutinhox @CetinSert? If you have no time to implement this but can describe the implementation, maybe we can give it a try and implement it ourselves.

pedrovgs avatar Oct 10 '24 14:10 pedrovgs

No. I have started, but it still needs some fixes to work. About 90% of the job is already done.

paulocoutinhox avatar Oct 10 '24 14:10 paulocoutinhox

Do you need to marshal FPDF_FILEACCESS from JS to WASM for it to work? Something like the code below (generated by AI):

function createFPDF_FILEACCESS(fileLength, m_GetBlockPtr, m_Param) {
  // On wasm32, FPDF_FILEACCESS is three 4-byte fields:
  // unsigned long m_FileLen, function pointer m_GetBlock, void* m_Param.
  const structSize = 12;
  const structPtr = wasmModule.instance.exports.malloc(structSize);
  const dataView = new DataView(
    wasmModule.instance.exports.memory.buffer,
    structPtr,
    structSize
  );

  // WASM memory is little-endian, hence the `true` argument.
  dataView.setUint32(0, fileLength, true);     // m_FileLen
  dataView.setUint32(4, m_GetBlockPtr, true);  // m_GetBlock (function table index, not a JS function)
  dataView.setUint32(8, m_Param, true);        // m_Param

  return structPtr;
}

// Usage (fileLength and m_GetBlockPtr must be provided by the caller):
const structPtr = createFPDF_FILEACCESS(fileLength, m_GetBlockPtr, 0);
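
The struct above only carries a pointer to the block callback, so the callback itself still has to exist. Below is a rough sketch of what its body might look like, assuming PDFium's documented `m_GetBlock` shape of `(param, position, pBuf, size)` returning non-zero on success. `makeGetBlock`, `heap` and `readRange` are hypothetical names: `heap` stands in for the module's byte view of WASM memory, and `readRange` is whatever synchronous byte source the application provides. In an Emscripten build, a function like this would typically be registered with `addFunction` to obtain the table index stored in `m_GetBlock`; that depends on the build exporting it and is an assumption here.

```javascript
// Hypothetical factory: builds a function with the shape of m_GetBlock,
// (param, position, pBuf, size) -> 1 on success, 0 on failure.
// `heap` is a Uint8Array view of WASM memory (for example Module.HEAPU8).
// `readRange(position, size)` must synchronously return a Uint8Array or null.
function makeGetBlock(heap, fileLength, readRange) {
  return function getBlock(param, position, pBuf, size) {
    // Reject reads that run past the end of the file.
    if (position + size > fileLength) {
      return 0;
    }
    const bytes = readRange(position, size);
    if (!bytes || bytes.length !== size) {
      return 0;
    }
    // Copy the requested bytes into the buffer PDFium passed in at pBuf.
    heap.set(bytes, pBuf);
    return 1;
  };
}

// Usage with an in-memory file standing in for a real byte source:
const exampleFile = Uint8Array.from([10, 20, 30, 40, 50, 60]);
const exampleHeap = new Uint8Array(16);
const exampleGetBlock = makeGetBlock(
  exampleHeap,
  exampleFile.length,
  (position, size) => exampleFile.subarray(position, position + size)
);
exampleGetBlock(0, 2, 8, 3); // copies bytes 30, 40, 50 to exampleHeap[8..10]
```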

glutio avatar Nov 01 '24 00:11 glutio

Hi @paulocoutinhox, this is exciting! Thank you for working on this. Do you think it is possible to use FPDF_LoadCustomDocument with a fetch stream? I mean, can we quickly open the first page without waiting for the whole PDF file to download, and load the rest later? I have seen this feature described here:

  • https://www.nutrient.io/guides/web/performance/streaming/
  • https://apryse.com/blog/mobile/streaming-a-pdf-from-the-web

In addition, it looks like we need linearized PDFs and proper backend support to make it work.
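
To make the backend requirement concrete: partial loading generally relies on HTTP range requests, so the server has to honor the `Range` header. The sketch below shows one way the client side could be organized, with reads served from fixed-size cached chunks. `rangeHeader`, `ChunkCache` and `fetchChunk` are hypothetical names. Because PDFium's block callback is synchronous, `fetchChunk` is assumed to be synchronous too (for example a synchronous XMLHttpRequest inside a Web Worker); an async `fetch` would need a different integration, which is not covered here.

```javascript
// Formats an HTTP Range header value for `size` bytes starting at `position`.
// Range end offsets are inclusive.
function rangeHeader(position, size) {
  return `bytes=${position}-${position + size - 1}`;
}

// Hypothetical cache: serves arbitrary reads from fixed-size chunks.
// `fetchChunk(start, endInclusive)` is an assumed synchronous loader that
// returns a Uint8Array holding exactly that byte range.
class ChunkCache {
  constructor(fileLength, chunkSize, fetchChunk) {
    this.fileLength = fileLength;
    this.chunkSize = chunkSize;
    this.fetchChunk = fetchChunk;
    this.chunks = new Map();
  }

  // Returns the chunk at `index`, loading it on first use.
  getChunk(index) {
    if (!this.chunks.has(index)) {
      const start = index * this.chunkSize;
      const end = Math.min(start + this.chunkSize, this.fileLength) - 1;
      this.chunks.set(index, this.fetchChunk(start, end));
    }
    return this.chunks.get(index);
  }

  // Assembles `size` bytes starting at `position` from cached chunks.
  read(position, size) {
    if (position < 0 || position + size > this.fileLength) {
      throw new RangeError('read outside of file bounds');
    }
    const out = new Uint8Array(size);
    let written = 0;
    while (written < size) {
      const offset = position + written;
      const chunk = this.getChunk(Math.floor(offset / this.chunkSize));
      const within = offset % this.chunkSize;
      const count = Math.min(size - written, chunk.length - within);
      out.set(chunk.subarray(within, within + count), written);
      written += count;
    }
    return out;
  }
}
```

A `read` method like this could back the block callback that `FPDF_FILEACCESS` points to, so that only the chunks PDFium actually asks for are downloaded. How much this helps for the first page depends on the PDF being linearized.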

nguyenyou avatar Nov 02 '24 09:11 nguyenyou

Hello everyone. I saw the messages, but I can't focus right now because I'm finishing a freelance job. I'll get back to it in 1 or 2 weeks. I'll need to free up space on my machine, because I only have 10GB free. As soon as I find a solution in my tests, I'll let you know here.

paulocoutinhox avatar Nov 02 '24 22:11 paulocoutinhox

@paulocoutinhox Thanks for all your work on this!

We would really love for this function to work so that we can stream big PDF files, which currently take a very long time to load. Looking forward to it.

bobsingor avatar Jan 19 '25 13:01 bobsingor

Hi,

Now that I have updated everything to the latest version, it is easier to work on this.

I will check it again in the next few days.

Thanks.

paulocoutinhox avatar Jun 17 '25 03:06 paulocoutinhox