pdfium-lib

Bitmaps larger than 495 × 495 × 4 leak in WASM

Open CetinSert opened this issue 4 years ago • 51 comments

Describe the bug

FPDF.Bitmap_Destroy() leaks above a certain size in multiple browsers.

To Reproduce

Steps to reproduce the behavior:

  1. Go to https://pdfviewer.github.io/

  2. Open developer tools

  3. Add 2 live expressions to watch: wasmMemory and wasmMemory.buffer.byteLength (only in Chrome-like browsers) image

  4. You can also open the browser task manager with: Shift + Esc, locate the PDF Viewer tab there and watch that

  5. Evaluate the following one block at a time in the developer console (reload tab in between each block for test hygiene)


(_PDFium_Init() is repeated for your copy+paste convenience and repeating/not-repeating it does not change the behavior.)

1. s = 1000; _free(_malloc(s * s * 4)) – evaluate in console ✔️ (no leak)

wasmMemory
wasmMemory.buffer.byteLength
_PDFium_Init(); var s = 1000; for (let i = 0; i < 100; i++) _free(_malloc(s * s * 4)); // ✔️
wasmMemory.buffer.byteLength
wasmMemory

2. s = 100; Destroy(CreateEx(s, s, 4)) ✔️

_PDFium_Init(); var s = 100; for (let i = 0; i < 10; i++) FPDF.Bitmap_Destroy(FPDF.Bitmap_CreateEx(s, s, 4)); // ✔️

3. s = 495 ✔️

_PDFium_Init(); var s = 495; for (let i = 0; i < 10; i++) FPDF.Bitmap_Destroy(FPDF.Bitmap_CreateEx(s, s, 4)); // ✔️

4. s >= 496

_PDFium_Init(); var s = 496; for (let i = 0; i < 10; i++) FPDF.Bitmap_Destroy(FPDF.Bitmap_CreateEx(s, s, 4)); // ❌
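
The blocks above all follow one pattern: read wasmMemory.buffer.byteLength, run an allocate/free loop, read it again. A minimal sketch of that pattern as a helper, assuming nothing beyond a callback that reports the current heap size. The name measureGrowth and the fake heap used to exercise it are hypothetical and not part of pdfium-lib.

```javascript
// Hypothetical helper: reports how many bytes the heap grew while `action` ran `runs` times.
// `readBytes` is any function returning the current heap size in bytes.
function measureGrowth(readBytes, action, runs) {
  const before = readBytes();
  for (let i = 0; i < runs; i++) action();
  return readBytes() - before;
}

// Stand-in heap so the sketch runs outside the browser: it grows only when asked to leak.
const fakeHeap = { byteLength: 16 * 1024 * 1024 };
const balanced = () => {};                                  // allocate + free, net zero
const leaky    = () => { fakeHeap.byteLength += 984064; };  // 496 * 496 * 4 bytes kept per call

console.log('balanced growth:', measureGrowth(() => fakeHeap.byteLength, balanced, 10));
console.log('leaky growth:   ', measureGrowth(() => fakeHeap.byteLength, leaky, 10));
```

In the developer console the same helper could be called as measureGrowth(() => wasmMemory.buffer.byteLength, () => FPDF.Bitmap_Destroy(FPDF.Bitmap_CreateEx(496, 496, 4)), 10). Because WebAssembly memory grows in 64 KiB pages, a growth of 0 only shows that the heap did not need to grow during the loop; it is weaker evidence than a full allocation trace.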

Bindings to confirm bitmap behavior.

_PDFium_Init();
FPDF.Bitmap_GetWidth  = Module.cwrap('FPDFBitmap_GetWidth',  'number', ['number']);
FPDF.Bitmap_GetHeight = Module.cwrap('FPDFBitmap_GetHeight', 'number', ['number']);
FPDF.Bitmap_GetBuffer = Module.cwrap('FPDFBitmap_GetBuffer', 'number', ['number']);
FPDF.Bitmap_Create    = Module.cwrap('FPDFBitmap_Create',    'number', ['number', 'number', 'number']); // ⚠️ same issue

Expected behavior

There should be no leaks at any size.

System (please complete the following information):

  • OS: Windows 10 Version 20H2 (OS Build 19042.1055)
  • Browsers: Chrome 91.0.4472.77; Firefox 89.0

CetinSert avatar Jun 13 '21 21:06 CetinSert

It is the same even if we tell FPDF to use an external buffer as in FPDF.Bitmap_CreateEx(width, height, format, buffer, ⋯).

CetinSert avatar Jun 13 '21 21:06 CetinSert

@paulo-coutinho Can you please test this in C++? Should the issue persist there, we can forward this issue to Google.

CetinSert avatar Jun 13 '21 22:06 CetinSert

References

  • https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4505/fpdfsdk/fpdf_view.cpp#884 calls into retain_ptr.h
  • https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4505/core/fxcrt/retain_ptr.h#62
  • https://pdfium.googlesource.com/pdfium/+/refs/heads/master/core/fxcrt/retain_ptr.h (no changes since 4505)

CetinSert avatar Jun 13 '21 22:06 CetinSert

Calling it with pauses between each call exhibits the same leak:

s = 495 ✔️

image

s = 496

image

CetinSert avatar Jun 13 '21 22:06 CetinSert

I have updated all to latest version: https://pdfviewer.github.io/

Can you check?

paulocoutinhox avatar Jun 14 '21 19:06 paulocoutinhox

The last version updated is here: https://github.com/paulo-coutinho/pdfium-lib/releases/tag/4542

paulocoutinhox avatar Jun 14 '21 22:06 paulocoutinhox

I have just checked. The issue persists and is suspected to come from upstream. image

CetinSert avatar Jun 14 '21 23:06 CetinSert

I tried contacting the project by email according to https://pdfium.googlesource.com/pdfium/ ➡️ https://groups.google.com/g/pdfium-bugs/about but was rejected as a non-Googler.

There is another person who did mention a leak with a similar code path https://groups.google.com/g/pdfium-bugs/c/KO4Id_s4w-c/m/wqE9u_EuDQAJ although the conversation eventually fails to converge on Bitmap_Destroy and Bitmap_Create* and looks at other calls.

I suspect such leaks might be acceptable in the context in which PDFium is used in a browser: a user opening a single PDF document in its own web-isolated tab to view it for a short period of time. Bringing the project to the web proper via WASM, though, opens up lots of multi-file, long-lived, highly interactive use cases that are intolerant of any leaks – in my real-world use cases the WASM code would stop working (and rendering images) after fewer than 100 pages of a scanned PDF file, complaining that wasmMemory cannot grow beyond 2GB.

CetinSert avatar Jun 14 '21 23:06 CetinSert

I see the following options remaining.

  1. check and address the issue outside WASM, in C/C++ (⭐⭐⭐⭐⭐)
    1. confirm it exists (@paulo-coutinho, can you help with this part? If not I will get started sometime within this week.)
    2. debug it there
    3. fix it
    4. create patch for at least your repository
  2. find a way to contact PDFium at google to let them know of the issue (⭐⭐ – must be done but will probably be ignored / take too long for an actual change)
  3. check Foxit's similar project sources; the F in FPDF stands for Foxit, the company Google licensed it from (⭐⭐ – lots of searching)
  4. render in 495 × 495 × 4 fragments and stitch the page together o__O (⭐/⭐⭐⭐⭐ – not sure if the APIs are really there but even if they are, can still have lots of issues with performance if single-threaded, visible seams at the boundaries, code complexity, etc.; if very carefully done and multi-threaded it might turn out to be faster than non-fragmented renders though (assuming seams are not an issue with their API))
  5. kill and restart the PDFium WASM runtime transparently on the first out of memory failure (⭐)
  6. use one web worker + PDFium WASM instance per loaded PDF file (this is how PDF.js works too – every loaded document, generates its own web worker / thread in the background until that document is closed); I was only using one web worker for all PDF files (⭐⭐ – there are likely other leaks (a few were even mentioned in their google group) and memory usage will eventually grow to huge sizes (pushing mobile devices out of reach for any real use); currently wasmMemory.buffer.byteLength starts at 16M which is acceptable even when repeated for each PDF file but I have not checked the other parts of the code path we follow to get page images in browsers; these might be memory-heavy and worse also leaky at other places)
  7. find a way to render pages avoiding the leaky Bitmap_Destroy and Bitmap_Create* functions (⭐⭐⭐⭐⭐ – but there might be none)
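
Option 4 is, at its core, a tiling computation. A rough sketch of only that part, assuming a maximum tile edge of 495 pixels (the largest square size observed to be safe so far). The name tileRects is hypothetical, and whether PDFium renders adjacent tiles without seams is not addressed here.

```javascript
// Hypothetical helper: splits a page of pageWidth x pageHeight pixels into tiles
// whose edges never exceed maxEdge. Tiles are listed row by row, left to right.
function tileRects(pageWidth, pageHeight, maxEdge = 495) {
  const rects = [];
  for (let y = 0; y < pageHeight; y += maxEdge) {
    for (let x = 0; x < pageWidth; x += maxEdge) {
      rects.push({
        x,
        y,
        w: Math.min(maxEdge, pageWidth - x),   // last column may be narrower
        h: Math.min(maxEdge, pageHeight - y),  // last row may be shorter
      });
    }
  }
  return rects;
}

// Example: a 1000 x 500 pixel page.
for (const r of tileRects(1000, 500)) console.log(r);
```

One possible way to use the rectangles is to render the full page size into each small bitmap with the page origin shifted by (-x, -y) through the start_x and start_y arguments of FPDF_RenderPageBitmap, then copy each tile into the final canvas at (x, y). That approach is an assumption to be verified, not something tested in this thread.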

5 and 6 combined: (⭐⭐⭐)
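
Options 5 and 6 combined could look roughly like the sketch below: one wrapper per loaded PDF file that owns a runtime and transparently replaces it after a failure. Everything here is hypothetical (RestartingRuntime, the factory, the fake runtime); in a real setup the factory would spawn a web worker and load pdfium.wasm, and the failure check would need to recognize out-of-memory errors specifically.

```javascript
// Hypothetical wrapper: owns one runtime instance and replaces it after a failed call.
class RestartingRuntime {
  constructor(createRuntime) {
    this.createRuntime = createRuntime; // async factory returning a fresh runtime
    this.runtime = null;
    this.restarts = 0;
  }

  async call(task) {
    if (!this.runtime) this.runtime = await this.createRuntime();
    try {
      return await task(this.runtime);
    } catch (error) {
      // Assumption: any failure is treated as "out of memory"; a real version should inspect the error.
      if (typeof this.runtime.dispose === 'function') this.runtime.dispose();
      this.runtime = await this.createRuntime();
      this.restarts++;
      return await task(this.runtime); // one retry only; a second failure propagates to the caller
    }
  }
}

// Demo with a fake runtime that fails on its third render, standing in for an out-of-memory abort.
async function demo() {
  let created = 0;
  const createFakeRuntime = async () => {
    created++;
    let renders = 0;
    return {
      render: (page) => {
        if (++renders > 2) throw new Error('out of memory');
        return `page ${page} rendered by runtime ${created}`;
      },
    };
  };
  const pdfium = new RestartingRuntime(createFakeRuntime);
  for (let page = 1; page <= 5; page++) console.log(await pdfium.call((rt) => rt.render(page)));
  console.log('runtimes created:', created);
}
demo();
```

A fresh runtime has lost all loaded documents, so the task passed to call() would have to reload the PDF before rendering. Keeping one wrapper per file in a Map keyed by document would limit that reload cost to the file that hit the failure.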

CetinSert avatar Jun 14 '21 23:06 CetinSert

I posted to https://groups.google.com/g/pdfium instead: https://groups.google.com/g/pdfium/c/9nwjxUGUaQs.


Bitmaps larger than 495 × 495 × 4 leak with Bitmap_Destroy(Bitmap_Create*())

This is being tracked here with a single-line test case: https://github.com/paulo-coutinho/pdfium-lib/issues/33.

We need upstream input, awareness/acknowledgement of the issue and hope it is addressed / fixed upstream.

Calling Bitmap_Destroy(Bitmap_Create*(495, 495, 4)) ✔️ is ok; Bitmap_Destroy(Bitmap_Create*(496, 496, 4)) ❌ starts leaking.

CetinSert avatar Jun 15 '21 00:06 CetinSert

Wow. I was writing a message with the link to this issue just now. Ten more seconds and we would have posted twice, hahaha.

Thanks.

paulocoutinhox avatar Jun 15 '21 00:06 paulocoutinhox

I just checked out 4542 hoping I can build it in a simple manner but then https://chromium.googlesource.com/chromium/src/+/main/docs/linux/build_instructions.md#System-requirements hit me! I have everything except disk space (I have that too but in WSL-2 via Windows, which will be slow for builds but at least I can finally confirm what slowness everyone is talking about) and Tuesday morning patience/time for their build tools. I will revisit this later this week.

Have you ever made local C/C++ builds of PDFium?

Wow. I was writing a message with the link to this issue just now. Ten more seconds and we would have posted twice, hahaha.

I really hope our posts to upstream pdfium get noticed and acted on.

Thanks.

I do thank you too for making PDFium accessible to everyone on the web.

CetinSert avatar Jun 15 '21 00:06 CetinSert

https://groups.google.com/g/pdfium/c/9nwjxUGUaQs

If you think you found a bug in PDFium, then please file a bug report on https://crbug.com/pdfium. If possible, include a sample C/C++ program to demonstrate the issue.


And now we have: https://bugs.chromium.org/p/pdfium/issues/detail?id=1692

CetinSert avatar Jun 15 '21 00:06 CetinSert

(sorry I was trying to close the tab, not the issue o__O)

CetinSert avatar Jun 15 '21 00:06 CetinSert

Hi man.

I had a recent surgery and I'm in pain, so I can't always respond, but I saw what you did and I thank you for the excellent effort for the community.

I will wait for them to check the issue you created.

paulocoutinhox avatar Jun 15 '21 00:06 paulocoutinhox

@paulo-coutinho – I wish you a swift and complete recovery! Please do rest well.

I will do my best to follow up with upstream and keep you in the know.

CetinSert avatar Jun 15 '21 00:06 CetinSert

Tested in C/C++. No leaks.

steps

  1. get https://github.com/bblanchon/pdfium-binaries/releases/latest/download/pdfium-linux.tgz
  2. extract into directory TEST
  3. cd TEST
  4. export PDFium_DIR=$(pwd)
  5. git clone https://github.com/bblanchon/pdfium-binaries.git
  6. cd pdfium-binaries/example
  7. cmake .
  8. replace example.c with the code given below
  9. clear; rm example; make; ./example
  10. check memory usage with a tool like htop at each pause
  11. no leaks ✔️

example.c

#include <fpdfview.h>
#include <stdio.h>

int main()
{
  printf("⌨ init ...\n");               /*getchar();*/                              FPDF_InitLibrary();
  printf("⌨ test  495 ×  495 × 4 ...\n"); getchar(); for (int i = 0; i < 1000; i++) FPDFBitmap_Destroy(FPDFBitmap_CreateEx( 495,  495, FPDFBitmap_BGRA, 0, 0));
  printf("⌨ test 4096 × 4096 × 4 ...\n"); getchar(); for (int i = 0; i < 1000; i++) FPDFBitmap_Destroy(FPDFBitmap_CreateEx(4096, 4096, FPDFBitmap_BGRA, 0, 0));
  printf("⌨ test 4096 × 4096 × 4 ...\n"); getchar(); for (int i = 0; i < 1000; i++) FPDFBitmap_Destroy(FPDFBitmap_CreateEx(4096, 4096, FPDFBitmap_BGRA, 0, 0));
  printf("⌨ test 4096 × 4096 × 4 ...\n"); getchar(); for (int i = 0; i < 1000; i++) FPDFBitmap_Destroy(FPDFBitmap_CreateEx(4096, 4096, FPDFBitmap_BGRA, 0, 0));
  printf("⌨ test 4096 × 4096 × 4 ...\n"); getchar(); for (int i = 0; i < 1000; i++) FPDFBitmap_Destroy(FPDFBitmap_CreateEx(4096, 4096, FPDFBitmap_BGRA, 0, 0));
  printf("⌨ test 4096 × 4096 × 4 ...\n"); getchar(); for (int i = 0; i < 1000; i++) FPDFBitmap_Destroy(FPDFBitmap_CreateEx(4096, 4096, FPDFBitmap_BGRA, 0, 0));
  printf("⌨ test 4096 × 4096 × 4 ...\n"); getchar(); for (int i = 0; i < 1000; i++) FPDFBitmap_Destroy(FPDFBitmap_CreateEx(4096, 4096, FPDFBitmap_BGRA, 0, 0));
  printf("⌨ destroy ...\n");              getchar();                                FPDF_DestroyLibrary();
  printf("⌨ exit ...\n");                 getchar();
  return 0;
}

This might be an issue only affecting WASM builds. As we have confirmed all is good in upstream and this is likely to be due to WASM build configuration / emscripten, we can now investigate this further at a very slow pace (days to weeks) prioritizing your recovery from the surgery above all.

CetinSert avatar Jun 15 '21 06:06 CetinSert

One interesting find I have just made is as follows:

496 × 596 × 4 ✔️

_PDFium_Init(); var w = 496, h = 596; for (let i = 0; i < 10000; i++) FPDF.Bitmap_Destroy(FPDF.Bitmap_CreateEx(w, h, 4)); [ wasmMemory, wasmMemory.buffer.byteLength ] // ✔️

image

496 × 496 × 4

_PDFium_Init(); var w = 496, h = 496; for (let i = 0; i < 10000; i++) FPDF.Bitmap_Destroy(FPDF.Bitmap_CreateEx(w, h, 4)); [ wasmMemory, wasmMemory.buffer.byteLength ] // ❌

image


Perhaps a memory alignment bug of some sort in the WASM build or emscripten itself?
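
For reference, the buffer sizes behind these results; this is arithmetic only. The comparison against 64 KiB WebAssembly pages is an added framing and an assumption about what might matter, not something established in this thread.

```javascript
const WASM_PAGE_BYTES = 64 * 1024; // WebAssembly linear memory grows in 64 KiB pages

// Pixel buffer size of a width x height bitmap, 4 bytes per pixel by default.
function bitmapBytes(width, height, bytesPerPixel = 4) {
  return width * height * bytesPerPixel;
}

// The three sizes reported so far, with the verdict observed in the browser.
const cases = [
  [495, 495, 'ok'],
  [496, 496, 'leaks'],
  [496, 596, 'ok'],
];

for (const [w, h, verdict] of cases) {
  const bytes = bitmapBytes(w, h);
  const pages = (bytes / WASM_PAGE_BYTES).toFixed(3);
  console.log(`${w} x ${h} x 4 = ${bytes} bytes = ${pages} pages (${verdict})`);
}
```

495 × 495 × 4 is 980,100 bytes, just under 15 pages (983,040 bytes), and 496 × 496 × 4 is 984,064 bytes, just over. However, 496 × 596 × 4 is 1,182,464 bytes, larger than both, and does not leak, so buffer size alone does not predict the behavior.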

CetinSert avatar Jun 15 '21 08:06 CetinSert

Discovery 🕵🏻‍♂️

_PDFium_Init(); var w = 496, h = 496; for (let i = 0; i < 100; i++) FPDF.Bitmap_Destroy(FPDF.Bitmap_CreateEx(w, h, 4)); [ wasmMemory, wasmMemory.buffer.byteLength ] // ❌

and the below leak the same amount

/* define NOP fpdf_bitmap_destroy() */                              fpdf_bitmap_destroy = b => {};
_PDFium_Init(); var w = 496, h = 496; for (let i = 0; i < 100; i++) fpdf_bitmap_destroy(FPDF.Bitmap_CreateEx(w, h, 4)); [ wasmMemory, wasmMemory.buffer.byteLength ] // ❌

so it is not a partial leak as I first thought but a complete failure. The whole memory block of the bitmap leaks.
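
If the whole bitmap leaks on every call, the number of calls until the heap is exhausted follows from a division. A back-of-the-envelope sketch, assuming the 16M starting heap and the 2GB ceiling mentioned earlier in this thread and ignoring every other allocation. The name callsUntilLimit is hypothetical, and 2480 × 3508 (A4 at 300 dpi) is an assumed size for a scanned page.

```javascript
// Hypothetical estimate: how many create calls fit before the heap limit is hit
// when each call leaks a full width x height bitmap.
function callsUntilLimit(width, height, options = {}) {
  const {
    bytesPerPixel = 4,
    startBytes = 16 * 2 ** 20, // assumed initial heap size (16 MiB)
    limitBytes = 2 * 2 ** 30,  // assumed ceiling (2 GiB)
  } = options;
  const leakedPerCall = width * height * bytesPerPixel;
  return Math.floor((limitBytes - startBytes) / leakedPerCall);
}

console.log('496 x 496:  ', callsUntilLimit(496, 496), 'calls');
console.log('2480 x 3508:', callsUntilLimit(2480, 3508), 'calls'); // A4 at 300 dpi, assumed
```

For full-page scans the estimate lands in the tens of pages, which is at least consistent with the failure reported above after fewer than 100 pages of a scanned PDF file.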

References

FPDFBitmap_Destroy() (chromium/4542/fpdfsdk/fpdf_view.cpp#884) is defined as

FPDF_EXPORT void FPDF_CALLCONV FPDFBitmap_Destroy(FPDF_BITMAP bitmap) {
  RetainPtr<CFX_DIBitmap> destroyer;
  destroyer.Unleak(CFXDIBitmapFromFPDFBitmap(bitmap));
}

  • RetainPtr<T>.Unleak(): chromium/4542/core/fxcrt/retain_ptr.h#62
  • CFXDIBitmapFromFPDFBitmap(): chromium/4542/fpdfsdk/cpdfsdk_helpers.h#78

Thoughts

This might be related to

  1. calling conventions
  2. memory alignment
  3. type cast failure in CFXDIBitmapFromFPDFBitmap()

FPDF.Bitmap_Destroy() might be the only place of failure by itself because it fails to work at some sizes whereas FPDF.Bitmap_CreateEx() does return something, at all tested sizes, that behaves as expected when probed with

FPDF.Bitmap_GetWidth  = Module.cwrap('FPDFBitmap_GetWidth',  'number', ['number']);
FPDF.Bitmap_GetHeight = Module.cwrap('FPDFBitmap_GetHeight', 'number', ['number']);

CetinSert avatar Jun 15 '21 15:06 CetinSert

🕵🏻‍♂️ Discovery

This is due to memory alignment!


_PDFium_Init();

How I got the data below! 👈🏻

image

  1. evaluate _FPDFBitmap_CreateEx in the developer console
  2. click on its definition in the console (bottom most line)
  3. click on the {} button to pretty print the code (top left)

image

  4. add breakpoint on line 777 (⚠ 778 after https://github.com/paulo-coutinho/pdfium-lib/issues/33#issuecomment-861996151)

🎰

  5. evaluate the below (step by pressing F8 once at a time)

495 × 495 × 4 ✔️

var w = 495, h = 495; for (let i = 0; i < 5; i++) _FPDFBitmap_Destroy(FPDF.Bitmap_CreateEx(w, h, 4)); [ wasmMemory, wasmMemory.buffer.byteLength ] // ✔️

In the good case, the memalign memset free calls happen only once!

  CreateEx
    memalign memset free
    memalign memset free
    memalign memset
  Destroy

  CreateEx
  Destroy

  CreateEx
  Destroy

  CreateEx
  Destroy

  CreateEx
  Destroy

and never again for either 495 × 495 × 4 or any other good dimensions we switch to, such as 100 × 100 × 4. (At this point, running the loop again with the same or other good numbers makes 0 new calls to memalign.)


496 × 496 × 4

In this case, they keep happening every time we call FPDF.Bitmap_CreateEx() and lead to problems.

var w = 496, h = 496; for (let i = 0; i < 5; i++) _FPDFBitmap_Destroy(FPDF.Bitmap_CreateEx(w, h, 4)); [ wasmMemory, wasmMemory.buffer.byteLength ] // ❌
  CreateEx
    memalign memset free
    memalign memset free
    memalign memset
  Destroy

  CreateEx
    memalign memset free
    memalign memset free
    memalign memset
  Destroy

  CreateEx
    memalign memset free
    memalign memset free
    memalign memset
  Destroy

  CreateEx
    memalign memset free
    memalign memset free
    memalign memset
  Destroy

  CreateEx
    memalign memset free
    memalign memset free
    memalign memset
  Destroy

There are many more bad numbers like 496 × 496 × 4. Lots of real-life PDFs have pages with dimensions that trigger this leak.


Now that we know more about why/how this happens, hopefully it will be easy to fix!

CetinSert avatar Jun 15 '21 16:06 CetinSert

I have updated all to latest commit and updated emscripten to latest version too.

You can test here: https://pdfviewer.github.io/

paulocoutinhox avatar Jun 16 '21 02:06 paulocoutinhox

I have updated all to latest commit and updated emscripten to latest version too.

Thank you! We have now ruled out another possibility: this is not caused by the specific emscripten version you were using before.

It still leaks:

image


As we can now reproduce it in a single line, we should be in a position to try different compilation options rapidly. I bet there will be a flag or something similar that makes this just work. (I have not yet got to compile this with emscripten myself but will get to it within this week.)

In the meantime, I have posted this to emscripten to see if they can help us get past this faster.

CetinSert avatar Jun 16 '21 04:06 CetinSert

We might need to check these eventually:

  • go down from -O3 to -O1 or nothing
  • https://emscripten.org/docs/porting/Debugging.html#debugging-safe-heap
  • https://emscripten.org/docs/debugging/Sanitizers.html

FPDFBitmap_CreateEx call tree

  • FPDFBitmap_CreateEx – https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/fpdfsdk/fpdf_view.cpp#799
    • pdfium::MakeRetain – https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/core/fxcrt/retain_ptr.h#174
    • CFX_DIBitmap::Create – https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/core/fxge/dib/cfx_dibitmap.cpp#27
      • FX_TryAlloc – https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/core/fxcrt/fx_memory.h#48
        • Calloc – https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/core/fxcrt/fx_memory.cpp#109
          • PartitionAllocGenericFlags – https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/third_party/base/allocator/partition_allocator/partition_alloc.h#394
          • PartitionAllocatorGeneric – https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/third_party/base/allocator/partition_allocator/partition_alloc.h#517

😨 This might even be incomplete, yet that is some scary depth right there ... 🤞🏻 I hope we get this issue sorted out just by finding a better emscripten configuration to compile with.

CetinSert avatar Jun 16 '21 07:06 CetinSert

@paulo-coutinho updated https://github.com/emscripten-core/emscripten/issues/14459 with all the details I can think of.

CetinSert avatar Jun 16 '21 09:06 CetinSert

It is the same even if we tell FPDF to use an external buffer as in FPDF.Bitmap_CreateEx(width, height, format, buffer, ⋯).

This turned out to be false. Using external memory is a valid workaround for Bitmap_CreateEx leaks.

_PDFium_Init(); clear();
var XM = 1, A = 496, Z = 497, n = 100, CF = FPDF.Bitmap_BGRA, CS = 4, DS = new Set(), DF = new Set();
for (let w = A; w < Z; w++)
for (let h = A; h < Z; h++) {
  const a = wasmMemory.buffer.byteLength; const H = XM ? _malloc(w * h * CS) : 0;
  for (let i = 0; i < n; i++) _FPDFBitmap_Destroy(_FPDFBitmap_CreateEx(w, h, CF, H, w * CS));
  const z = wasmMemory.buffer.byteLength; if (XM) _free(H);
  const s = a == z ? DS : DF; s.add({ w, h, '*': w * h, '+': w + h, d: z - a });
}; [ (Z - A) ** 2, DS.size, DF.size ]
  • XM = 1 ✔️ (no leaks) (_FPDFBitmap_Destroy is not even necessary in this case.)
  • XM = 0 ❌ (leaks)
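
The XM = 1 path can be wrapped so that the buffer is always released, even when rendering throws. A minimal sketch under the assumption that the caller passes in the five pieces it needs. The helper withExternalBitmap and the shape of the api object are hypothetical; in the console they would map to _malloc, _free, _FPDFBitmap_CreateEx, _FPDFBitmap_Destroy and FPDF.Bitmap_BGRA as used above.

```javascript
// Hypothetical helper around the workaround above: the caller owns the pixel buffer,
// so PDFium does not allocate the bitmap memory itself.
function withExternalBitmap(api, width, height, use) {
  const BYTES_PER_PIXEL = 4;               // BGRA
  const stride = width * BYTES_PER_PIXEL;  // bytes per row, no padding
  const buffer = api.malloc(stride * height);
  if (!buffer) throw new Error('malloc failed');
  let bitmap = 0;
  try {
    bitmap = api.createEx(width, height, api.format, buffer, stride);
    if (!bitmap) throw new Error('FPDFBitmap_CreateEx failed');
    return use(bitmap, buffer, stride);
  } finally {
    if (bitmap) api.destroy(bitmap);       // releases the bitmap object, not our buffer
    api.free(buffer);                      // the external buffer is ours to free
  }
}

// Fake API so the sketch runs outside the browser; it only counts outstanding buffers.
const fakeApi = {
  format: 4,
  outstanding: 0,
  malloc(size) { this.outstanding++; return 1024; },
  free(pointer) { this.outstanding--; },
  createEx(w, h, format, buffer, stride) { return 2048; },
  destroy(bitmap) {},
};

const result = withExternalBitmap(fakeApi, 496, 496, (bitmap, buffer, stride) => stride);
console.log('stride:', result, 'outstanding buffers:', fakeApi.outstanding);
```

The sketch still calls destroy even though the comment above notes it was not necessary in this test; pairing every create with a destroy keeps the code correct if that observation turns out not to hold in other builds.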

But now, RenderPageBitmap also leaks with many files!

  1. Watch wasmMemory.buffer.byteLength / 1024 / 1024
  2. Open https://pdf.ist/r/6365.pdf
  3. Render page no 1 repeatedly ...

(same issue, again due to memalign, this time no obvious workaround)


Single-link Reproduction

  1. Visit https://pdfviewer.github.io/?title=RenderPageBitmap%20Leaks&url=https://pdf.ist/r/6365.pdf
  2. Wait till tab finishes processing and shows the first page (can take long ...)
  3. Notice that this time RenderPageBitmap leaked and failed to render all pages
  • Chrome: image
  • Firefox: image

Needs more testing to confirm but perhaps only pages with embedded images leak? So, a remnant of issue https://github.com/paulo-coutinho/pdfium-lib/issues/26.


  • https://pdf.ist/r/1344.pdf – better file, only the 2nd page with a tiny embedded image leaks in RenderPageBitmap!

CetinSert avatar Jun 16 '21 23:06 CetinSert

RenderPageBitmap leaks

Additional insight was gained after following https://github.com/emscripten-core/emscripten/issues/14459#issuecomment-863392152 in https://github.com/emscripten-core/emscripten/issues/14459#issuecomment-863567813.

Instrumentation

(() => {
  const ins = (f, n) => (...a) => { const v = f(...a); console.warn(n, f, v, '<-', a); return v; };
  const { malloc, memalign, memset, free, realloc } = Module.asm;
  Module.asm = { ...Module.asm,
    malloc:   ins(malloc,   'malloc  '),
    memalign: ins(memalign, 'memalign'),
    memset:   ins(memset,   'memset  '),
    free:     ins(free,     'free    '),
    realloc:  ins(realloc,  'realloc ')
  };
})();

https://pdf.ist/r/1344.pdf with PDFium 4543.

Instrumentation showing 💥 memalign calls in 1 × RenderPageBitmap call

  • 📄 1 ✔️ (images: none) – image – only the malloc and free called by me
  • 📄 2 (images: some) – image – RenderPageBitmap triggers memalign

Repeatedly rendering the above pages results in exactly the same number of console entries from the instrumentation each time: 4 for page 1 and countless for page 2.

Memory Growth after 45 × RenderPageBitmap calls

📄 1 ✔️ 📄 2
image image
image image

As the bottom right screenshot demonstrates, the RenderPageBitmap leak does not depend on the page render scale, which means the library would run out of the default 2GB wasmMemory rendering about 100 tiny thumbnails of pages that each contained tiny images.


After reading https://github.com/emscripten-core/emscripten/issues/14459#issuecomment-863392152, it does indeed seem like we are dealing with a fragmentation issue and

If it is fragmentation in fact, then the codebase might benefit from reusing buffers, using an arena, etc.

this is something we have done for Bitmap_CreateEx in https://github.com/paulo-coutinho/pdfium-lib/issues/33#issuecomment-862803506 by feeding it our own buffer but RenderPageBitmap has no such option.

Also if we consider the huge number of functions pdfium.wasm exports and how hard we hit issues with just the 2 we try to use, there will be other cases like RenderPageBitmap that run users out of memory very quickly although they call these functions as the PDFium documentation tells them to.

CetinSert avatar Jun 17 '21 21:06 CetinSert

References

memalign

  • https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4543/third_party/base/memory/aligned_memory.cc#30
  • https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4543/third_party/libopenjpeg20/0034-opj_malloc.patch#26
  • https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4543/third_party/libopenjpeg20/0034-opj_malloc.patch#34
  • https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4543/third_party/libopenjpeg20/opj_malloc.cc#21
  • https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4543/third_party/libopenjpeg20/opj_malloc.cc#29

CetinSert avatar Jun 18 '21 01:06 CetinSert

RenderPageBitmap leaks

Instrumentation (improved from https://github.com/paulo-coutinho/pdfium-lib/issues/33#issuecomment-863587472)

(() => {

  M = new Map();
  M.sum = () => [...M.values()].reduce((a, c) => a + c, 0);
  M.cnt = { malloc: 0, free: 0, sum: 0 };
  M.fn  = { malloc: 0, memalign: 1 };
  M.add = (n, a) => { M.   set(a, n); M.cnt.malloc++; M.cnt.sum += n; };
  M.sub =     a  => { M.delete(a   ); M.cnt.free++;                   };

  const ins = (f, n) => (...a) => {
    const v =  f(...a), d = a[M.fn[n]] ?? 0;
    if (d   >      0) M.add(d, v);
    if (n === 'free') M.sub(a[0]);
    console.warn(n.padEnd(8), f, `${v}`.padStart(20), '<-', (a.length == 1 ? [...a, '-'] : a).map(a => `${a}`.padStart(20)).join(' '), '|', `${wasmMemory.buffer.byteLength}`.padStart(20), '|', `${M.sum()}`.padStart(20), 'M', M.size); return v;
  };

  const { malloc, memalign, memset, free, realloc } = Module.asm;
  Module.asm = { ...Module.asm,
    malloc:   ins(malloc,   'malloc'),
    memalign: ins(memalign, 'memalign'/*.length == 8*/),
  //memset:   ins(memset,   'memset'),
    free:     ins(free,     'free'),
    realloc:  ins(realloc,  'realloc')
  };

})();

📄 1 ✔️ L 0 📄 2 L 5
image image
image
function        ⋯    return    arguments                                        wasmMemory         tracked memory   L
=============== ⋯ =========    =========================================   ===============   ====================   =
malloc   ƒ 4497 ⋯  16584728 <-                   16                    - |        44040192 |                   16 M 1
free     ƒ 4498 ⋯ undefined <-             16584728                    - |        44040192 |                    0 M 0

where the last L column is for leaks = un-freed allocations and wasmMemory = wasmMemory.buffer.byteLength.
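
The bookkeeping behind the L column can be isolated from the console logging. A small sketch of that idea, with hypothetical names throughout (createAllocationTracker and its methods). Unlike the instrumentation above, which appears not to record sizes for realloc, this version models a successful realloc as a move; realloc(p, 0) is not modeled.

```javascript
// Hypothetical tracker: remembers every live allocation so un-freed ones can be counted.
function createAllocationTracker() {
  const live = new Map(); // address -> requested size in bytes
  return {
    onAlloc(address, size) {
      if (address !== 0) live.set(address, size); // 0 means the allocation failed
    },
    onFree(address) {
      live.delete(address);                       // freeing an unknown address is ignored
    },
    onRealloc(oldAddress, newAddress, size) {
      if (newAddress === 0) return;               // failed realloc leaves the old block untouched
      live.delete(oldAddress);
      live.set(newAddress, size);
    },
    liveCount: () => live.size,
    liveBytes: () => [...live.values()].reduce((sum, size) => sum + size, 0),
  };
}

// Replaying the two log rows shown above: one malloc of 16 bytes, then its free.
const tracker = createAllocationTracker();
tracker.onAlloc(16584728, 16);
console.log('after malloc:', tracker.liveCount(), 'live,', tracker.liveBytes(), 'bytes');
tracker.onFree(16584728);
console.log('after free:  ', tracker.liveCount(), 'live,', tracker.liveBytes(), 'bytes');
```

Hooked into wrappers like the ones above, onAlloc would receive the return value and the size argument of malloc or memalign, and onFree the argument of free; liveCount() after a render then corresponds to L.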

CetinSert avatar Jun 18 '21 02:06 CetinSert

https://manned.org/em++/664a9826

The manual page of em++ says this about -O3:

       -O3    As -O2, plus dangerous optimizations that may break the generated
              code! This adds

              -s FORCE_ALIGNED_MEMORY=1 -s DOUBLE_MODE=0 -s PRECISE_I64_MATH=0
              --closure 1 --llvm-lto 1

              This is not recommended at all. A better idea is to try each of
              these separately on top of -O2 to see what works. See the wiki and
              src/settings.js (for the -s options) for more information.

whereas this project does build with -O3 as seen here: https://github.com/paulo-coutinho/pdfium-lib/blob/c95bc4a72502251bd3ee770d57907812945c6a2d/modules/wasm.py#L677 and we are indeed having memory alignment issues.

CetinSert avatar Jun 18 '21 06:06 CetinSert

@paulo-coutinho – please note https://github.com/emscripten-core/emscripten/issues/14459#issuecomment-864373169

CetinSert avatar Jun 19 '21 09:06 CetinSert