Fix memory leaks and random crashes

Open BalassaMarton opened this issue 1 year ago • 2 comments

Creating a draft for discussion.

Background

I started investigating #136 because it is a massive blocker in our Linux environments. My test project (not included, as it connects to corporate infra) runs some large queries as well as a few single-entry fetches. After some iterations it consistently fails, but with varying errors (Can't contact LDAP server. Result: -1, or Result: -2, or assertion failures from the wrapped C libraries). I also measured memory usage with dotMemory and found that it grows indefinitely when a single LdapConnection is reused.

Memory leaks

I've discovered a few possible causes for the memory leaks, mostly around incorrectly released buffers. My process was to look at every allocation made through the native APIs and check the docs to see whether everything is released accordingly. Note that in some cases allocated buffers are freed differently on Windows and Linux, see https://linux.die.net/man/3/ber_scanf and https://learn.microsoft.com/en-us/windows/win32/api/winber/nf-winber-ber_scanf. To cover these differences, I added a new method BerScanfFree to LdapNative. The OSX version is just a copy of the Linux version; hopefully it will work (never tested).

After making these changes, memory usage is much better, with only minimal increase after each iteration. Later I might do some more digging and maybe submit a new PR if I can further improve it.

Threading

None of the buffer fixes solved the randomly occurring errors. After some digging I found that libldap in versions 2.4 and older is not thread-safe, and that there is a thread-safe build of the library called libldap_r. Changing my symlinks to point to that .so made the random errors disappear completely. Starting with version 2.5, the threaded library is the default, so this problem only affects clients using OpenLDAP 2.4 and older.
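
To check which variant a given Linux host has installed, something like the following rough sketch could help. It is written in Python rather than C# purely for brevity and relies only on the standard-library `ctypes.util.find_library`; the helper name `find_openldap_client_libs` is made up for this example. Note the assumption: it reports what the system loader can locate, which is not necessarily the file that ldap4net ends up loading.

```python
import ctypes.util


def find_openldap_client_libs():
    """Return the library names the system loader resolves for libldap and libldap_r.

    A value of None means that variant could not be located on this host.
    """
    return {
        # Plain client library (reported as not thread-safe up to 2.4).
        "ldap": ctypes.util.find_library("ldap"),
        # Separate threaded build, relevant for 2.4 and older.
        "ldap_r": ctypes.util.find_library("ldap_r"),
    }


if __name__ == "__main__":
    for name, resolved in find_openldap_client_libs().items():
        print(f"lib{name}: {resolved or 'not found'}")
```

If `libldap_r` is reported as not found on a 2.4-based system, the threaded build is probably not installed, and a symlink change like the one described above would have nothing to point at.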

BalassaMarton, Jun 08 '23 10:06

Hi, what is the current state of this PR? @BalassaMarton Are you using these fixes successfully in production? @flamencist Do you have time to review this and maybe even fix the build error? #103 seems related.

I recently observed this read access violation in a test environment (Windows 10 client connecting via SSL to an OpenLDAP server running on Linux):

Unhandled exception thrown: read access violation.
this->m_pUMThunkMarshInfo was nullptr.

vossjannik, Jun 18 '24 19:06

Hi, what is the current state of this PR? @BalassaMarton Are you using these fixes successfully in production?

I don't work on that project anymore, but iirc using the multi-threaded LDAP binaries on Linux solved our problems back then. I don't think your current issue on Windows is related.

BalassaMarton, Jun 19 '24 08:06