ldap4net
ldap4net copied to clipboard
Fix memory leaks and random crashes
Creating a draft for discussion.
Background
I started investigating #136 because it is a massive blocker in our Linux environments. My test project (not included as it connects to corporate infra) does some large queries as well as a few single-entry fetches. After some iterations it consistently fails with different errors (Can't contact LDAP server. Result: -1; or -2; or assertion failures from the C wrapped libraries). I also measured memory usage with dotMemory, and found that it increases indefinitely when using a single LdapConnection
.
Memory leaks
I've discovered a few possible causes for the memory leaks, mostly around incorrectly released buffers. My process was to look at every allocation made through the native APIs and check the docs to see if everything is not released accordingly. Note that in some cases allocated buffers are freed differently on Windows and Linux, see https://linux.die.net/man/3/ber_scanf and https://learn.microsoft.com/en-us/windows/win32/api/winber/nf-winber-ber_scanf. To cover these differences, I added a new method BerScanfFree
to LdapNative
. The OSX version is just a copy of the Linux version, hopefully it will work (never tested).
After making these changes, memory usage is much better, with only minimal increase after each iteration. Later I might do some more digging and maybe submit a new PR if I can further improve it.
Threading
None of the fixes around buffers solved the randomly occurring errors, and after some digging I've found that libldap
before 2.4
versions is not multi-threaded and there's a threaded version of the library called libldap_r
. Changing my symlinks to point to that .so
made the random errors disappear completely. Starting with version 2.5, the threaded library is the default, so this problem only affects clients using 2.4 and older versions of OpenLDAP.
Hi, what is the current state of this PR? @BalassaMarton Are you using these fixes successfully in production? @flamencist Do you have time to review this and maybe even fix the build error? #103 seems related.
I recently observed this read access violation in a test environment:
(Windows 10 client connecting via SSL to an OpenLDAP server running on Linux)
Unhandled exception thrown: read access violation.
this->m_pUMThunkMarshInfo was nullptr.
Hi, what is the current state of this PR? @BalassaMarton Are you using these fixes successfully in production?
I don't work on that project anymore, but iirc using the multi-threaded LDAP binaries on Linux had solved our problems back then. I don't think your current issue on Windows is related.