libxcrypt icon indicating copy to clipboard operation
libxcrypt copied to clipboard

[RFC] [DON'T MERGE YET] Make crypt and crypt_gensalt use thread-local output buffers.

Open zackw opened this issue 7 years ago • 16 comments

This change makes crypt and crypt_gensalt as thread-safe as they can be without changing their interfaces. Solaris already made this change, and it’s being discussed by glibc (with suggestion that it should be pushed upstream to the C and POSIX standards committees): https://sourceware.org/ml/libc-alpha/2018-10/msg00437.html

Portable programs should still use the r-variants, though, because this is not a guaranteed feature, it doesn’t make them not clobber their output buffers on a second call, and the tradeoff is a sizeable memory leak (CRYPT_GENSALT_OUTPUT_SIZE or sizeof(crypt_data)) for each thread that calls either function. (Thanks ever so much, C committee, for adding thread-local variables without destructors for them.) (I’m still investigating whether there’s some GNU extension we could use to avoid the memory leak at least on Linux.) (We could use pthread_getspecific, but then we’d have to link libcrypt with libpthread, and this doesn’t seem like enough reason to do that.)

I'd like comments on this, but please don't merge it yet; the memory leak is a scary cost and I would like to be sure there's really no way to avoid it before merging this as-is.

(On the up side, this change shrinks libcrypt.so's .bss segment from 33160 to 192 bytes, which is a nice savings for programs that don't use these functions.)

zackw avatar Nov 12 '18 16:11 zackw

Oh good, running valgrind on our test suite does detect the memory leak that this introduces. :-)

zackw avatar Nov 12 '18 16:11 zackw

Mhh… Is there any chance to abuse __cxa_thread_atexit_impl to free() the pointers?

besser82 avatar Nov 12 '18 22:11 besser82

Codecov Report

Merging #62 into develop will decrease coverage by 0.05%. The diff coverage is 85.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop     #62      +/-   ##
==========================================
- Coverage    96.46%   96.4%   -0.06%     
==========================================
  Files           32      32              
  Lines         3109    3119      +10     
==========================================
+ Hits          2999    3007       +8     
- Misses         110     112       +2
Impacted Files Coverage Δ
crypt-gensalt-static.c 100% <100%> (ø) :arrow_up:
crypt-static.c 84.61% <77.77%> (-15.39%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2e41352...b080758. Read the comment docs.

codecov-io avatar Nov 12 '18 23:11 codecov-io

@zackw I've just pushed a possible abuse of __cxa_thread_atexit_impl to free the pointers at thread termination. The only drawback: Only exists in glibc > 2.18, no other systems supported.

besser82 avatar Nov 12 '18 23:11 besser82

I am not sure that I understand the point of this change.

Current situation:

  • single threaded programs already work fine
  • multithreaded programs probably already use the reentrant API

After this patch:

  • single threaded programs will leak memory unless they switch to the reentrant API
  • no obvious need for the changed API

So this change would force long-lived daemons to use a different API while providing no obvious real world benefits.

rfc1036 avatar Nov 13 '18 00:11 rfc1036

@besser82 I was looking into just that this afternoon and I believe I know how to do it in a more universally supported way - the only wrinkle being that one file will need to be compiled as C++, which is enough of a wrinkle that I'm not gonna get to it this week, probably. (If I do it right, libcrypt.so will not require libstdc++.so at runtime.)

@rfc1036 The primary motivation for this change is to match what Solaris already does, and what has been proposed to be done in glibc with all the other APIs that have static internal buffers. See the thread I linked to at the beginning. A secondary motivation is to reduce the size of libcrypt.so's BSS. That said, I'm certainly open to the possibility that it's not worth bothering with.

zackw avatar Nov 13 '18 00:11 zackw

… that one file will need to be compiled as C++ …

@zackw Wouldn't that imply we need to link libstdc++.so? I mean, compared to that, it would be less overkill to simply link libpthread.so internally as it basically is already present on most systems due to systemd or cryptsetup-luks

besser82 avatar Nov 13 '18 01:11 besser82

Wouldn't that imply we need to link libstdc++.so

No. The short version is that the particular C++ feature needed for this (a static thread_local object, declared at function scope with constructor and destructor) winds up calling __cxa_thread_atexit_impl via a shim that GCC helpfully provides in a static-only library called libsupc++. As long as we don't use any other C++ features (e.g. continuing to do the large allocation with malloc instead of new), we don't need the full C++ runtime. And the advantage of doing it this way is we don't have to worry about whether __cxa_thread_atexit_impl is actually available; the shim takes care of that for us.

it would be less overkill to simply link libpthread.so internally

I feel like we may wind up going there eventually (for instance, I think I will need pthread_once and mutexes to do crypt.conf properly), and maybe that's the better route, but it does impose a cost. Not a portability cost -- threads are a mandatory feature in the current rev of POSIX, and any hypothetical configuration where libc is available but libpthread isn't is going to run into other problems pretty quick -- but a CPU time cost, because single-threaded programs that have libpthread loaded get stuck taking slower paths in several contexts within glibc proper (e.g. suddenly malloc thinks it needs to be doing actual locking).

zackw avatar Nov 13 '18 01:11 zackw

Resolved merge conflicts.

besser82 avatar Nov 18 '18 11:11 besser82

Rebased on recent develop, resolved merge conflicts.

besser82 avatar Jan 25 '19 10:01 besser82

I think this is a questionable feature, primarily because portable programs must not rely on it anyway, and then there's risk of someone porting system-specific code e.g. from Fedora to another system and not realizing they need to revise the uses of crypt*(). There's already such risk with ports from Solaris, but I think that's a poor excuse for making the matter worse. I wouldn't merge this unless and until glibc actually makes similar changes across the board.

solardiz avatar Jan 25 '19 11:01 solardiz

I like the NEWS file's wording "This feature is a safety net against sloppy coding. Programs are still strongly encouraged to use the reentrant functions instead". I think the man page's wording should be similar. As currently proposed, I think the man page doesn't sufficiently discourage reliance on the thread-safety. I think we should discourage such reliance even in non-portable programs.

solardiz avatar Jan 25 '19 11:01 solardiz

Wouldn't that imply we need to link libstdc++.so

No. The short version is that the particular C++ feature needed for this (a static thread_local object, declared at function scope with constructor and destructor) winds up calling __cxa_thread_atexit_impl via a shim that GCC helpfully provides in a static-only library called libsupc++. As long as we don't use any other C++ features (e.g. continuing to do the large allocation with malloc instead of new), we don't need the full C++ runtime. And the advantage of doing it this way is we don't have to worry about whether __cxa_thread_atexit_impl is actually available; the shim takes care of that for us.

Unfortunately libsupc++ isn't portable. Systems with Clang/LLVM only are not guaranteed to ship an equivalent shim-a-like library, tho some of them may ship an externally provided libcxxrt.

Given this fact and the very likely need for pthreads when implementing crypt.conf, I'm biased towards to start using pthreads here as well.

besser82 avatar Jan 27 '19 12:01 besser82

Anyways, looking at the code from (Open)Solaris their implementation of thread-safety in the crypt function isn't more then a mutex around the internal buffer of DES setkey and encrypt. They do not use TLS or any other advanced things.

besser82 avatar Jan 27 '19 13:01 besser82

Codacy Here is an overview of what got changed by this pull request:


Issues
======
+ Solved 7
- Added 4
           

See the complete overview on Codacy

besser82 avatar Feb 18 '20 13:02 besser82