
Unable to dynamically link librseq with dlopen()

Open willowec opened this issue 9 months ago • 6 comments

Hello,

I have been attempting to dynamically link librseq using libdl's dlopen() and dlsym() functions. While attempting to do so, I have encountered a fairly confusing error. The minimum example and output are shown below:

/* A minimum example of dlsym failing to link librseq */
#include <dlfcn.h>
#include <stdio.h>

/* function pointers */
static int (*rseq_available_ptr)(unsigned int query);

/* local wrappers for function pointers */
static int librseq_rseq_available(unsigned int query) { return (*rseq_available_ptr)(query); }

int link_librseq()
{
	void* lib = dlopen("librseq.so", RTLD_NOW);
	if (!lib) {
		printf("Failed to load librseq: %s\n", dlerror());
		return -1;
	}

	rseq_available_ptr = dlsym(lib, "rseq_available");
	if (!rseq_available_ptr) {
		printf("Failed to access symbol 'rseq_available': %s\n", dlerror());
		return -1;
	}

	return 0;
} 

int main()
{
	if (link_librseq() < 0)
		return 1;

	/* ensure rseq is here */
	if (librseq_rseq_available(0)) {
		printf("Rseq available!\n");
	} else {
		printf("The rseq syscall is unavailable\n");
	}

	return 0;
}

and the output when I execute it:

$ ./dlsym_err 
Failed to load librseq: /usr/local/lib/librseq.so: cannot allocate memory in static TLS block

From a significant amount of reading through various forums, and particularly this writeup by Chao-tic, my understanding is that this may be related to the per-thread variable __thread struct rseq_abi __rseq_abi. Do you have any tips on how I might get librseq to load with dlopen()? I'm not quite sure that this is the right place to ask, sorry.

One thing that is interesting is that using LD_PRELOAD makes this issue disappear. Of course, requiring that the user run their programs with LD_PRELOAD is kind of frustrating, so hopefully there is a way to work around that.

$ LD_PRELOAD=/usr/local/lib/librseq.so ./dlsym_err
Rseq available!

Feb 05 '25 00:02 willowec

Upon further investigation, I have discovered that changing the tls_model attribute on line 74 of rseq.c from "initial-exec" to either "global-dynamic" or "local-dynamic" allows librseq to be loaded with dlopen! If it is acceptable to do so, I will investigate this more to make sure I understand the tls_model attribute and that this fix does not break anything before opening a PR.
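For readers unfamiliar with the attribute, the change described above amounts to something like the following sketch. The variable and function names are hypothetical and are not librseq's actual declarations. With "initial-exec", the variable lives at an offset from the thread pointer that is fixed when the object is loaded, so it needs room in the static TLS block, which is limited for libraries loaded later with dlopen(). With "global-dynamic", a shared library built with -fPIC typically resolves the address through a runtime lookup instead.

```c
/*
 * Hypothetical per-thread counters illustrating the tls_model attribute.
 * These are NOT librseq's declarations; only the attribute syntax matters.
 */

/* Fixed offset from the thread pointer; requires static TLS space. */
static __thread int counter_ie __attribute__((tls_model("initial-exec")));

/* Address resolved at run time; does not need static TLS space. */
static __thread int counter_gd __attribute__((tls_model("global-dynamic")));

/* Increment the calling thread's copy of each counter and return it. */
int bump_initial_exec(void)
{
	return ++counter_ie;
}

int bump_global_dynamic(void)
{
	return ++counter_gd;
}
```

Both functions behave identically from the caller's point of view. The difference only shows up in how the address is computed and in whether the library can be loaded once the static TLS block is full. When this code is linked into an executable rather than a shared library, the toolchain is free to relax both variables to a fixed-offset model.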

Edit: Is there a resource anywhere on how to run tests for librseq? I'm having trouble figuring it out by just trawling through the tests directory.

Feb 10 '25 18:02 willowec

No, the global-dynamic TLS model won't work, because we require an offset from the thread pointer to be constant across threads, which AFAIK is not guaranteed by global-dynamic.

Feb 10 '25 18:02 compudj

You are right that the issue is caused by the initial-exec TLS though. Our plan is currently to drop support for rseq registration fallback and just rely on glibc. If users are then interested in reintroducing the librseq registration fallback for cases where glibc does not provide rseq support, those can either sponsor or contribute the feature.
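As a rough illustration of what "rely on glibc" can look like from an application, the sketch below checks at run time whether the C library exports the rseq ABI symbols `__rseq_size` and `__rseq_offset` (documented for glibc 2.35 and later) and whether it reports an active registration. The function name `libc_rseq_registered` is hypothetical, and this is only one possible way to probe for the symbols, not necessarily how librseq does it.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the C library exports the rseq ABI
 * symbols and reports a non-zero rseq area size, 0 otherwise.
 */
int libc_rseq_registered(void)
{
	/*
	 * A NULL filename yields a handle for the main program; lookups
	 * through it also search the libraries it was linked against,
	 * including the C library.
	 */
	void *self = dlopen(NULL, RTLD_LAZY);
	const unsigned int *size_p;
	const ptrdiff_t *offset_p;
	int registered;

	if (!self)
		return 0;

	size_p = dlsym(self, "__rseq_size");
	offset_p = dlsym(self, "__rseq_offset");

	/* glibc documents __rseq_size as 0 when registration failed or was disabled. */
	registered = size_p && offset_p && *size_p != 0;

	dlclose(self);
	return registered;
}
```

A caller could use `if (libc_rseq_registered())` to decide whether any fallback registration is needed at all. On older C libraries the symbols are simply absent and the function returns 0.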

Given that the main use-case for rseq EfficiOS has is LTTng-UST, we cannot rely on explicit thread registration, and therefore we have no incentive to continue supporting the fallback registration code.

Feb 10 '25 18:02 compudj

What about using "local-dynamic"? I haven't been able to find an answer in the GNU docs or ELF Handling For Thread-Local Storage. I would love to run some tests and try to figure this out, but I am having trouble working out how to execute the tests for this library!

I'm interested in librseq being compatible with dlopen because that is the best practice for linking external libraries to PAPI. If the fallback was dropped and we start using librseq, but then it is re-added and breaks things again, that would be unfortunate.

Feb 10 '25 21:02 willowec

Executing tests:

./configure
make
make check

We have a few ideas on how to reintroduce the fallback mechanism if need be, and we'd make sure it does not prevent dlopening librseq, because this is a use-case that our main project (lttng-ust) has.

The basic idea would be to make the fallback mechanism (which would need to include an initial-exec TLS variable) a separate library from librseq.so that users wishing to use the fallback would have to preload.

Feb 10 '25 21:02 compudj

AFAIU, both global and local dynamic TLS end up allocating memory dynamically on first use from each thread (lazy allocation) through execution of a TLS getter function. So the offset of the TLS entry within the DTV is constant for all threads, but the placement of the allocated memory holding the TLS data for each thread is not constant with respect to the thread pointer.
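To make "constant offset from the thread pointer" concrete, here is a minimal sketch that measures the distance between a TLS variable and the thread pointer in two different threads and compares the results. All names are hypothetical. The x86-64 branch reads the thread pointer from `%fs:0`; other architectures fall back to the compiler builtin `__builtin_thread_pointer()`, which is assumed to be available there. Note that this variable sits in the executable's static TLS, where the offset is always fixed; the case discussed above (a dlopen'ed library using a dynamic TLS model) is exactly where this property is no longer guaranteed.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical probe variable placed in static TLS. */
static __thread int tls_probe __attribute__((tls_model("initial-exec")));

/* Read the calling thread's thread pointer. */
static void *thread_pointer(void)
{
#if defined(__x86_64__)
	void *tp;

	/* On x86-64 Linux, %fs:0 holds the thread pointer itself. */
	__asm__ volatile ("movq %%fs:0, %0" : "=r" (tp));
	return tp;
#else
	/* Assumption: the compiler provides this builtin on the target. */
	return __builtin_thread_pointer();
#endif
}

/* Store the offset of tls_probe from the thread pointer into *out. */
static void *measure_offset(void *out)
{
	*(ptrdiff_t *)out = (ptrdiff_t)((uintptr_t)&tls_probe -
					(uintptr_t)thread_pointer());
	return NULL;
}

/* Returns 1 if two threads see the same offset, 0 if not, -1 on error. */
int tls_offset_is_constant(void)
{
	pthread_t thread;
	ptrdiff_t off_main, off_thread;

	measure_offset(&off_main);
	if (pthread_create(&thread, NULL, measure_offset, &off_thread) != 0)
		return -1;
	pthread_join(thread, NULL);

	return off_main == off_thread;
}
```

The absolute addresses of `tls_probe` differ between the two threads, but each address minus that thread's own thread pointer gives the same value. That per-thread-invariant offset is the property the rseq code depends on.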

Feb 10 '25 21:02 compudj