gnucobol
gnucobol copied to clipboard
Add a COB_LOAD_GLOBAL config option to modify `dlopen` behavior
This PR adds a config option called COB_LOAD_GLOBAL which allows the flags of dlopen used in call.c to be changed between RTLD_GLOBAL and RTLD_LOCAL (see https://linux.die.net/man/3/dlopen)
Motivation
This flag is motivated by an ambition to be able to write a multi-threaded c program, that then loads libcob with dlmopen in one namespace per thread, effectively allowing us to call the same libcob / Cobol module from multiple threads in a "thread-safe" way. The image below shows the a very simplified intended call sequence
Because there is a "bug" in dlopen that stops us from calling dlopen("module", RTLD_GLOBAL), this PR introduces an option to be able to change this flag to RTLD_LOCAL at runtime.
As at glibc 2.24, specifying the RTLD_GLOBAL flag when calling dlmopen() generates an error. Furthermore, specifying RTLD_GLOBAL when calling dlopen() results in a program crash (SIGSEGV) if the call is made from any object loaded in a namespace other than the initial namespace.
Docs
# Environment name: COB_LOAD_GLOBAL
# Parameter name: load_global
# Purpose: tell the system loader to provide symbols in CALLed
# programs globally, allowing symbols to be found for linked
# libraries later
# Type: boolean
# Note: COBOL CALLs will always find symbols in already CALLed or
# pre-modules; this setting is mostly an advise to the system,
# not all systems are capable of loading libraries global/local
# Default: false
# Example: load_global true
Implementation
- Change
lt_dlopenfrom a macro definition to an actual function, which internally calls dlopen with the flags based on the configuration - Add cob_load_global / COB_LOAD_GLOBAL as a configuration
- Add documentation to runtime.cfg about COB_LOAD_GLOBAL
Apart from the other PR's review to that branch, the change here makes the current scenario even more confusing. I'd like to fix this in this PR (or up-front upstream).
After the changes:
- instead of calling
lt_dlopenwe call a new static functioncob_dlopen - this has the current code of this PR in for the case of
dlopenusage - for the WIN32 variant it uses what is currently used there
- for the libtool variant it uses
lt_dlopenadvise()with an advise matching the runtime setting
@florianschmidt1994 can you please inspect the libtool variant (to test this you can simply undefine USE_LIBDL in config.h)?
I think I'll work on the cob_dlopen stuff late-evening.
Yup, I'll check it out. Basically something like this, right?
static void* cob_dlopen(const char* filename) {
#ifdef _WIN32
if (x == NULL) {
return GetModuleHandle (NULL);
}
return LoadLibrary(x);
#elif defined(USE_LIBDL)
int flags = cobsetptr->cob_load_global
? RTLD_LAZY | RTLD_GLOBAL
: RTLD_LAZY | RTLD_LOCAL;
return dlopen(filename, flags);
# else
// TODO: initialise lt_dladvise
lt_dlopenadvise(filename, /*advise here*/);
}
@GitMensch I've tried out your suggestion in the latest commit and temporarily defined lt_dlopen as cob_dlopen to try it out. I've also added some indentation to the macros, does that conform with your style or should I remove them again?
When developing, I've noticed some test failures, where some of them might be related to our change. On the latest commit of this repositories "main branch" (gcos4gnucobol-3.x), there are some test failures when the build is configured with --without-dl. These failures persist also with our changes applied.
| Commit Hash | Loader | COB_LOAD_GLOBAL | Test Failures |
|---|---|---|---|
| 731b81a327fe7108ca28d024a5ca13c9457dbcdc | --without-dl | not applicable | 47 49 50 784 failed |
| 731b81a327fe7108ca28d024a5ca13c9457dbcdc | not applicable | no test failure |
In the first version of this PR (bd80e8ce601b1663aa69793f086284c66fd2b463), there is an additional test failure (781) when using COB_LOAD_GLOBAL=false, i.e. the default, and preloading where symbols are overwritten in an unexpected manner.
| Commit Hash | Loader | COB_LOAD_GLOBAL | Test Failures |
|---|---|---|---|
| bd80e8ce601b1663aa69793f086284c66fd2b463 | false | 781 failed | |
| bd80e8ce601b1663aa69793f086284c66fd2b463 | true | no test failure | |
| bd80e8ce601b1663aa69793f086284c66fd2b463 | --without-dl | false | 47 49 50 781 failed |
| bd80e8ce601b1663aa69793f086284c66fd2b463 | --without-dl | true | 47 49 50 failed |
In history up to the latest commit (62de6254d0418b45e142b14f27245b481c1594b1), there exists a fix (see 1f4d60760e3fb908f9818d68ac0044fae7ee520a), the same that is already applied in WIN32 environments, to fix this.
| Commit Hash | Loader | COB_LOAD_GLOBAL | Test Failures |
|---|---|---|---|
| 62de6254d0418b45e142b14f27245b481c1594b1 | false | no test failure | |
| 62de6254d0418b45e142b14f27245b481c1594b1 | true | no test failure | |
| 62de6254d0418b45e142b14f27245b481c1594b1 | --without-dl | false | 47 49 50 failed |
| 62de6254d0418b45e142b14f27245b481c1594b1 | --without-dl | true | 47 49 50 failed |
:warning: Please install the to ensure uploads and comments are reliably processed by Codecov.
Codecov Report
Attention: Patch coverage is 70.58824% with 5 lines in your changes missing coverage. Please review.
Project coverage is 67.58%. Comparing base (
16e84a5) to head (3234374). Report is 44 commits behind head on gcos4gnucobol-3.x.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| libcob/call.c | 70.58% | 2 Missing and 3 partials :warning: |
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## gcos4gnucobol-3.x #209 +/- ##
=====================================================
- Coverage 67.85% 67.58% -0.28%
=====================================================
Files 33 33
Lines 60458 60827 +369
Branches 15821 15896 +75
=====================================================
+ Hits 41026 41109 +83
- Misses 13567 13847 +280
- Partials 5865 5871 +6
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
🚀 New features to boost your workflow:
- ❄ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Friendly ping @florianschmidt1994
@florianschmidt1994 thanks for the updates.
Please use the CI generated tarball, to check if your original setup that needed that change (a java starter running the same process more than once) still works. Deep-link to the CI generated source distribution: https://github.com/OCamlPro/gnucobol/actions/runs/16163870452/artifacts/3493326773
Also: can you please create a minimal runner in C that we can add to the testsuite that mimics your java starter? Currently we only verified that the new setting works with both options, but we don't have a test case that only passes with one of those...
The "coverage" titled test fails for C89 compat reasons:
+caller.c:5:22: warning: extra tokens at end of #include directive
+ 5 | #include <pthread.h> // For POSIX threads
+ | ^
+caller.c:6:22: warning: extra tokens at end of #include directive
+ 6 | #include <dlfcn.h> // For dynamic library loading
+ | ^
+caller.c:13:1: error: C++ style comments are not allowed in ISO C90
+ 13 | // Function to call COBOL shared library functions
+ | ^
+caller.c:13:1: note: (this will be reported only once per input file)
+caller.c: In function 'callCobol':
+caller.c:34:30: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
+ 34 | RESOLVE_SYMBOL(cob_init, cob_init_func);
+ | ^~~~~~~~~~~~~
+caller.c:25:5: note: in definition of macro 'RESOLVE_SYMBOL'
+ 25 | type symbol = (type) dlsym(libCobHandle, #symbol); \
+ | ^~~~
+caller.c:35:33: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
+ 35 | RESOLVE_SYMBOL(cob_resolve, cob_resolve_func);
+ | ^~~~~~~~~~~~~~~~
+caller.c:25:5: note: in definition of macro 'RESOLVE_SYMBOL'
+ 25 | type symbol = (type) dlsym(libCobHandle, #symbol); \
+ | ^~~~
+caller.c:36:30: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
+ 36 | RESOLVE_SYMBOL(cob_tidy, cob_tidy_func);
+ | ^~~~~~~~~~~~~
+caller.c:25:5: note: in definition of macro 'RESOLVE_SYMBOL'
+ 25 | type symbol = (type) dlsym(libCobHandle, #symbol); \
+ | ^~~~
+caller.c:38:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
+ 38 | hello_world_func hello_world = cob_resolve("prog");
+ | ^~~~~~~~~~~~~~~~
+caller.c: In function 'main':
+caller.c:60:5: error: 'for' loop initial declarations are only allowed in C99 or C11 mode
+ 60 | for (long i = 0; i < numThreads; i++) {
+ | ^~~
+caller.c:60:5: note: use option '-std=c99', '-std=gnu99', '-std=c11' or '-std=gnu11' to compile your code
+caller.c:67:14: error: conflicting types for 'i'; have 'int'
+ 67 | for (int i = 0; i < numThreads; i++) {
+ | ^
+caller.c:60:15: note: previous definition of 'i' with type 'long int'
+ 60 | for (long i = 0; i < numThreads; i++) {
+ | ^
+caller.c:67:5: error: 'for' loop initial declarations are only allowed in C99 or C11 mode
+ 67 | for (int i = 0; i < numThreads; i++) {
+ | ^~~
+cc1: some warnings being treated as errors
To reproduce that locally compile it with gcc -std=c89 -Werror=declaration-after-statement - in general you want variables to start at the begin of a block and use */ block comments */ .
and the MSYS1 entries error because of missing pthread support:
+caller.c:5:10: fatal error: pthread.h: No such file or directory
+ 5 | #include <pthread.h> // For POSIX threads
+ | ^~~~~~~~~~~
It is fine to skip this test on MSYS1. To do so add
# skip test for MSYS1 - we don't have pthread there
# FIXME: make that test more portable
AT_SKIP_IF([test "$OSTYPE" = "msys"])
at the start of the test.
Thanks for the hints @GitMensch. I've also disabled the tests on macOS, as there is no dlmopen there.
Test runs on Windows (e.g. https://github.com/OCamlPro/gnucobol/actions/runs/17068765301/job/48392123249?pr=209) say caller.c not found when trying to execute.
AT_CHECK([$COB_CC $COB_CFLAGS caller.c -ldl $COB_LIBS -o caller], [0], [], [])
Do you know if we might need to call the compiler arguments differently to make it more portable across platforms?
Also there's an issue with help2man in the Windows MSYS2 prepare workflow (https://github.com/OCamlPro/gnucobol/actions/runs/17069516851/job/48394605655?pr=209) that I don't quite understand, yet. Any ideas?
Thanks for the hints @GitMensch. I've also disabled the tests on macOS, as there is no
dlmopenthere.Test runs on Windows (e.g. https://github.com/OCamlPro/gnucobol/actions/runs/17068765301/job/48392123249?pr=209) say
caller.c not foundwhen trying to execute.AT_CHECK([$COB_CC $COB_CFLAGS caller.c -ldl $COB_LIBS -o caller], [0], [], [])
Use AT_CHECK([$COBC -x -ldl -lpthread -caller.c], [0], [], []) there (drop pthread if you think that's really not necessary).
The Windows tests likely haven't setup those vars correctly for some MSYS2 translation reason.