PDCursesMod
PDCursesMod copied to clipboard
How to evade library mismatches?
(This is a note for future work, basically just to make sure the problem doesn't get completely forgotten about.)
We've had occasional issues where (for example) the PDCurses library is compiled with 32-bit chtypes and 8-bit characters, and then the actual program is compiled with 64-bit chtypes in wide-char mode. Everything links, and then you get bogus results such as shown in the black-and-white screenshot for issue #129. You can also get situations where everything works, except that the keyboard isn't recognized. Or certain characters come out corrupted (mojibake).
Not something for the currently pending 4.2.0 release, but I'm inclined to try to do some sort of function name mangling, such that in the above example, the PDCurses library would be compiled with, say, initscr() expanded to initscr_c32() and the program to initscr_c64_w(). A UTF8 build would expand that function to initscr_c64_u(). Then, when you tried to mislink, you'd get errors.
I chose initscr() in the above because it'll always be called and will therefore either be found in the library or not found.
Adding suffixes to the name of the libraries, so you'd have (for example) pdcurses_u32.lib and .dll for a build with 32-bit chtypes and UTF8, could also do the job. I'm a little fuzzy on exactly how we'd avoid library matches, but one or both of these solutions might do it.
No idea if he gets a notification when referenced here, but it would definitely be useful to check with @wmcbrine about an approach he may also use. What do you think about creating this issue at his tracker, too (limiting the issue to utf8+wide, of course)?
Well, I'll be gobsmacked... this might be easier than I thought. I just added the following lines to curses.h above the definition for initscr(), and it appears to do the trick nicely. Not committing it yet because I assume this change will break something, even if I've not figured out what yet.
#ifdef PDC_WIDE
#ifdef PDC_FORCE_UTF8
#ifdef CHTYPE_32
#define initscr initscr_u32
#else
#define initscr initscr_u64
#endif
#else
#ifdef CHTYPE_32
#define initscr initscr_w32
#else
#define initscr initscr_w64
#endif
#endif
#else /* 8-bit chtypes */
#ifdef CHTYPE_32
#define initscr initscr_x32
#else
#define initscr initscr_x64
#endif
#endif
The only thing this will break is that an existing library won't work on recompile and that a new compiled module won't work with an old library.
If you ask me: commit after adding an ifdef guard #ifndef initscr (and a "major change" note in HISTORY.md + the later release note) - this provides the option to use the old name with -Dinitscr=initscr if anyone sees this as necessary.
No idea if he gets a notification when referenced here
He does.
@wmcbrine Thank you for the note. While I don't want to pester you...
Do you have any opinion about the define ensuring that the libraries match the utf8/wide definitions in the calling program?
What do you think of including those defines into [pd]curses.h by actually generating it via make from curses.h.in?
I personally would like to see both (the first to secure the use of correct environment, the second to ease this for the programmer [while also verifying that the header variant matches the library via the first]).
Hmmm... well, we can get a workaround for the second case (old code will look for an un-mangled initscr() in a new library and can't find it) by having the new library include, after the usual initscr() code in pdcurses/initscr.c, something along the lines of the following (untested) code. The idea is that the library will have both the new initscr_u32() or similar, plus initscr(). New code will link to the former; old code to the latter.
static WINDOW *init_stub( void)
{
return( initscr( ));
}
#undef initscr
WINDOW *initscr( void)
{
return( init_stub( ));
}
It doesn't help with the first case (newly-compiled code will look for the mangled name in an old library that won't have it).
Can't say I'm totally sold on this idea and I'm in no rush to implement it. But it may be less stupid than I originally thought.
Might be good to add the major version number, so that it'd expand to (say) initscr_u32_4()` or such.
Might be good to add the major version number, so that it'd expand to (say) initscr_u32_4()` or such.
In this case I think it isn't a good idea - if William adds the same (which I hope) he won't have the 32/64 part in, so conflicts between this repo and his will still be visible without a _3/_4 suffix. And if ever a major version 5 comes out (I personally hope that William stays with 3 and you with 4) then the code from an older version should still work with it without relinking (as long as the chtype, utf8 and wide options are the same).
The idea is that the library will have both the new
initscr_u32()or similar, plusinitscr(). New code will link to the former; old code to the latter.
Seems useful for fall-back and also removes the need for the additional ifdef guard.
if William adds the same (which I hope) he won't have the 32/64 part in, so conflicts between this repo and his will still be visible
Well, we can avoid that by not suffixing a '32'. William's version lacks 64-bit chtypes; we can say that the absence of the number means a 32-bit chtype build. Though there is just enough in the way of incompatibility between our forks that being unable to cross-link may be a feature rather than a bug.
Though there is just enough in the way of incompatibility between our forks that being unable to cross-link may be a feature rather than a bug.
Totally, my point was that it is good that the "32"/"64"/none would be enough (and good) to ensure that cross-link and cross-use are not possible.
@Bill-Gray can you please commit your "easy solution" with the adjusted name of initscr via define? I still suggest to additional create curses.h during make soon to ease the building for the programmer, but that may be tracked with a follow-up issue.
I held off on committing that until I'd tried it a bit. I've done so, and nothing obvious has broken, so I've committed and pushed it.
I have not (yet) tried the init_stub() scheme. The more I think about that one, though, the more I start to think that getting a mislinkage there is telling you: you're linking old code to a new library; maybe you should just go ahead and recompile.
I agree, let's close this one. If you publish a beta2 I'll get the people recheck the reported color issue. ... in case you've missed it in the color issue: vt (win32 within msys) seems to be not working as expected (amount of displayed lines and overwrite).
... so with people using the installed 4.2beta an issue came up: Many programs are not directly linked against pdcurses but check which curses library and header is available and use whatever they find.
The way GnuCOBOL's configure.ac handles this is similar to what other projects do:
AC_CHECK_LIB([ncursesw], [initscr], [], [], [])
AC_CHECK_LIB([ncurses], [initscr], [], [], [])
AC_CHECK_LIB([pdcurses], [initscr], [], [], [])
AC_CHECK_LIB([curses], [initscr], [], [], [])
In this case only the library itself is inspected which now doesn't work any more:
configure:18217: checking for initscr in -lpdcurses
configure:18242: gcc -o conftest.exe -O2 -DCHTYPE_32 -DPDC_DLL_BUILD -I/mingw/include conftest.c -lpdcurses >&5
C:\Users\test\AppData\Local\Temp\ccISTw4m.o:conftest.c:(.text.startup+0xc): undefined reference to `initscr'
collect2.exe: error: ld returned 1 exit status
configure:18242: $? = 1
configure:18251: result: no
Of course people can change AC_CHECK_LIB for PDCurses to a "full compilation" and include the header - but we really should consider if we want to force people to do this (the current workaround is to manually define the "correctly named" function).
I'm having some second thoughts about this "improvement". The idea is not totally stupid, but there are some issues with it.
One possible alternative we might eventually use would be to have the libraries and DLLs named, for example, pdcurses_u64.lib, pdcurses_u64.dll, etc. for Windows.
For the moment, since we're hoping to get a stable release, I'm inclined to revert my "easy solution" and leave this as something to be figured out in a later version. Unless I hear a contrary argument, I'll go ahead and do that.
While it looks more intrusive to adjust the library names it may be the best option. The biggest issue is that everyone builds against libpdcurses, if you change that name then many build systems could not use it out of the box... If I remember correctly all ports but X11 don't have an install option, in this case it would be only the already manual process of installing a new PDCurses version (which would then rename pdcurses_u64.lib to pdcurses.lib so the external linking will work as expected while the running program would still look for pdcurses_u64.dll).
I'm having some second thoughts about this "improvement". The idea is not totally stupid, but there are some issues with it.
Hm, I've just thought about an option that would only "break" on using initscr() from within a source actually including the header, not sure if this is better or worse, but it would
- remove the need to redefine it manually during the configure process
- will still work for older versions
- still will fail linking if the header is included (which is normally the case) and the defines in use aren't matching the library
- con: would not raise an issue any more when swapping a DLL to the wrong version
Sounds good?
Draft:
- internally rename initscr to
initscr_private - create two new exported functions
initscr_externalandinitscr_internalwhich both just call ourinitscr_private - in pdcurses.h: change the new redefines to not redefine
initscrbut redefineinitscr_external - if actually building PDCurses: define
initscrasinitscr_external, otherwise defineinitscrasinitscr_private
This way we end up with both the "new typed" and the "old" name exported, both will function; as soon as the header is included only the "new typed" will be called - if there's a mismatch you'd know this at link time.
Opinions?
Just coming back to this, in the process of checking outstanding issues...
This seems reasonable. However... a year has passed. I originally called my posted solution "not totally stupid". It actually appears to be working well enough. I've had no further issues with library mismatches, nor has anybody asked about such (and they used to occur with some frequency). I think I'll call this a decently solved problem and close it.
Thanks for checking. I really think the most reasonable way would be to create the "matching" pdcurses.h during make, something along:
- have a pdcurses.h target, which all other targets depend on
- have pdcurses.options as the the sole dependency of the pdcurses.h target
- when creating libpdcurses:
- create pdcurses.options.new; just the active defines echo'ed as-is
diff/fcit against pdcurses.stamp, if it is isn't different delete, otherwise- move it to pdcurses.options
$(MAKE) create-header
- have a .PHONY create-header target which creates pdcurses.h from pdcurses.h.in, just
echothe additional defines, orawk,sed, whatever
Hmmm... at present, running make install for the X11 port results in copying curses.h into /usr/local/include/xcurses (among other things). This brings up a variety of possibilities:
-- When copying, it could also insert #define PDC_WIDE 1 and/or #define PDC_FORCE_UTF8 1, etc. as required. (For X11, #define XCURSES would always be inserted.)
-- The other ports, at present, lack the concept of a configure script. If they had one, they could similarly insert required #defines. Failing that, running, for example,
make install UTF8=Y WIDE=Y CHTYPE_32=1 DLL=Y
could insert whichever #defines were appropriate. As you suggest, we'd make the appropriate curses.h file, but we'd make one when running make install.
Your scheme, as I understand it, could lend itself to another nice thing : if you've run make with a particular set of options and then run it again with a different set of options, and the diff between them is non-zero, then remove all objects and force a full rebuild. I've occasionally messed that up; I'll build a non-WIDE version, change a few sources, build WIDE, only recompile those few, and have a mix-and-match problem. (I can't say this has been a major annoyance, but it does happen.) With your scheme, curses.h would be revised and would force everything to be rebuilt.
Your scheme, as I understand it, could lend itself to another nice thing : if you've run make with a particular set of options and then run it again with a different set of options, and the diff between them is non-zero, then remove all objects and force a full rebuild.
Yes, that's the point - and the reason to do this on make of the libraries, not on make install.
If you do want to have that "someday" in PDCurses I suggest to reopen this issue.
... and for XCurses curses.h should be created by the configure script with the relevant defines, if this isn't the case than I should be able to take care of this.
Hm, I think this is still open until we actually adjust the pdcurses.h on install.
This --should also fix the broken AC_CHECK_LIB solution if the people add the header inclusion there (I'd at least do that for GnuCOBOL)-- has nothing to do which AC_CHECK_LIB which still will be broken by that. But for GnuCOBOL I'll likely work around that by searching for endwin instead of initsrc.
@Bill-Gray re-open for automatic header adjustment during install?
A variant of my previous suggestion:
- two new targets are added: configure and pdcurses.h
- pdcurses.h has a dependency to SRCDIR pdcurses.h
- pdcurses.h is a dependency of the pdcurses object files - so when missing will be executed
- it takes the current make arguments and creats pdcurses.h from those and its dependency
- configure will delete an existing pdcurses.h and then re-trigger it
This way:
- the build recipe used by people will stay valid as-is
- if people that manually copied pdcurses.h from root will still need to use the defines, when they use the one created in the port then they don't need to do that any more
I should have mentioned that some time back, I tackled what is probably the "tricky" part of this. I wrote a bit of code to configure/reconfigure curses.h. Basic reasoning is described in comments at the start of that code.
Already, one can build config_curses on Linux, Windows, or DOS, then run (for example) ./config_curses -DPDC_WIDE -DCHTYPE_32 and get curses.h appropriately configured. If it's actually the same configuration as before, curses.h is left unchanged; otherwise, of course, it changes, and your next make will force a complete rebuild.
A similar task could almost certainly have been done with bash and such, but this works on any platform with a C compiler.
That leads to the question of how such a tool would be used. Ideally, it would mean that when you run (say) make -Dwhatever, config_curses is always built first (if necessary) and then run with those switches. Thus, any change in the switches you feed make causes a rebuild, and we won't interfere with existing workflows (no need to add a separate ./config_curses step, since it'll be baked into the Makefile.)
That's beautiful. If this is put into the makefiles then everything should "just work".
We still have autoconf's "simple" AC_CHECK_LIB[initscr] and AC_CHECK_FUNC checks broken. The first could be replaced by checking endwin or by switching the check (additional, for PDCurses only) to AC_TRY_LINK directly, which lets us include a header).
... Or: we add initscr back which does nothing else but calling the internal defined function - then an application explicit checking for that symbol finds it (both via configure scripts or via dynamic loading of the function) [of course then without the link-time check], but any program that includes the header and tries to link the result (which should be the majority) will use the ABI-depending entry point directly - and therefore get a link error if there's a mismatch between curses header and library ABI (or at runtime when using the wrong share object an error from the dynamic link loader).
So I totally vote for both curses_configure and internal initscr function to not break existing configure scripts and delay-loading of the library! Reopen? :-)
I hesitated to reopen, because the current method of modifying initscr does seem to have mostly fixed the initial problem. In a way, we're now drifting more to the issue of having PDC_WIDE, PDC_DLL_BUILD, PDC_FORCE_UTF8, etc. #defined within curses.h. It's related, but different. Still... guess the current discussion is close enough that I'll reopen, rather than leave closed/start a new issue.
(With the warning that I may not be doing a heck of a lot of work on this short-term.)
I hesitated to reopen, because the current method of modifying
initscrdoes seem to have mostly fixed the initial problem.
It has - but created two big new ones (both in the previous version and in the current), which is a possibly reason that the approach did not reach "upstream" yet:
- the only way to make PDCursesMod work "with any configure script out there" is to manually define the
initscrsymbol to the right one viaCPPFLAGS- because they just check "do I have xyzcurses and does it help me to link initscr"; and the people don't even know this (because we don't have explicit docs about that) - the only way to have PDCursesMod working with delay-load is to adjust the programs that previously
dlopen/LoadLibrary'd pdcurses.dll and then got theinitscrfunction looked up now need to dynamic load a different symbol
One may could argue that PDCursesMod is broken in both scenarios since two versions.
The one thing that is related but - I agree - less important is to dynamically generate pdcurses.h, because this is about removing the need to manually define the correct matching options that were used during the build (which was always needed, even if cumbersome and partially worked around by people creating a pkconfig file [for posix / posix-like environments] or otherwise matching "port definition" [like it is done with vcpkg]). It is very useful to "finally fix that for PDCurses*" (I hope @wmcbrine would include a "proven to be working" generation of pdcurses.h), but as noted, is less important as the dynamic generated entry point name (which I hope to be taken over when we finally get it working for all scenarios, too).
... so, if you agree, but don't see any option to work on the entry point issue (actually I don't have the spare time either), then I try to do so. The point is to additional have an initscr external entry point calling the "real" one which is renamed. Effect:
- configure scripts and dynamic loading will still work as it used to be, as long as the defines match the ABI, otherwise will be broken as it was before
- the most people that use a "include pdcurses.h and link" approach will get a link error when the ABI doesn't match [which after the auto-generated pdcurses.h only means they use an old or not matching header]
- people that use a pdcurses shared object file that doesn't match the ABI won't get runtime aborts or strange results but the dynamic link loader will say that the expected symbol cannot be found so won't start (or won't dynamic load libraries that were linked against a different version of pdcurses)
I have no say in the compatibility side of things, but just as a comment on the efficacy of the initscr mangling - it worked on me! I threw PDCursesMod into python's windows-curses library to try and fix some issues with panels (it didn't, oh well...) and was forced to fix the names of the character width defines so the library linked correctly.
Whatever ends up happening to make PDCursesMod work with more build scripts without having to modify them - I think the name mangling is an excellent solution to user mistakes.
Looking back through this, I think I may have been a little stupid about one basic point, from @GitMensch's comment on 2021 Dec 6.
Am I correct in thinking that the autoconf check in question is solely on initscr(), such that if the name mangling done in curses.h was applied to endwin(), that particular issue would go away?
There is no real reason I can see that name mangling has to occur on initscr() instead of endwin(). (It does have to occur on a function that is always going to be used, which pretty much limits the choice of mangleable functions to initscr() and endwin().)
If that would actually do the trick, all I'd need do is go through curses.h and change six #define initscr initscr_... lines to read #define endwin endwin_....
Still leaves us with the desirability of using config_curses.c or similar code, of course.
Am I correct in thinking that the autoconf check in question is solely on
initscr(), such that if the name mangling done incurses.hwas applied toendwin(), that particular issue would go away?
Commonly.
There is no real reason I can see that name mangling has to occur on
initscr()instead ofendwin(). (It does have to occur on a function that is always going to be used, which pretty much limits the choice of mangleable functions toinitscr()andendwin().)If that would actually do the trick, all I'd need do is go through
curses.hand change six#define initscr initscr_...lines to read#define endwin endwin_....
And do a release.
Still leaves us with the desirability of using
config_curses.cor similar code, of course.
Totally! Until that is done the docs may should suggest to manually adjust the installed header. config_curses would be perfectly tested in the CI builds, too, as the artifacts would contain different ones, then.