libgphoto2 icon indicating copy to clipboard operation
libgphoto2 copied to clipboard

Replace all `#define _*_SOURCE` with centralised config

Open RReverser opened this issue 1 year ago • 19 comments

Slightly cleaner alternative to #843, #615, #552 and others that ensures we always have those extension functions at autoconf level instead of having individual (and inconsistent) #defines in source files.

RReverser avatar Oct 30 '22 17:10 RReverser

I have two remarks about this:

AC_USE_SYSTEM_EXTENSIONS compatibility with autoconf versions

This PR requires AC_USE_SYSTEM_EXTENSIONS which e.g. introduced _DARWIN_C_SOURCE in commit 9e33646 (2013-02-06) which is post autoconf-2.69 (2012-04-25). At this time, our configure.ac files only AC_PREREQ([2.69]). So if our release tarballs are built on a system with post-2.69 autoconf (probably autoconf-2.71), they will be buildable on e.g. OSX. And if somebody builds on/for OSX from a git clone, they will have to make sure to use autoconf-2.71 or later by themselves.

Do we actually want consistent #defines for all compilation units?

I do not remember whether we require inconsistent #defines in different compilation units because we might e.g. use one kind of strerror_r function definition in one compilation unit and the other function definition in another compilation unit. Have you checked that?

It might be non-trivial to check that as just enabling all compiler warnings produces so many of them with the current codebase that any changes to the set of warnings produced are difficult to see.

Conclusion

If/When those two remarks can be addressed/resolved, I like this.

ndim avatar Oct 31 '22 00:10 ndim

This PR requires AC_USE_SYSTEM_EXTENSIONS which e.g. introduced _DARWIN_C_SOURCE in commit 9e33646 (2013-02-06) which is post autoconf-2.69 (2012-04-25).

I'm happy to replace with manual defines. I suppose _DEFAULT_SOURCE should be enough there (UPD: just pushed)? Or would we want explicit (deprecated) variants _POSIX_SOURCE / _BSD_SOURCE / something else instead?

because we might e.g. use one kind of strerror_r function definition in one compilation unit and the other function definition in another compilation unit. Have you checked that?

That would be pretty surprising, but sure, worth checking. [...] I just tried couple of methods I could think of - sha256sum of all object files, collecting preprocessed sources, few others, but I'm not getting deterministic enough results for them to be useful to compare :(

RReverser avatar Oct 31 '22 03:10 RReverser

Ahh right I suppose the object files I got are not deterministic due to LINE used in various error loggers (which changed now). I'll play a bit more.

RReverser avatar Oct 31 '22 03:10 RReverser

Ok I spent unreasonable amount of time on this given that it's not really even important, but I think I got there 😅

--- sha256-before.log   2022-10-31 03:25:00.192112000 +0000
+++ sha256-after.log    2022-10-31 03:28:58.502112000 +0000
@@ -179,7 +179,7 @@
 6fc7f33fd2aab861743093ae3a582db7b9de684adb9515b9b6f65406387e6d36  ./camlibs/topfield/.libs/la-tf_bytes.o
 b5d6382c2d29fb112590ef4472d18f703b194454dd718792a812de51f1d1c5e3  ./camlibs/topfield/.libs/la-usb_io.o
 1816f93b7fce6e443c7c86e439c686bd6616cff5eddca6ea7d9461190c92b4f9  ./camlibs/toshiba/pdrm11/.libs/la-library.o
-08764a8af34118a885821ed43efa80a3b2d46f059059bf56b0ced2c7d836822c  ./camlibs/toshiba/pdrm11/.libs/la-pdrm11.o
+5a15f3a5c6ee908cae1f350db7b4b8228e7b60c7bd022b666f38a311184161ac  ./camlibs/toshiba/pdrm11/.libs/la-pdrm11.o
 23e22a439777082acc6c5eaa216e7241a1d58ab21e62458cf7a23bd5bdf87957  ./camlibs/tp6801/.libs/la-library.o
 d727f1039d0b8e0b7462caaa1b7c70740e9594a091939e0b593622a0e7f0daae  ./camlibs/tp6801/.libs/la-tp6801.o
 bc1de4ffd5d150ab1a8c49ea863ad6ec2cc77e115e2aff4528f46a9bc1656d96  ./examples/autodetect.o
@@ -214,7 +214,7 @@
 3ff37d1ad5587dcde7c1bb67e96ef9dffcb887e3cca03ac1d2d6030ee3018a5d  ./libgphoto2/.libs/libgphoto2_la-gphoto2-widget.o
 6e7e8346eb4aa6c76c368a9f8a28e04aa35ba082da27280e82f6ea4fcd086170  ./libgphoto2/.libs/libgphoto2_la-jpeg.o
 5c37d3744667213ede3078b7e0149b5d465c1e07f0c675b40a5cc1cb92678dd5  ./libgphoto2_port/disk/.libs/la-disk.o
-3d57472d6edfbc546ee49197e748352cca424aae59fb1e882e98678b7b445e2b  ./libgphoto2_port/libgphoto2_port/.libs/libgphoto2_port_la-gphoto2-port-info-list.o
+df1e0655abf4936f7f232466e05b6ce8d1fec171cc58fb9c23376cf51f584dc5  ./libgphoto2_port/libgphoto2_port/.libs/libgphoto2_port_la-gphoto2-port-info-list.o
 a80acf8228fcbd36a916a0313aaf6e031c3b2d8119ac05c0d97c88a9d734bc72  ./libgphoto2_port/libgphoto2_port/.libs/libgphoto2_port_la-gphoto2-port-log.o
 2f14039c0af4641622602596c906cdcd8b71ee21f424d41d834d8fe47dc80d2e  ./libgphoto2_port/libgphoto2_port/.libs/libgphoto2_port_la-gphoto2-port-portability.o
 aa9447f5933d719a250f659273aa531dacc9606986137a17b585050738bd03e5  ./libgphoto2_port/libgphoto2_port/.libs/libgphoto2_port_la-gphoto2-port-result.o

I'm not yet sure why it's always the Toshiba's camlibs/toshiba/pdrm11 and gphoto2-port-info-list that have different hashes. Toshiba is especially weird given that it is not even among the changed files, but the gphoto2-port-info-list.c might warrant further look as it actually used _GNU_SOURCE before this change and maybe calls something different.

RReverser avatar Oct 31 '22 03:10 RReverser

but the gphoto2-port-info-list.c might warrant further look as it actually used _GNU_SOURCE before this change and maybe calls something different.

Ah this change is just due to usages of HAVE_GNU_REGEX, but then, it has a fallback to POSIX regular expressions, so I suppose it should be fine? Not sure why it doesn't just always use regcomp...

RReverser avatar Oct 31 '22 03:10 RReverser

As for Toshiba file, looking at assembly diff, it just seems that a lot of addresses have shifted, but no meaningful changes otherwise:

image

RReverser avatar Oct 31 '22 14:10 RReverser

Ah the addresses shift due to usleep:

image

I just checked, if I comment out the 2 usleep calls, both object files before and after are identical.

Since we don't use the result of usleep in that file anyway and don't care about int vs void, this is fine.


So the only meaningful change in functionality from this PR is the GNU vs POSIX regex above. Once that's clearified, this should be good to go.

RReverser avatar Oct 31 '22 14:10 RReverser

i have autoconf 2.71 on my buildsystem at least and would not mind this change, it seems to be a good cleanup

msmeissn avatar Nov 02 '22 15:11 msmeissn

i have autoconf 2.71 on my buildsystem at least and would not mind this change, it seems to be a good cleanup

FWIW I've already switched to _DEFAULT_SOURCE which should be compatible with older autoconf too, plus a bit more consistent across systems.

Can you comment on the regex change above? Are both branches for POSIX and GNU regexes necessary? If not, then after this PR GNU regex branch is no-op and could be removed entirely.

RReverser avatar Nov 02 '22 15:11 RReverser

I do not see how just defining _DEFAULT_SOURCE helps with cross platform compatibility. Could you please help me understand?

I just ran a git grep _DEFAULT_SOURCE on the FreeBSD sources, and the only occurrences are in 3rd party source autotools build systems from contrib/ like unbound, openssh, ncurses, libpcap, ldns, libfido2, dialog, byacc. The FreeBSD libc does not have _DEFAULT_SOURCE. The bugs you mentioned above which this PR should solve were MacOS/OSX libc API compat issues solved by defining _DARWIN_C_SOURCE, not _DEFAULT_SOURCE.

libgphoto2 wants to run on multiple systems, Linux with glibc being a popular one, but certainly not the only one. There is Linux with non-glibc C libraries, there is FreeBSD and other BSDs, there is MacOS/OSX/whatevertheycallit, there is even Windows, and probably some system that does not come to mind right now.

We tried to use mainly C standard conforming code where possible, and in a few places where it makes sense, we use system specific APIs or API extensions for which we may need to define some system specific preprocessor macros.

I see the consistency argument, i.e. how having the same macro definitions in all source files can be useful to have the codebase behave the same in everywhere.

So AC_USE_SYSTEM_EXTENSIONS sounds useful to achieve source code consistency at the expedience of exposing and maybe using a more non-C-standard API without actually intending to, in the expectance that the m4 macro defines an adequate set of defines for every system libgphoto2 is built on.

I do not understand how just defining _DEFAULT_SOURCE helps, especially regarding the MacOS/OSX case.

  • The old way: libgphoto2 sources mostly use C standard APIs, with some exceptions which need to be marked in the source. Build compat issues should be exposed by CI builds.

    Advantage: Source is mostly standard C.

    Disadvantage: Inconsistent APIs in use across libgphoto2 source files, lack of diversity of CI builds fails to expose build compat problems.

    I understand this, and am fine with it.

  • The way originally proposed in this PR: Add AC_USE_SYSTEM_EXTENSIONS, and put #include "config.h" in the front of all compilation units before any system header #include so that the system headers actually see whatever defines might be in config.h.

    Advantage: Consistent API across libgphoto2 sources. This means when we take an API used by one of our source files over to another source file, we do not need change that other source file's feature test macros and can therefore not forget to do it.

    Disadvantage: More libgphoto2 code might use non-C-standard API, potentially making it more difficult to port.

    I understand this, and am fine with it.

  • The PR now: Just globally define _DEFAULT_SOURCE which I have found only in glibc so far. It certainly is not in FreeBSD libc, and while I have not checked the MacOS/OSX libc for _DEFAULT_SOURCE, I do know the MacOS/OSX userland was originally based on FreeBSD. So I cannot see what _DEFAULT_SOURCE does for us here, especially for the MacOS/OSX build issues mentioned in the original PR comment.

    This confuses me.

ndim avatar Nov 02 '22 21:11 ndim

That's... a lot of questions about a small edit :)

I'm happy to revert the _DEFAULT_SOURCE change, the only reason I introduced it is because you said you were concerned about AC_USE_SYSTEM_EXTENSIONS on old autoconf.

Sounds like @msmeissn is okay with bumping autoconf version, and AC_USE_SYSTEM_EXTENSIONS is more flexible as you pointed out, so I can revert that bit.

RReverser avatar Nov 02 '22 23:11 RReverser

I'd still appreciate answers to my questions about regex stuff above though. I've already investigated all the differences as originally requested, and that's the only one that's left under question.

RReverser avatar Nov 02 '22 23:11 RReverser

I'm happy to revert the _DEFAULT_SOURCE change, the only reason I introduced it is because you said you were concerned about AC_USE_SYSTEM_EXTENSIONS on old autoconf.

Sounds like @msmeissn is okay with bumping autoconf version, and AC_USE_SYSTEM_EXTENSIONS is more flexible as you pointed out, so I can revert that bit.

Reverted; looks like I replaced it only in top-level configure.ac and missed the libgphoto2_port one when I did that change anyway.

RReverser avatar Nov 02 '22 23:11 RReverser

I'm happy to revert the _DEFAULT_SOURCE change, the only reason I introduced it is because you said you were concerned about AC_USE_SYSTEM_EXTENSIONS on old autoconf.

Sounds like @msmeissn is okay with bumping autoconf version, and AC_USE_SYSTEM_EXTENSIONS is more flexible as you pointed out, so I can revert that bit.

I just said we needed to discuss it, and Marcus responded: There is no need to bump the autoconf version from 2.69 to 2.71. Marcus wrote that he prepares the release tarball using autoconf 2.71, so the released tarballs contain the AC_USE_SYSTEM_EXTENSIONS code from autoconf 2.71, which should build on OSX OOTB. However, if someone wants to build from git on a system with the older autoconf, that will still work for them. Best of both worlds.

Oh and BTW, we used to have a camlib developer in the 2000s using a fairly old (at the time) system, and we started to be very conservative about build tool requirements to accommodate for their needs. These days though, it might be possible/reasonable to use slightly less ancient build tools these days. Not sure where to draw the border. And this is a topic to discuss a separate GitHub issue. We do not need to bump the requirements for this PR.

ndim avatar Nov 03 '22 19:11 ndim

I'd still appreciate answers to my questions about regex stuff above though. I've already investigated all the differences as originally requested, and that's the only one that's left under question.

Yepp, still on my to do list.

ndim avatar Nov 03 '22 19:11 ndim

These days though, it might be possible/reasonable to use slightly less ancient build tools these days.

That would be wonderful. Especially if it also means we can use various good stuff from C11 and later. But yeah, out of scope here.

RReverser avatar Nov 03 '22 19:11 RReverser

Just a quick update on what I have been doing, both to remind me of where I was when I continue, and to let you know that I am not just sitting on this.

  • Looking back into the non-standard usleep thing, it turns out I have written something using nanosleep in 2006 but did not use it consistently over the whole codebase. Therefore, a few years later, it was eliminated again. Well... let's stick with usleep for the time being.

  • The regex stuff in gphoto2-port-info-list.c is interesting. Just from reading the code, I do not see a reason why we have both a GNU and a non-GNU regex implementation of the same thing. I want to check both the history of that file to find out, and whether some superficial differences (ignore case flag in one but not the other) actually makes a difference.

  • Then there are a few things in the original first commit of this PR which in some compilation units removes some #defines but does not ensure that include "config.h" is actually present (e.g. examples/config.c). However, we do not have an OSX build in our CI builds at this time, so removing the #define _DARWIN_FOO will not appear in the CI build.

ndim avatar Nov 04 '22 02:11 ndim

  • but does not ensure that include "config.h" is actually present

Oh good catch!

RReverser avatar Nov 04 '22 03:11 RReverser

I would argue that the example programs in examples/ document the most proper way to write a program and should therefore not rely on libgphoto2's config.h, as the example programs should be as explicit as possible. I will go through the examples to verify that.

Also, I want to verify this PR has working OSX and mingw cross-compile builds before I merge this PR, and maybe even native Windows builds.

This PR set out to improve the situation on e.g. OSX, and so it should not make it worse.

ndim avatar Nov 04 '22 10:11 ndim