mini_portile
mini_portile copied to clipboard
mkmf_config support for pkg-config files with multiple library directories
As mentioned in https://github.com/mudge/re2/pull/138#discussion_r1521442743, the re2.pc
pkg-config file generated by RE2 contains libraries from two directories: RE2 itself and its sole dependency, Abseil (note I’ve populated PKG_CONFIG_PATH
prior to this so that it prefers the Abseil in ports
over my system install):
$ pkg-config --libs --static re2.pc
-L/Users/mudge/Projects/re2/ports/arm64-apple-darwin22/libre2/2024-03-01/lib -L/Users/mudge/Projects/re2/ports/arm64-apple-darwin22/abseil/20240116.1/lib -pthread -lre2 -labsl_flags_internal -labsl_flags_marshalling -labsl_flags_reflection -labsl_flags_private_handle_accessor -labsl_flags_commandlineflag -labsl_flags_commandlineflag_internal -labsl_flags_config -labsl_flags_program_name -labsl_cord -labsl_cordz_info -labsl_cord_internal -labsl_cordz_functions -labsl_cordz_handle -labsl_crc_cord_state -labsl_crc32c -labsl_crc_internal -labsl_crc_cpu_detect -labsl_raw_hash_set -labsl_hash -labsl_city -labsl_bad_variant_access -labsl_low_level_hash -labsl_hashtablez_sampler -labsl_exponential_biased -labsl_bad_optional_access -labsl_str_format_internal -labsl_synchronization -labsl_graphcycles_internal -labsl_kernel_timeout_internal -labsl_stacktrace -labsl_symbolize -labsl_debugging_internal -labsl_demangle_internal -labsl_malloc_internal -labsl_time -labsl_civil_time -labsl_strings -labsl_strings_internal -labsl_string_view -labsl_base -labsl_spinlock_wait -labsl_int128 -labsl_throw_delegate -labsl_raw_logging_internal -labsl_log_severity -labsl_time_zone
At the moment, this doesn’t feel compatible with MiniPortile2’s new mkmf_config(static:)
functionality as it assumes the .pc
file only specifies libraries within its own directory (see https://github.com/flavorjones/mini_portile/blob/52fb0bc41c89a10f1ac7b5abcf0157e059194374/lib/mini_portile2/mini_portile.rb#L319) and that there is a single, entrypoint pc
file for a library (Abseil ships with over 180 individual ones for each library).
What I actually want here is to turn all of the -l
flags into static, absolute paths by looking for the matching file in each of the library directories specified by the two -L
flags, e.g.
-lre2 -labsl_flags_internal
Becomes:
/Users/mudge/Projects/re2/ports/arm64-apple-darwin22/libre2/2024-03-01/lib/libre2.a /Users/mudge/Projects/re2/ports/arm64-apple-darwin22/abseil/20240116.1/lib/libabsl_flags_internal.a
Would it be possible to extend the API to support this use case (potentially also including setting PKG_CONFIG_PATH
)?
Looking at https://github.com/flavorjones/mini_portile/blob/52fb0bc41c89a10f1ac7b5abcf0157e059194374/lib/mini_portile2/mini_portile.rb#L327-L334, perhaps my particular issue is because Abseil comes with over 180 individual .pc
files so there isn't a single file to reference and pre-populate $MINI_PORTILE_STATIC_LIBS
with before loading RE2's re2.pc
(viz. if I were able to use mkmf_config(static:)
with Abseil first, perhaps using it with RE2 would be able to generate the correct static flags already).
That said, perhaps changing mkmf_config(static:)
to look up libraries in every directory in $LIBPATH
when trying to generate static links would cover what I need here?
Thank you for opening this! Hoping to looking into it in the next week.
FWIW we explicitly restrict our search for static replacements of libraries to the two lib
directories of our recipes in re2 but I wonder if it would be as safe (and much more general) to use all of $LIBPATH
especially if we’re prepending every new -L
flag from the .pc
file to it.
In case it is of any use, I had some time today to try and spike a mix of how your mkmf_config
works and how we're currently handling it in re2 and came up with the following:
ENV["PKG_CONFIG_ALLOW_SYSTEM_CFLAGS"] = "t"
pkg_config_paths = [
"#{abseil_recipe.path}/lib/pkgconfig",
"#{re2_recipe.path}/lib/pkgconfig"
]
# Set up pkg-config to prefer our vendored RE2 and Abseil
pkg_config_paths.prepend(ENV['PKG_CONFIG_PATH']) if ENV['PKG_CONFIG_PATH']
ENV['PKG_CONFIG_PATH'] = pkg_config_paths.join(File::PATH_SEPARATOR)
pc_file = File.join(re2_recipe.path, 'lib', 'pkgconfig', 're2.pc')
# Collect static -L directories to resolve static libraries
library_dirs = xpopen(['pkg-config', '--libs-only-L', '--static', pc_file], err: %i[child out], &:read).strip
static_library_dirs = library_dirs.shellsplit.map { |flag| flag.sub(/\A-L/, "") }
puts "Static library dirs are #{static_library_dirs.inspect}"
# Convert -l flags to static links within the collected static directories where possible
libflags = xpopen(['pkg-config', '--libs-only-l', '--static', pc_file], err: %i[child out], &:read)
.shellsplit
.map do |flag|
next flag unless flag.start_with?("-l")
static_lib = "lib#{flag[2..]}.#{$LIBEXT}"
static_lib_dir = static_library_dirs.find { |dir| File.exist?(File.join(dir, static_lib)) }
next flag unless static_lib_dir
File.join(static_lib_dir, static_lib)
end
# Prepend all flags (including the static replacements) to $libs
$libs = [libflags, $libs].join(" ").strip
# Prepend the original -L flags to LDFLAGS otherwise the final linking step fails with "library 're2' not found"
$LDFLAGS = [library_dirs, $LDFLAGS].join(" ").strip
# Prepend INCFLAGS
incflags = xpopen(['pkg-config', '--cflags-only-I', pc_file], err: %i[child out], &:read).strip
$INCFLAGS = [incflags, $INCFLAGS].join(" ").strip
# Append CFLAGS and CXXFLAGS
cflags = xpopen(['pkg-config', '--cflags-only-other', pc_file], err: %i[child out], &:read).strip
$CFLAGS = [$CFLAGS, cflags].join(" ").strip
$CXXFLAGS = [$CXXFLAGS, cflags].join(" ").strip
Inspecting the resulting re2.bundle
with otool
on macOS gives the following:
$ otool -L lib/re2.bundle
lib/re2.bundle:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1700.255.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)
I'm not sure what the significance of having to set LDFLAGS
as well as libs
is however, perhaps you know better?
Update: it might be worth adding that we’re currently relying on dir_config
to specify the location of RE2 (which seems to populate $LIBPATH
with the lib directory and $CPPFLAGS
with -I
flags) but I’m hoping that’ll no longer be necessary if we can set things up right.
Here's another version of the last few lines that only sets $libs
, $LIBPATH
and $INCFLAGS
and compiles successfully:
# Set $libs with all the flags including our static replacements
$libs = [libflags, $libs].join(" ").strip
# Tell Ruby where to find the libraries
$LIBPATH = static_library_dirs | $LIBPATH
# Add INCFLAGS
incflags = xpopen(['pkg-config', '--cflags-only-I', pc_file], err: %i[child out], &:read).strip
$INCFLAGS = [incflags, $INCFLAGS].join(" ").strip
I’ve now distilled this into https://github.com/mudge/re2/pull/138/commits/ac14e8f9af0632f18fe12c568795cc9a9bcd6a7f
It’s worth noting that preferring $libs
over $LDFLAGS
seems to cause a problem if you have the library in Ruby’s RbConfig::CONFIG['exec_prefix']
(a bug @stanhu found when working on re2) so I’ve since switched to setting $LDFLAGS
directly which seems to resolve the problem. It’d be good to dig into why this happens though.
@mudge thanks for all your work here. i've been heads-down during my OSS time trying to work on a few urgent things and haven't been able to get back to this yet. this topic is at the top of my backlog, though, and I hope to get to it in the next week or two.
You’re welcome, there’s no rush from my side. I had some time this weekend to spend on it hence the over-communication both here and on my (now draft) PR in re2.
The main thing I’d like to understand more is the significance between $LDFLAGS
and $libs
and $LIBPATH
for the library search paths and individual libraries as well as $INCFLAGS
and $CPPFLAGS
for include search paths (eyeballing the difference in them between my main
branch and the new PR hasn’t been especially enlightening) and why it fixes the issue of libraries in exec_prefix
being linked ahead of the static ones.
It has been tricky because so many MakeMakefile
methods indirectly mutate one or more of the globals and that isn’t always easy to spot (e.g. have_library
calling dir_config
).
Quick update: I think I got to the bottom of why I saw a regression in re2 and it was to do with using have_library("re2")
as it adds -lre2
to LIBS
in the resulting Makefile
(even though ldflags
already has all the static libraries) which makes it possible for the gem to pick up any libre2.a
s elsewhere in the search path.
Having churned on this a bit, I ended up releasing re2 2.10.0 with the refactored pkg-config logic.
In summary:
- The previous implementation was mostly taken from Nokogiri’s
extconf.rb
and useddir_config
to populate$CPPFLAGS
with the vendored dependencies’include
paths (and$LIBPATH
with theirlib
paths) before parsing the result ofpkg-config --libs --static path/to/lib/pkgconfig/re2.pc
, prepending$LIBPATH
with any vendored dependencies’lib
paths specified in-L
flags and replacing all-l
flags that could be found in a hard-coded list oflib
paths with absolute (and therefore static) paths before adding them to$LDFLAGS
. - I updated the code to much more closely resemble MiniPortile2’s
mkmf_config
and Ruby’s ownMakeMakefile#pkg_config
and instead preferpkg-config
’s more specific--libs-only-L
,--libs-only-l
,--libs-only-other
,--cflags-only-I
,--cflags-only-other
to extract thelib
paths without adding them to$LIBPATH
, replace-l
flags found in those directories with absolute, static paths, adding them to$libs
(not$LDFLAGS
), add any other linker flags (e.g.-pthread
) to$LDFLAGS
, add include paths to$INCFLAGS
and everything else to$CFLAGS
and$CXXFLAGS
respectively. - The key thing here is that we never call
dir_config
either directly or indirectly throughhave_library
, etc. when linking statically to avoid accidentally adding a dynamic-lre2
flag anywhere in the resultingMakefile
.
This way, I hope that we include the bare minimum necessary and use the “highest” possible level of abstraction (e.g. the more intention-revealing $INCFLAGS
vs the more generic $CPPFLAGS
) for both the compilation ($INCFLAGS
, $CFLAGS
, $CXXFLAGS
) and linking ($LDFLAGS
, $libs
) without adding unnecessary directories to the $LIBPATH
and minimising the risk of linking a library dynamically rather than statically.
I’m not sure our strategy of searching for static libraries in every -L
flag returned by pkg-config
is generalisable for your API but hopefully some of this is useful if only to confirm your existing design decisions.
If nothing else, it’d be quite nice to delete my copy of minimal_pkg_config
and rely on yours if it was made part of MiniPortile2’s public API.