OpenXLSX icon indicating copy to clipboard operation
OpenXLSX copied to clipboard

Improve portability for external pugixml

Open bansan85 opened this issue 5 months ago • 18 comments

I ported openxlsx to vcpkg. I needed to remove local 3rd party (pugixml and nowide).

I needed to patch some part of OpenXLSX to use pugixml without the .cpp source.

If you are interested, you may take a look at https://github.com/microsoft/vcpkg/pull/46633/files (I didn't create pull request because I noticed you don't like it :wink: )

bansan85 avatar Jul 29 '25 10:07 bansan85

Hi @bansan85 - it appears that our works have overlapped - in the development-aral branch, I am preparing a new version that has the kind of portability for use of a pugixml library on the system which I believe you also needed.

Would you have a look at the development branch?

I need to finish my code review before I can merge it into master, this is why it's not been merged yet: The use of forward declarations (similar to pImpl idiom) introduced some pitfalls where I need to move defaulted constructors, destructors and assignment operators from the header files. And since this only triggers a compiler error when a class is instantiated, and the demos do not instantiate all classes of the library - I need to do this thorough code review.

Nevertheless, the internal use of pugixml is now hidden from the library API, and the CMakeLists.txt now has an option OPENXLSX_ENABLE_LIBPUGIXML to also compile OpenXLSX against libpugixml. Also, use of nowide should be hidden in the headers/detail/OpenXLSXFileSystemTools.hpp functions, which is not exposed to the library API.

(I didn't create pull request because I noticed you don't like it 😉 )

Commendable, thank you very much ;)

aral-matrix avatar Jul 29 '25 11:07 aral-matrix

I didn't build, I just read the CMakeLists.txt changes.

Why don't you use find_package instead of

        target_link_libraries(OpenXLSX
                PRIVATE
                -lpugixml)

because you'll miss its include path with target_include_directories.

I checked all other changes, you already made them :)

bansan85 avatar Jul 29 '25 14:07 bansan85

Why don't you use find_package instead of

Short answer: Because I am really clueless in terms of cmake - the cmake configuration was initially created by @troldal and I have been "learning by doing" when trying to keep changes in source code / linking logic working in cmake.

Can you tell me how that line should look like in your opinion? I'll try it out then.

aral-matrix avatar Jul 29 '25 14:07 aral-matrix

I cloned the dev branch.

You should update the embedded Catch library. I couldn't build the tests due to https://github.com/catchorg/Catch2/issues/2421.

I also couldn’t build the tests (textSLCell.cpp) because of the use of a deleted copy constructor.

To use find_package, do the following (See https://github.com/microsoft/vcpkg/pull/46633/files for the full example):

diff --git a/OpenXLSX/CMakeLists.txt b/OpenXLSX/CMakeLists.txt
index 9e6b0e3..09ea155 100644
--- a/OpenXLSX/CMakeLists.txt
+++ b/OpenXLSX/CMakeLists.txt
@@ -58,7 +58,7 @@ if (NOT OPENXLSX_ENABLE_LIBZIP)
 endif ()

 if (OPENXLSX_ENABLE_LIBPUGIXML)
-    # nothing to do: pugixml headers should already be in include path, and libpugixml is added in the library link statements below
+    find_package(PugiXML REQUIRED)
 else()
     add_library(PugiXML INTERFACE IMPORTED)
     target_include_directories(PugiXML SYSTEM INTERFACE $<BUILD_INTERFACE:${CMAKE_CURRENT_LIST_DIR}/external/pugixml/>)
@@ -155,8 +155,8 @@ if ("${OPENXLSX_LIBRARY_TYPE}" STREQUAL "STATIC")
             $<BUILD_INTERFACE:${CMAKE_CURRENT_BINARY_DIR}>)     # For export header
     if (OPENXLSX_ENABLE_LIBPUGIXML)
         target_link_libraries(OpenXLSX
-                PRIVATE
-                -lpugixml)
+                PUBLIC
+                pugixml::pugixml)
     else()
         # pugixml is used in header only mode - so whatever this does...
         target_link_libraries(OpenXLSX
@@ -199,8 +199,8 @@ if ("${OPENXLSX_LIBRARY_TYPE}" STREQUAL "SHARED")
             $<BUILD_INTERFACE:${CMAKE_CURRENT_BINARY_DIR}>)     # For export header
     if (OPENXLSX_ENABLE_LIBPUGIXML)
         target_link_libraries(OpenXLSX
-                PRIVATE
-                -lpugixml)
+                PUBLIC
+                pugixml::pugixml)
     else()
         # pugixml is used in header only mode - so whatever this does...
         target_link_libraries(OpenXLSX

And also update OpenXLSXConfig.cmake (you need to add find_dependency(... CONFIG) in this file when your CMakeLists.txt will use find_package(... REQUIRED)):

include(CMakeFindDependencyMacro)
if(@OPENXLSX_ENABLE_LIBPUGIXML@)
    find_dependency(pugixml CONFIG)
endif()
if(WIN32)
    find_dependency(nowide CONFIG)
endif()

include("${CMAKE_CURRENT_LIST_DIR}/OpenXLSXTargets.cmake")

And update CMakeLists.txt:

 configure_file(OpenXLSXConfig.cmake
         "${CMAKE_CURRENT_BINARY_DIR}/OpenXLSX/OpenXLSXConfig.cmake"
        @ONLY
         )

install(
        FILES
        "${CMAKE_CURRENT_BINARY_DIR}/OpenXLSX/OpenXLSXConfig.cmake"
        "${CMAKE_CURRENT_BINARY_DIR}/OpenXLSX/OpenXLSXConfigVersion.cmake"
        DESTINATION ${ConfigPackageLocation}
)

You should add an option OPENXLSX_ENABLE_LIBNOWIDE and use find_package(nowide REQUIRED) / target_link_libraries(... PUBLIC nowide::nowide) and update OpenXLSXConfig.cmake.

When using target_link_libraries with a library target, always use PUBLIC for non-header-only dependencies. This is important for static linkage.

You may have an application that links to OpenXLSX, which needs pugixml. With static linkage, when linking the application against OpenXLSX, the linker will also need pugixml, which will be missing if you use a PRIVATE link.

In CMakeLists.txt, the variable dir is never set I think :

# Install interface headers
install(
        FILES ${OPENXLSX_CXX_INTERFACE_HEADERS}
        DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/OpenXLSX/${dir}
)

Let me know if everything is clear.

bansan85 avatar Jul 30 '25 08:07 bansan85

Let me know if everything is clear.

Everything is not clear right now - whoa, that's a lot of changes. I will have a look and see if I can make sense of them. I don't want to just copy lines you suggest without knowing what they do. Did I mention I hate cmake? It's really hot garbage in terms of usability & documentation.

You should update the embedded Catch library. I couldn't build the tests due to https://github.com/catchorg/Catch2/issues/2421.

To be honest, I haven't ever used that and the tests are really really outdated anyways, because in the past year, I added a lot of functionality to the library, and a lot has changed, and not a single test was updated or written newly. So there's not really any point in trying to compile the tests right now, if you ask me. I'll focus on the libpugixml recommendations first.

aral-matrix avatar Jul 30 '25 09:07 aral-matrix

No problem for the tests.

CMake is easy to start with, but takes many painful hours to master. It needs lots of workaround.

bansan85 avatar Jul 30 '25 09:07 bansan85

I am currently working on first getting nowide out of the way, because your proposed solution doesn't align with the philosophy that OpenXLSX comes shipped with the required dependencies, and when adding compatibility for system installed libraries, the shipped source code modules should remain. And the line

find_package(nowide CONFIG REQUIRED)

simply breaks my setup (when testing nowide), so I am going for a solution that ideally uses CPMFindPackage triggered by a TBD flag, while also being able to work directly with the boost nowide headers installed on the system.

aral-matrix avatar Jul 30 '25 10:07 aral-matrix

Hi again @bansan85, would you be so kind to test the latest commit 39bd1b0671ae90abf56fd8118835e89bdcd122f3 in terms of enabling an external dependency on boost nowide?

I added two new cmake options:

  • added to cmake and GNU make: option OPENXLSX_FORCE_NOWIDE - set to ON to use force the use of nowide even on non-Windows systems (this flag is for testing purposes)
  • added to cmake and GNU make: option OPENXLSX_ENABLE_LIBBOOST_NOWIDE - set to ON to use nowide from an installed libboost

I tried my best to integrate it with cmake as well as my own GNU Makefile. If this works as intended for your packaging attempts, then I will address the pugixml changes you ask for next.

aral-matrix avatar Jul 30 '25 12:07 aral-matrix

This looks good. You do:

if (OPENXLSX_ENABLE_NOWIDE)
    if (OPENXLSX_ENABLE_LIBBOOST_NOWIDE)
        find_package(Boost 1.74 REQUIRED COMPONENTS nowide) # on Linux at least, this requires libboost-nowide-dev to be installed
    else()
        add_library(NoWide INTERFACE IMPORTED)
        target_include_directories(NoWide SYSTEM INTERFACE $<BUILD_INTERFACE:${CMAKE_CURRENT_LIST_DIR}/external/nowide/>)
    endif()
endif()

But for the linker, you do:

    if (OPENXLSX_ENABLE_NOWIDE)
        if (NOT OPENXLSX_ENABLE_LIBBOOST_NOWIDE)
            # boost nowide is header-only, a "library" only makes sense in the context of the nowide header files shipped with OpenXLSX
            target_link_libraries(OpenXLSX
                    PRIVATE
                    $<BUILD_INTERFACE:NoWide>)
        endif()
    endif ()

The linker won't know what to do if OPENXLSX_ENABLE_NOWIDE and if OPENXLSX_ENABLE_LIBBOOST_NOWIDE. You need to do like in CMake examples. I suggest:

    if (OPENXLSX_ENABLE_NOWIDE)
        if (OPENXLSX_ENABLE_LIBBOOST_NOWIDE)
            target_link_libraries(OpenXLSX
                    PRIVATE
                    Boost::nowide)
        else()
            # boost nowide is header-only, a "library" only makes sense in the context of the nowide header files shipped with OpenXLSX
            target_link_libraries(OpenXLSX
                    PRIVATE
                    $<BUILD_INTERFACE:NoWide>)
        endif()
    endif ()

bansan85 avatar Jul 30 '25 14:07 bansan85

Done, thank you! I forgot that there are some versions of nowide that are not header-only. I had actually intended to leave the linker without instructions because I thought there was nothing to do.

I'll hopefully get around to implementing your pugixml suggestions for the CMakeFile in the next days.

aral-matrix avatar Jul 30 '25 14:07 aral-matrix

If nowide may not be header-only, you will probably need to use PUBLIC in target_link_libraries to allow static linkage.

bansan85 avatar Jul 30 '25 15:07 bansan85

If nowide may not be header-only, you will probably need to use PUBLIC in target_link_libraries to allow static linkage.

Oh yeah that was something I did not understand in your original suggestion: Is this really an issue for functions that are hidden from the (OpenXLSX) library API? I was under the impression that a statically linked OpenXLSX library would have the dependencies also linked into the library binary file, so that applications using OpenXLSX would not need the information about linkage of library dependencies?

Both nowide and pugixml (also the zip implementations IIRC) are hidden from the public library interface of OpenXLSX.

aral-matrix avatar Jul 30 '25 15:07 aral-matrix

Hi again! I commited 17aee0e496fb19fe30906693086d69ee7c8690ee which implements find_package for LibZip and PugiXML. However, I just wasted 3+ hours of my life trying to find out how the hell to tell cmake's find_package to ignore unneeded components of LibZip: Could you @bansan85 or anyone tell me how to tell find_package to shut up about missing components of LibZip that this library does not need at all? Relevant line is here: https://github.com/troldal/OpenXLSX/blob/development-aral/OpenXLSX/CMakeLists.txt#L60 - the QUIET option (and also OPTIONAL_COMPONENTS) turned out to be absolutely useless :(

aral-matrix avatar Jul 30 '25 22:07 aral-matrix

If nowide may not be header-only, you will probably need to use PUBLIC in target_link_libraries to allow static linkage.

Oh yeah that was something I did not understand in your original suggestion: Is this really an issue for functions that are hidden from the (OpenXLSX) library API? I was under the impression that a statically linked OpenXLSX library would have the dependencies also linked into the library binary file, so that applications using OpenXLSX would not need the information about linkage of library dependencies?

Both nowide and pugixml (also the zip implementations IIRC) are hidden from the public library interface of OpenXLSX.

I encountered the PRIVATE linkage issue a year ago. I had a library that used HDF5 to save data. The library only exported a saveToFile function in the header, but the source code used HDF5 internally. So I linked HDF5 privately. When I tried to statically link my main app with the library, the linker complained about missing HDF5 symbols.

Another people have this problem: https://discourse.cmake.org/t/static-library-dependencies/9068/3

Of course, you can keep it as PRIVATE. But just keep in mind that someday, someone might complain that they can't link OpenXLSX with their app using static linkage.

Look at the following diagram.

Image

bansan85 avatar Jul 31 '25 07:07 bansan85

  • libzip cmake files

Hi again! I commited 17aee0e which implements find_package for LibZip and PugiXML. However, I just wasted 3+ hours of my life trying to find out how the hell to tell cmake's find_package to ignore unneeded components of LibZip: Could you @bansan85 or anyone tell me how to tell find_package to shut up about missing components of LibZip that this library does not need at all? Relevant line is here: https://github.com/troldal/OpenXLSX/blob/development-aral/OpenXLSX/CMakeLists.txt#L60 - the QUIET option (and also OPTIONAL_COMPONENTS) turned out to be absolutely useless :(

When you can't find what's wrong, it's better to read CMake files from the library.

Read /usr/lib/x86_64-linux-gnu/cmake/libzip/libzip-config.cmake

There is an include libzip-targets.cmake

You can see that zipcmp / zipmerge / ziptool is hardcoded. So they are mandatory.

  • Custom find libzip

I noticed there is a file Examples/cmake/FindLIBZIP.cmake. This file uses pkgconfig instead of cmake to configure libzip.

In OpenXLSX/CMakeLists.txt, you can replace:

    find_package(LibZip 1.7
        CONFIG
        COMPONENTS zip
        QUIET OPTIONAL_COMPONENTS zipcmp zipmerge ziptool # requires the following tools to be installed: zipcmp zipmerge ziptool
    )

by

    include(${CMAKE_CURRENT_SOURCE_DIR}/../Examples/cmake/FindLIBZIP.cmake)

But it's always better to use CMake instead of pkgconfig. I suggest you to fill an issue to https://github.com/nih-at/libzip to make zipcmp / zipmerge / ziptool optional in CMake for find_package.

  • CI

I noticed you don't have a Github workflow. Would you like I write a basic one to check if OpenXLSX build ? You will be able to add more various config if you want.

bansan85 avatar Jul 31 '25 08:07 bansan85

Look at the following diagram.

Okay, I also found this issue: https://discourse.cmake.org/t/static-library-with-static-library-dependencies-propagates-its-dependencies-when-exported-as-package/7468

It is weird to me - and news, really - that a static linkage exports internal symbols of dependencies to be visible to an application using that library :/

For now, I'd keep the cmake PRIVATE setting, I would like to have a specific error one day to confirm & better understand the issue.

Read /usr/lib/x86_64-linux-gnu/cmake/libzip/libzip-config.cmake There is an include libzip-targets.cmake You can see that zipcmp / zipmerge / ziptool is hardcoded. So they are mandatory.

I did that yesterday, but my knowledge of cmake is not enough to understand the relevant line as "including them is mandatory". Or maybe, the cmake syntax is just that bad ;) I know that linking with -lzip works perfectly fine, without installing these other tools, so it seems the cmake config file has done some hot garbage there.

I suggest you to fill an issue to https://github.com/nih-at/libzip to make zipcmp / zipmerge / ziptool optional in CMake for find_package.

Yes, that was also my plan after this mess yesterday, should no one be able to tell me that I missed something obvious. I will do that in the next days.

I noticed you don't have a Github workflow. Would you like I write a basic one to check if OpenXLSX build ? You will be able to add more various config if you want.

Thank you for the offer, but I'll politely decline (hell no! :). The only reason I am still active on github is to support OpenXLSX, but I despise everything coming out of the silicon valley or Redmond with a passion, and I do not wish to increase my links with github.

aral-matrix avatar Jul 31 '25 09:07 aral-matrix

Read /usr/lib/x86_64-linux-gnu/cmake/libzip/libzip-config.cmake There is an include libzip-targets.cmake You can see that zipcmp / zipmerge / ziptool is hardcoded. So they are mandatory.

I did that yesterday, but my knowledge of cmake is not enough to understand the relevant line as "including them is mandatory". Or maybe, the cmake syntax is just that bad ;) I know that linking with -lzip works perfectly fine, without installing these other tools, so it seems the cmake config file has done some hot garbage there.

Targets are defined (hardcoded) in libzip-targets-none.cmake file.

And this file is imported in libzip-config.cmake with lines:

file(GLOB _cmake_config_files "${CMAKE_CURRENT_LIST_DIR}/libzip-targets-*.cmake")
foreach(_cmake_config_file IN LISTS _cmake_config_files)
  include("${_cmake_config_file}")
endforeach()

I added some message for foreach(_cmake_target IN LISTS _cmake_import_check_targets) and foreach(_cmake_file IN LISTS "_cmake_import_check_files_for_${_cmake_target}"). There is an associated file for each target:

libzip::zip : /usr/lib/x86_64-linux-gnu/libzip.so.4.0 libzip::zipcmp : /usr/bin/zipcmp libzip::zipmerge : /usr/bin/zipmerge libzip::ziptool : /usr/bin/ziptool

I hope you won't have more problems with CMake.

bansan85 avatar Jul 31 '25 10:07 bansan85

I hope you won't have more problems with CMake.

It appears that the latest release of libzip took care of the "independent components" thing. So starting with libzip version 1.11.4 or higher, find_package shall work without installing the unneeded components.

aral-matrix avatar Jul 31 '25 18:07 aral-matrix