VulkanSceneGraph icon indicating copy to clipboard operation
VulkanSceneGraph copied to clipboard

Inability to compile vsg with gslang support in build environments without network access

Open psi29a opened this issue 7 months ago • 19 comments

Describe the bug glslang support in vsg seems tied to a specific fork of glslang (upstream + patches). This is problematic for several reasons:

  • packaging vsg downstream with glslang support is impossible as build environments have no network access
  • downstream's glslang do not have those patches
  • projects that require glslang support can't also be packaged due to lack of glslang support
  • vsg's cmake requires git to pull in the fork then hooks into it via cmake in an odd way

Expected behavior

vsg should be making use of the system glslang, and if not available but still required, then the project should make use of cmake's FetchContent: https://cmake.org/cmake/help/latest/module/FetchContent.html FetchContent is more secure as it also requires a hash of the release or commit version you wish to download to limit supply-chain attacks.

Additional context How necessary are the patches to glslang and why are they needed? It is possible to use FetchContent to apply patches so there is no need to fork.

Being able to package vsg with glslang support Debian and Ubuntu is currently blocked.

Building projects, like vsgopenmw, using flatpak are also blocked as they also do not have network access while in the build environment.

psi29a avatar Nov 26 '23 10:11 psi29a

The integrated of glslang with patches in the way I've done it is essentially the best of bad bunch of possible solutions. FetchContent has it's own problems, depending on 3rd party sources for glslang also introduced a range of problems due the inconsistent way glslang version has been done historically.

The patches that I applied to glslang were all warning fixes as gslang itself is built with such lax warnings when built as part of the VSG it generates hundreds of warnings making it impossible to spot any within the VSG itself. My plan has to be upstream these warnings fixes but glslang process for submissions meant I couldn't quickly jump through the hoops to get a submission - I have remained so busy since I was forced to make these fixes that I haven't had a chance to follow up.

I don't know if glslang version available by package managers has improved since I was forced to move the build of glsalng into the core VSG itself, so there is chance things have improved, but there is good chance that it's still a complete dogs dinner and trying to rely upon it would cause far more "BUGS" that it fixes.

In my ideal world an alternative to glslang would magically appear that doesn't have all the problems that it introduces. Happy to take suggestions, but all those chip in suggestions need to be aware that we've tried lots of different approach and there are failures that happen on different build combinations, what works for one set of users breaks for another. We have to find a solution that works for more users without over-burdening myself and others with an endless tasks of wake-a-mole of problems that one dependency introduces.

robertosfield avatar Nov 26 '23 19:11 robertosfield

Could we instead use FetchContent, apply your patches (so you can still maintain them as needed and target a specific release of gslang), and then build? This seems to me more manageable solution then maintain a fork? :)

psi29a avatar Nov 27 '23 14:11 psi29a

All the things people have told me are bad about FetchContent before (many of which are optional, e.g. if you're concerned about the network access, there's a command-line argument to tell CMake where you've already fetched it to) are the same or worse with the approach taken. I also suspect some/all of the patches could be dropped if ExternalProject was used instead of FetchContent as it does everything within its own scope, so it's more similar to dealing with a precompiled system library.

AnyOldName3 avatar Nov 27 '23 14:11 AnyOldName3

I don't recall off the top of my head why FetchContent failed, I tried it and it had it's own suit of problems that forced me to opt for doing a "roll your own" equivalent of FetchContent to fix the problems.

The solution I've ended up was the result of trying all the standard approaches to working with glslang, every single approach broke for different sets of users, I could never get it working for all users.

A big part of why this is a mess is glslang itself. glslang is evolving so there is chance that they've begun moving towards support use as library better and fixing the CMake side so it integrates with repo's and building in different ways that users need. I don't have the time right now to dive into the glslang quagmire as I know it could well suck days of work and only end up fixing build for some users but breaking it for others.

I say this because I've gone round and round over this way too many times since the inception of the VSG 5 years ago. It's been a horrible experience needing glslang to compile shaders to SPIRV.

As a general note, I consider the VSG developing too fast to be distributed as a stable part of 3rd party repositories. There will come a point in the future that the VSG feature set is mature enough that a high % of users can make do with the older versions available in 3rd party repositories, but we aren't there yet.

For now I'd recommend that projects build the VSG themselves, picking either the latest release or master.. We've put effort into make the VSG play nicely with cmake so you can use it as a submodule, via FetchContent or ExternalProject, this route insulates projects from being dependent on how quickly 3rd party repositories start support software.

robertosfield avatar Nov 27 '23 14:11 robertosfield

Would you mind if I took it for a spin? I've already invested a bit of time trying to get things working, with my own fork of your fork. :)

As I said, providing a generic linux build, like flatpak, prevents network use while building. So even projects who use VSG are running into problems, which was why I raised the issue in the first place. :)

psi29a avatar Nov 27 '23 14:11 psi29a

You are welcome to play.

Getting things to work on one system may well produce different results than once you start trying all the different combinations that end users have - this is where I came a cropper many times - trying something that worked well on all the systems I had to had, but then checked it then different users report problems, then you iterate.

So... I am concerned that you could spend time reinventing the same problems that I've come across, but just not knowing problems exist until it gets widely tested. I don't have time to help out with going around this merry-go-round at this point.

If the glslang team have changed things for the better, then that's really the best time to try things out afresh. So rather fork and fork, I'd see if there is value in going back to glslang main branch to see if they've improved their cmake config support and versioning. Ideally I'd like to have glslang solve all the problems associated with using it so we can simplify our end.

robertosfield avatar Nov 27 '23 15:11 robertosfield

Looking that glslang github repo: https://github.com/KhronosGroup/glslang

First item on the NEWS list is:

  1. C++17 (all platforms) and Visual Studio 2019 (Windows) are now required. This change was driven by the external dependency on SPIRV-Tools.

To compile without pulling in SPIRV-Tools I had to add -DENABLE_OPT=0 on the cmake line, so this worked:

~/3rdParty/glslang/build$ cmake .. -DCMAKE_INSTALL_PREFIX=~/ExperimentalInstall -DENABLE_OPT=0

It then builds and installs OK. I haven't tried against the VSG, but such changes investigating what has changed. These changes will take a while before they feed through to the main 3rd repos, but it looks like the VulkanSDK will be tracking these changes more rapidly.

Perhaps we could set a requirement this modern glslang version, but this would mean that 3rd repos will be the sticking point, but they are anyway as they have older and problematic versions of glslang.

robertosfield avatar Nov 27 '23 18:11 robertosfield

This morning I got a notifaction from a thread on glslang Issue I posted to about the broken cmake config support:

The Issue: https://github.com/KhronosGroup/glslang/issues/2570#issuecomment-1830897326

The PR that hey suggest fixes it: https://github.com/KhronosGroup/glslang/pull/2989

However, the date on the PR is August 2022 so pre-dates when I had to abandon pulling in glslang via find_package. However, the PR might not made it upstream at the point I was doing the last round of attempts to fix the glslang dependency.

It may be worth trying find_package once more and specifying a recent glslang version, We'd need to figure out what is the minimum.

robertosfield avatar Nov 29 '23 08:11 robertosfield

I have tested find_package on macos and linux without issue.

Then perhaps this is enough?

If not found, should we fall back to your git solution for the time being and leave the fetch_content for another PR? That would at least get a few issues unstuck.

psi29a avatar Dec 04 '23 12:12 psi29a

The challenge is that historically the some systems may or may not have find_package, the cmake config files not installed, or they would be installed and broken - this means just testing a couple of platform combinations doesn't mean it's works reliably everywhere.

Do you know what the sitation with the VulkanSDK on Windows now? The VulaknSDK on non Windows systems was installed the cmake config support for glslang but the Windows version of the SDK wasn't install any cmake config support. Perhaps the Windows VulkanSDK has now been fixed, if so we'll need to find a way of checking the VulkanSDK version.

This is just a small snapshot of the issues I've struggled with glsang since the inception of the VSG project.

robertosfield avatar Dec 04 '23 12:12 robertosfield

I can give it a spin on Windows if someone sends me a link to a branch.

AnyOldName3 avatar Dec 04 '23 15:12 AnyOldName3

Without needing to try anything, though, I can see that the Vulkan SDK for Windows does include glslang libraries and headers, but no CMake config to go with them (or for anything else it includes by default, either). Looking at their issue tracker, it's never even been reported as a problem, so it's not a surprise that they've not done anything to fix it. I've opened https://vulkan.lunarg.com/issue/view/656df8aa5df1125b58afb491 to remedy that.

AnyOldName3 avatar Dec 04 '23 16:12 AnyOldName3

I got an email last night saying they'd resolved that, and the next version of the Windows Vulkan SDK will include the CMake config files. That means simply using find_package(glslang CONFIG) will work on all platforms when that releases, provided we're okay either bumping the minimum required Vulkan SDK version on Windows, or telling Windows developers they'll need to build glslang themselves and add it to their CMake prefix path if they want to use an older version.

AnyOldName3 avatar Jan 04 '24 18:01 AnyOldName3

Thanks for following that up Chris. Once the next VulkanSDK is out we can create a branch of the VSG and start experimenting with pulling in glslang from external sources again.

There may be dragons lying quietly to be awoken once we start testing this out across all platforms but hopefully this time around glslang will be more stable base to build against and we can finally get rid of the internal glslang for the next stable release which will be v1.2.0. I don't have a time frame for this release as it'll depend upon how glslang as well as many other items converge.

robertosfield avatar Jan 04 '24 18:01 robertosfield

Disappointingly, I've discovered that find_pacakge(glslang CONFIG) won't work yet in the MinGW UCRT64 environment provided by MSYS2 (and therefore probably any other MSYS2-provided environment) as the glslang version predates this MR https://github.com/KhronosGroup/glslang/pull/3420, which is needed to use the glslang::SPIRV target that provides <SPIRV/GlslangToSpv.h>. I suspect the same problem exists on Arch Linux, as that uses the same version, and if Arch's version of something is too old, it's usually a sign that every other distro's going to have the same problem.

This is a particular problem for MSYS2-provided MinGW UCRT64 as the version of glslang that the VSG currently fetches predates https://github.com/KhronosGroup/glslang/pull/3144, so that doesn't build, either.

I think that everything would be fine if we did find_package(glslang 14 CONFIG), and then, if after that glslang_FOUND was still unset, fell back to fetching a recent upstream glslang commit (whether via FetchContent or something resembling the existing approach). I don't think the fork's necessary anymore. It at least worked when I set it up that way under MSYS2's MinGW UCRT64 environment.

AnyOldName3 avatar Jan 19 '24 20:01 AnyOldName3

Perhaps it would be worth creating a spreadsheet/table with OS/build tool combinations and the glslang version that comes with them, and if it's possible to find out when it'll be upgrade to the required glslang version if it isn't already modern often.

Perhaps requiring users use a modern VulkanSDK for OS/build tool combination when their OS/build tool doesn't out of the box have a modern enough version,

robertosfield avatar Jan 19 '24 21:01 robertosfield

@psi29a & @AnyOldName3

I had just occurred to me that another approach for glslang support would be to dynamic load glslang.so at runtime when it's required and then import the symbols we use. This would avoid the need for compiling against glslang. I am also wondering if we could do this for Xcb and Wayland support as well.

robertosfield avatar Feb 25 '24 20:02 robertosfield

You might have seen that I've submitted a slew of PRs to glslang, and as well as that, I've got a branch locally where glslang is grabbed via find_package that should work everywhere that has glslang 14 available. I did this because I was trying a build via MSYS2, which has GCC 13.x, and the glslang version the VSG uses at the moment can't be built with that version. Unfortunately, glslang 14 isn't available everywhere yet (in fact, it's really just Windows where you can expect it - the first Vulkan SDK build that included the glslang CMake config had glslang 14, and MSYS2 provides glslang 14, but on the Unix end, not even Arch is on glslang 14 yet). To that end, I made a CMake option to specify an older version (most of the issues were fixed by glslang 13, so on lots of platforms, that should be fine), but I also tried making it so that we could use a nested build of upstream glslang as a fallback. Obviously, glslang 14 had already released by the time I started this, so it wouldn't be exactly the same version as find_package was looking for, but it should be basically the same as it's only a few weeks worth of difference.

The problems I hit with upstream glslang were:

  • Some inconsistencies with how it needs to be used via find_package and as a nested build, like different relative include paths. I haven't submitted a fix for this yet, but it's something the maintainers have discussed as needing doing at some point a bunch of times, so there's no reason not to think something would be accepted. I think I've submitted PRs for every other inconsistency.

  • Someone disabling the installation of glslang when it's a nested build. This is bad because if it's a static library, then if the VSG's a static library too, whoever consumes the VSG needs to have its dependencies to link against, too. This is bad if it's a shared library as you need the shared libraries at runtime. This got fixed by someone else already, but was a problem when I started investigating.

  • All the warnings that are set off when it's built using the VSG's compiler flags. The glslang maintainers enabled some of the warnings, but they don't like -Wshadow as they don't like prefixing member variables or function parameters, so have lots of shadowing. Apparently they're not weird for having this opinion - even Bjarne Stroustrup thinks this is a silly warning, and the C++ Core Guidelines don't have any recommendations for prefixing member variables with m or _.

    As they point out, though, this difference of opinion shouldn't be a problem as the VSG shouldn't be propagating its warning settings to its dependencies. This is happening at the moment because they're set via add_compile_options, which affects everything in the current directory and all subdirectories, and is best reserved for things that affect linking that aren't dealt with elsewhere, which isn't much in well-written modern CMake. It would be generally considered better practice to set warning flags as private target compile options. Do that, and the problems with glslang triggering warnings instantly go away. This isn't something that we can do immediately as a one-line change as vsg_setup_build_vars is called before the targets it needs to affect actually exist, and some of the other things it does need to happen before those targets are declared. One option would be to change those to also operate on a target rather than a directory property or directory-scoped variable, then the whole macro call could just be moved later and take a target name as an argument. The only disadvantage would be that for projects like vsgExamples, the macro would need calling for each target whereas the existing approach sets things globally. The only other option would be to set the warnings as a public target property so they'd be propagated to every project that consumed the VSG, but that would force projects that don't currently use -Wshadow to start, and people might be unhappy with that.

AnyOldName3 avatar Feb 26 '24 15:02 AnyOldName3