make the main zig executable no longer depend on LLVM, LLD, and Clang libraries

Open andrewrk opened this issue 1 year ago • 202 comments

This issue is to fully eliminate LLVM, Clang, and LLD libraries from the Zig project. The remaining ties to these projects are as follows:

  • [ ] #8726
    • [x] #8727
    • [ ] #17749
    • [ ] #17751
    • [ ] #17750
  • [ ] LLVM
    • [x] #13265
    • [x] C backend - 1742/1792 tests passing (97%) (@jacobly0 et al.)
    • [ ] x86 backend - 1721/1792 tests passing (96%) (@jacobly0, @kubkon, et al.)
    • [ ] wasm backend - 1611/1765 tests passing (91%) (@Luukdegram)
    • [ ] #21172
    • [ ] ~optimization passes~ not actually a prerequisite for this proposal. The clang package mentioned below will be able to provide LLVM optimizations as well.
    • [ ] #17807
  • [ ] Clang
    • [ ] ~#16268~ not actually a prerequisite for this proposal
    • [ ] ~#16269~ not actually a prerequisite for this proposal
    • [x] C++ source files in zig's repository are built by clang when bootstrapping
      • [x] #15657
    • [ ] #21169
    • [ ] ~Compiling C++, Objective-C, Objective-C++, etc. files.~ Solve with a clang package that can be depended on via the zig build system.
    • [ ] #20875
    • [x] #17752
    • [x] #17753
    • [ ] #20630
  • [ ] #9828
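
As a quick sanity check of the percentages in the checklist above, a minimal Python sketch can recompute them from the pass counts quoted in the list. The dictionary and the `pass_rate` helper are names made up for this illustration, not part of any Zig tooling.

```python
# Pass counts copied from the checklist above: (passing, total).
BACKENDS = {
    "C backend": (1742, 1792),
    "x86 backend": (1721, 1792),
    "wasm backend": (1611, 1765),
}


def pass_rate(passing, total):
    """Return the pass rate as a whole-number percentage."""
    return round(100 * passing / total)


if __name__ == "__main__":
    for name, (passing, total) in BACKENDS.items():
        remaining = total - passing
        print(f"{name}: {pass_rate(passing, total)}% passing, {remaining} tests to go")
```

The rounded values agree with the 97%, 96%, and 91% figures listed for the three backends.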

Note that there would still be an LLVM backend for outputting .bc files (#13265), but the Zig compiler would lack the capability to compile .bc files into object files. LLVM or Clang would need to be installed and invoked separately for that use case.
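
The two-step workflow described above could look roughly like the following sketch. It only builds the command lines and does not run them. It assumes that the `-femit-llvm-bc` and `-fno-emit-bin` flags of today's `zig build-obj` stay available and that a separately installed `clang` accepts the `.bc` file as input. The helper name and file names are hypothetical.

```python
import shlex
from typing import List


def bitcode_pipeline(source: str, stem: str) -> List[List[str]]:
    """Build the two commands: Zig emits bitcode, clang turns it into an object file."""
    bitcode = f"{stem}.bc"
    obj = f"{stem}.o"
    # Step 1: ask Zig for LLVM bitcode only, skipping machine code output.
    emit_bc = ["zig", "build-obj", source, f"-femit-llvm-bc={bitcode}", "-fno-emit-bin"]
    # Step 2: a separately installed clang compiles the bitcode to an object file.
    compile_bc = ["clang", "-c", bitcode, "-o", obj]
    return [emit_bc, compile_bc]


if __name__ == "__main__":
    for argv in bitcode_pipeline("main.zig", "main"):
        print(shlex.join(argv))
```

Passing each list to `subprocess.run(argv, check=True)` would execute the steps, provided both tools are on the `PATH`.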

In exchange, Zig gains these benefits:

  • All our bugs are belong to us.
  • The compiler becomes trivial to build from source and to bootstrap with only a C compiler on the host system.
  • We stop dealing with annoying problems introduced by Linux distributions and package managers such as Homebrew related to LLVM, Clang, and LLD. There have been and continue to be many.
  • The Zig compiler binary goes from about 150 MiB to 5 MiB.
  • Compilation speed is increased by orders of magnitude.
  • We can implement our own optimization passes that push the state of the art of computing forward.
  • We can attract research projects such as alive2.
  • We can attract direct contributions from Intel, ARM, RISC-V chip manufacturers, etc., who have a vested interest in making our machine code better on their CPUs.

Please read my other comments in this issue before replying:

  • https://github.com/ziglang/zig/issues/16270#issuecomment-1615388680
  • https://github.com/ziglang/zig/issues/16270#issuecomment-1616115039

andrewrk avatar Jun 29 '23 22:06 andrewrk

I see the 1.0.0 milestone for this. Is it meant to be a blocker for 1.0, or simply the goal?

xdBronch avatar Jun 29 '23 23:06 xdBronch

The milestone, on an issue labeled "proposal", means that a decision must be made to accept or reject that proposal before tagging the release corresponding to that milestone.

For an issue labeled "accepted", the milestone means that it must be implemented by then. So, if this proposal is accepted, then I will evaluate at that time which milestone to move it to.

andrewrk avatar Jun 29 '23 23:06 andrewrk

In the near term, the machine code generated by Zig will become less competitive. Long-term, it may catch up or even surpass LLVM and GCC.

IMO, this is the biggest question. One of the most compelling reasons to use Zig is runtime performance of software written in Zig. Without LLVM's optimization passes, what will that look like?

Jarred-Sumner avatar Jun 29 '23 23:06 Jarred-Sumner

Long-term, it may catch up or even surpass LLVM and GCC.

  • We can implement our own optimization passes that push the state of the art of computing forward.
  • We can attract research projects such as alive2.
  • We can attract direct contributions from Intel, ARM, RISC-V chip manufacturers, etc., who have a vested interest in making our machine code better on their CPUs.

Zig will continue to implement optimization passes of its own over time and get faster.

nektro avatar Jun 29 '23 23:06 nektro

So here are the projects that depend on the ability to compile C++ that I am currently developing:

  • https://github.com/Cold-Bytes-Games/wwise-zig: I am using the C++ ability to compile glue code to create a C binding. I plan to use this for audio in my game BioMech Catalyst written in Zig.
  • https://github.com/Cold-Bytes-Games/wwise-zig-demo: Using the previously mentioned library plus zgui, which provides a nice Zig binding on top of Dear ImGui.
  • Still planning to use zgui for my game editor in the future.

And also some NDA game platforms that have C++-only APIs, which will require some C++-to-C glue code to be compiled, but obviously not implemented.

To me, the ability to seamlessly build any C, C++, and Obj-C code is a big selling point of the Zig toolchain, even if it is behind an optional flag to enable LLVM and Clang when compiling Zig. Part of the hype momentum around Zig is due to that fact.

If this happens, I think I will remove my donation to the Zig Software Foundation.

TL;DR: Lots of libraries in the game development world (closed or open source) require the ability to compile C++.

mlarouche avatar Jun 30 '23 00:06 mlarouche

I think this proposal would hurt the Zig ecosystem more than it would help it, due to several reasons:

  • Adoption of compilers and languages in the embedded world is basically driven by code generation quality in terms of size. We have to be on par with or better than GCC and LLVM to even be considered for adoption. 2%-5% bigger code is often not just an annoyance, but a technical problem.
  • A lot of projects use C++ code and might benefit from build.zig "as is". This could lead to adoption of Zig, with the build system as a kick-start. Removing support for C++ will hurt adoption a lot, because building C++ projects with some additional Zig code would then be even worse than just adding more CMake. Thus, a lot of projects wouldn't benefit from Zig in terms of quality of life, as it would be yet another build tool.
  • With its "batteries included" cross-compilation, Zig is the best toolchain for doing native development. We would basically downgrade our toolchain to something like Go, where we rely on external tools to successfully build more complex projects. Even in the current state, it's often so much easier to yeet zig at something to get it to build than to even try to get a cross-build running with existing tools.
  • Considering the scenario I have at work, introducing Zig as a build environment for several million lines of C++ would make a lot of people happy, as the builds would be faster, easier to maintain, and trivially portable to other OSes. With C++ and LLVM support removed, I don't see any chance of Zig adoption at this company.

Imho, this proposal strongly violates the "Together we serve end users" idea, as the current direction we're heading in is a really good unified native build environment based on a single static executable that can serve projects of arbitrary size, shipping compilers for several major languages, huge-ass support for targets, and a build system, making work in systems programming fun, even if one doesn't use Zig as a language.

We are well on the way to replacing a huge list of tools with a single executable, making contributions to projects built on Zig fun, easy, and platform-independent.

When this proposal is accepted, in addition to Zig one will need to have the following tools installed:

  • (GNU) Make
  • premake
  • meson
  • CMake
  • ninja
  • $(arch)-$(os)-$(abi)-gcc
  • llvm/clang
  • GNU autotools
  • m4
  • vcpkg
  • gradle
  • conan
  • qmake
  • SCons
  • maven
  • ...

We can potentially replace all of those tools with a single, equally powerful executable, making life easier for all native devs out there.

Personal projects that would be affected:

  • https://github.com/MasterQ32/cg-workbench (fully written in C++)
  • https://github.com/MasterQ32/zero-graphics (Scintilla is written in C++)
  • https://github.com/MasterQ32/zig-assimp (Assimp is the de-facto standard solution for generic 3D model loading, written in C++)

ikskuh avatar Jun 30 '23 00:06 ikskuh

I believe I'm still a beginner, but if it is at all worth giving my point of view, Zig's ability to replace all of the build tools mentioned above is a big reason I was interested in Zig in the first place. I struggled a lot with all the different build systems, and Zig is a real breath of fresh air.

I have two personal projects that use zgui heavily and I had planned on continuing.

foxnne avatar Jun 30 '23 00:06 foxnne

With Mach engine we have two dependencies that would be very, very painful to remove (would set us back years):

  1. DirectXShaderCompiler (DXC) - a Microsoft fork of LLVM 3.7, which generates DXIL (LLVM IR), which is what Direct3D graphics drivers consume. Microsoft is trying to upstream support for emitting DXIL (specifically LLVM v3.7 IR) to the latest LLVM/clang version.
  2. Calling into Apple's Metal shader compiler - which converts Metal's text shading language into (yet another fork) LLVM IR bytecode, which is what the Metal API consumes - only this one is proprietary and undocumented. Obj-C is the only way to invoke it, I believe.

It will be a long time before Zig's SPIR-V backend is capable enough to generate non-GPGPU shaders for graphics APIs (if ever, since it would likely require major language changes), so I don't see a way for us to escape these aside from replicating on our side what these two projects (both LLVM forks) do.

Every other C++ dependency I believe we could safely escape from.

emidoots avatar Jun 30 '23 00:06 emidoots

A positive long-term effect of this change is that it would push us as a community away from wrapping C++ code and towards more pure-Zig solutions.

Many of the comments in this thread are about people using zgui, wwise, zig-gamedev, assimp, and other C++ libraries wrapped with Zig. It gives you a leg up in the short term, but I worry that in the long term people's gamedev experience coming to Zig will be: "initially I saw a nice language... then I discovered that the guts of the libraries I was told to use were large, clunky C++ codebases."

emidoots avatar Jun 30 '23 00:06 emidoots

Also a beginner, but I think being able to use Zig as a C++ build system (which is what I use for all my private C++ projects) is an invaluable feature to me and, I believe, to many other people. The simplicity of Zig, having a compiler and a build system contained within a single executable (with the added bonus of being easily cross-platform), is really cool, and it would be kinda unfortunate to see that feature removed as an effect of removing clang and friends. However, if this does go through, the ability to fall back to generating .bc files and invoking clang is nice.

GeffDev avatar Jun 30 '23 00:06 GeffDev

One of my friends pointed out that nothing stops you from invoking clang in a build.zig to compile C++ dependencies, even if Zig stops including clang. I wonder how much of a problem that would really be for these projects?
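
A rough sketch of that idea, written in Python here and not as an actual build.zig step: the build script would assemble a command line for a separately installed `clang++` and run it as an external command. The helper name, file names, and target triple are hypothetical; `--target=` is Clang's standard cross-compilation flag.

```python
import shlex
from typing import List, Optional


def clang_cxx_command(source: str, output: str, target: Optional[str] = None) -> List[str]:
    """Build the argv for compiling one C++ file with an external clang++."""
    argv = ["clang++", "-c", source, "-o", output]
    if target is not None:
        # Cross-compile for the given target triple instead of the host.
        argv.append(f"--target={target}")
    return argv


if __name__ == "__main__":
    print(shlex.join(clang_cxx_command("glue.cpp", "glue.o")))
    print(shlex.join(clang_cxx_command("glue.cpp", "glue.o", target="aarch64-linux-gnu")))
```

An external clang invoked this way still needs headers and libraries for the chosen target to be available on the host.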

ghost avatar Jun 30 '23 00:06 ghost

Doing that would add a dependency without the ease of Zig's cross-compilation.

xdBronch avatar Jun 30 '23 00:06 xdBronch

Hmm, the more I think about this proposal the less I like it. I feel it would be better to make the LLVM backend non-default (that is, switch -fno-llvm for -fllvm or something) but not remove it. It seems like this would solve most of the issues: the existence of many bugs that are LLVM's fault, and the slow compile speed, aren't strong reasons to remove the LLVM backend so much as to make it non-default. It wouldn't solve the issues of the binary size of the zig compiler or difficulties with building it, of course.

ghost avatar Jun 30 '23 01:06 ghost

I think I can speak for all gamedevs by saying that removing C++ compilation would be a disaster. So many amazing existing gamedev libraries and tools are built on C++ that disallowing easy use would strangle adoption in that field, as well as complicate existing projects. Dear ImGui is the obvious example, but it's not the only one.

musi-musi avatar Jun 30 '23 01:06 musi-musi

It's easy to imagine that in the long term, this change will push us towards "rewrite it in Zig" with all the benefits that would entail, but the downside is that the existing corpus becomes inaccessible, limiting our options heavily even once Zig's ecosystem matures.

musi-musi avatar Jun 30 '23 01:06 musi-musi

I think LLVM is needed until the ecosystem of pure-Zig libraries is very, very mature and rich.

Yeah, we want faster compile speed and a smaller tarball, but not at the risk of losing one-zig-to-rule-all.

Jack-Ji avatar Jun 30 '23 01:06 Jack-Ji

I love the ambition of this proposal, but to reiterate what has already been stated, losing C++ compilation would mean losing one of the main selling points of Zig. I was drawn to Zig in part because it reduces the hellishness of depending on C/C++ projects. Zig having a C++ compiler inside it also has the benefit of there being less C++ to deal with, less Python to deal with, and less CMake to deal with.

On the other hand, the core of this proposal has too many benefits for it to be rejected entirely. I think reducing the scope would be beneficial. How about:

  • Eliminate dependencies on LLVM & LLD
  • Keep clang as an optional dependency for easy cross-compilation

ethernetsellout avatar Jun 30 '23 01:06 ethernetsellout

I totally get the desire to get rid of huge third-party dependencies that bring a lot of baggage. There's also something to be said for avoiding an LLVM monoculture in the programming language space.

Even so, I see this as a net negative for users. What drew me to Zig in the first place was the pragmatic approach of acknowledging that there is a world outside of Zig that needs to be interoperated with for the foreseeable future and even providing a best-in-class cross-compilation experience along with that. I built my project integrating Zig build support with the .NET/MSBuild ecosystem on that selling point.

On the whole, I think this proposal would be an unfortunate (if well-intentioned) bait-and-switch, considering the Zig website for a while has advertised this:

image

In addition, these blog posts drove a lot of attention to Zig in the past:

Just to be super clear: I don't mean to insinuate bad faith or anything of the sort here. But I think it's fair to say that you have to contend with the fact that this proposal would pull features that are not only usable today, but are also prominently advertised.


All that said... assuming this is even remotely practical, maybe there's a potential middle ground: Would it be possible for Zig to continue to use the Clang frontend to provide C-family support, but rip out the LLVM IR lowering and replace it with lowering to ZIR/AIR? (I guess this is more or less how Aro would be integrated too?)

If this could be done, the codegen dependency on LLVM could be killed, achieving at least some of the goals of this proposal. There's probably also no reason to keep LLD support around as long as zld can catch up, so that eventually goes too. And users remain happy. Some of the build and distro woes would remain, of course, but, that's compromise.

alexrp avatar Jun 30 '23 01:06 alexrp

I’m fully in favor of making it possible to use Zig without any LLVM components, but I agree with many of the comments here that it’s important for Zig to maintain the capabilities that it currently has in terms of cross-compilation, compiling C/C++ code seamlessly, and generating maximally performant binaries.

These factors are big drivers of Zig’s adoption, and I fear that damaging them (even temporarily as in the case of code generation quality) would seriously hurt Zig’s future.

Personally, I work in the robotics space, where C++ is the dominant language for many libraries and frameworks. I think Zig has a lot of potential in this space, but being able to integrate with existing libraries is absolutely essential for adoption.

AdamGoertz avatar Jun 30 '23 02:06 AdamGoertz

(Please ignore my original comment. My use case is unchanged and I temporarily conflated two components.)

Is there a story in this proposal for JIT compilation? I have a Zig project which currently relies on LLVM's Orc to JIT-compile audio DSP functions. I'm not particularly stoked about or attached to Orc itself, but it does give me in-process compilation with low-milliseconds latency, something I'd need for dynamic real-time audio applications.

WebAssembly would not be an issue here because I wrote the Wasm compiler myself, but for x86_64 and friends I would need a replacement. Passing LLVM bitcode to a separate process might work but would feel like a downgrade. Vendoring and embedding the Zig compiler source might be the best option in that case.

(Aside: I have started work on reading and writing LLVM bitcode from Zig, and if this is accepted would be happy to resume work on that.)

hryx avatar Jun 30 '23 02:06 hryx

I'll start by stating my opinion: the C language frontends are super important to me for Zig to maintain.

I agree with others on the point that Zig being a C/C++ compiler is a big point of attraction that brought me to the language. I started with loving that idea, and ultimately fell in love with the language, and now I use both. I don't have anything more to add to that that the others above haven't.

I'll add my own personal experience. I have many personal Zig projects, but my biggest one that people tend to know about in the community is my terminal emulator. There are two important dependencies that would be impacted by this:

  • Harfbuzz - The far and away single most complete cross platform text shaping engine. Text shaping for those that don't know is the process of laying out text, processing things like multi-codepoint emoji into single glyphs, Asian language handling, etc. This is not something you want to really maintain in your own language (i.e. write natively in Zig) because Harfbuzz is so good and so well supported. This is shipped as a single large C++ file with no other dependencies. Without access to Harfbuzz, gaming and manual-GUI applications (not using a mega framework) would surely suffer.

  • Objective-C for Mac work. Admittedly, this is probably deprecated over the long term since Apple is pushing very hard into Swift. Still, I use Zig's ability to compile Objective-C files to augment my native Zig Mac applications. I've personally written Zig Objective-C bindings using the C API, but I also still like to just write some ObjC sometimes, which would otherwise take a big mess of Zig/ObjC-runtime code. Again, I think I could find a way around this one, but I think it's worth noting for now that ObjC is still a part of [low-level] macOS GUI development.

Andrew, your dislike of C++ is well known! I don't love it either (to put it kindly). If, as an audacious goal, you wanted Zig to lower C++ usage, I think Zig being able to compile existing projects and enable iterative migration away from C++ to Zig is the way to do it. I think if Zig doesn't support C++, the C++ "people" would just avoid Zig altogether.

mitchellh avatar Jun 30 '23 03:06 mitchellh

The main gripe seems to be the need/desire to be able to compile C++ due to the mountain of pre-existing libraries that are already being used in the Zig ecosystem. For this proposal to satisfy people and not break a huge selling point for Zig, a C++ (and ObjC?) compiler in Zig would have to be upstreamed.

Dunno what madman will be the one to write that tho.

At the very least, any kind of serious effort towards removing LLVM should be done after reaching 1.0.

That said, I don't think divorcing from LLVM itself is a bad idea, if the Zig toolchain itself can slowly grow to replace its functionality, just so long as it sticks around until opting out of it doesn't result in Zig losing out on its existing features.

TUSF avatar Jun 30 '23 04:06 TUSF

While I can see this being a good thing in theory, I, like others, have concerns over what it will do in the short term. To me, Zig is expected to be a highly performant language, competitive with languages like C. If it is not performant, that impacts my ability to use it for the projects I am making, because in practice it will simply be a worse option for such things.

Perhaps this could be mitigated if the C backend becomes fully functional to allow projects to compile to C code and have those compile with typical C compilers for when performance is needed, but otherwise I do not think this would be well-advised until some sort of reasonable performance guarantee can be made. Even with the Zig->C->Machine Code process I feel like you'd be losing some optimization potential as no longer would Zig be able to annotate LLVM IR directly and would instead be confined by whatever C can express language wise so that might not even be a foolproof option either (though at the very least it'd hopefully put it on the level of C for most things).

Edit: Apparently I've been told this is what the bitcode support could be used for, I've never used LLVM bitcode stuff myself before but yeah if that's a supported target and still gets focus knowing people will be using it to generate higher quality optimized code until Zig can compete maybe this is less of an issue, albeit a bit more convoluted in how a project would have to be compiled.

Frankly, with how insanely complex x86 is and how much work has gone into LLVM over the years, I am doubtful Zig would ever reach the same standard of performance. Other, larger languages like Rust, which aren't even as entangled with C++, haven't tackled this sort of challenge yet either, despite having some similar motivation and more developer resources at their disposal, and to me that is not a very promising sign for its feasibility (though of course Zig could always be the first thing to prove this long-held mindset of LLVM being impossible to replace wrong...).

Also, as an aside, while I am not as invested in the C++ compilation support as others may be, I do think that it'd hurt a lot of projects. Being a gamedev myself, losing the ability to use ImGui would be unfortunate, as others have mentioned, and personally I also use Tracy in some of my Zig projects, which is also C++-based. It wouldn't be too hard to just compile these libraries and link to their binaries (or use a system library, I suppose), but that still makes Zig a bit more of a pain to interface with this stuff.

presentfactory avatar Jun 30 '23 04:06 presentfactory

Another thing that I don't think has been mentioned or considered yet: if this proposal were to go through, and that were to happen around or after the time that 1.0 is released, it could cause a split in the Zig community. Some people who rely on Zig's current toolchain might choose to simply stick to an old version of Zig so they aren't forced to migrate their codebase. This would harm everyone involved. Users and devs on the old version would miss out on any future features and optimizations, while the ones who chose to update would be unable to use any of these libraries within their own projects without jumping through hoops to do so.

xdBronch avatar Jun 30 '23 04:06 xdBronch

I agree that apart from the mentioned problems with LLVM, it also struggles with the baggage of legacy code, e.g. in TableGen modules, in how Clang is tied to LLVM compared to the newer compilers that use middle-level intermediate languages, in how the migration from FastISel and SelectionDAG to GlobalISel stalled for many years and isn't really progressing for all supported architectures, and so on and so forth. Using the C++ language for writing such a complex piece of software doesn't help either. But even Rust hasn't dared to get rid of it just yet. Thus, I think it would be premature for Zig to do that either. Long-term it might be a worthy goal, but definitely not in the upcoming 5 years or so, in my opinion. Just my 2c.

I second @yujiri8 here: having an optional LLVM target for many years while working on the Zig backends would be a perfect strategy, reducing the maintenance of the LLVM parts and providing room for experimentation and optimization of the mainstream targets directly in the Zig code: ARM64 and x86_64, and probably RISC-V in the future, if it really takes off. GHC (Haskell) has used a similar approach for at least a decade already, and it seems to work for them.

P.S. Why is the PE format not in the list for the linker?

XVilka avatar Jun 30 '23 05:06 XVilka

Maybe the best long-term solution would be to offer some kind of plugin system for the zig compiler.

Pros:

  • The compiler itself could take advantage of all the stuff listed by Andrew
  • Stuff that the community considers important, like compiling C++, could still be supported
  • You only "pay" for what you are actually using
  • We need some way to interact with different compilers/tools anyway to take full advantage of the C backend, e.g. compiling Zig code for some exotic platform that only has a compiler for C. This could be a more stable approach than just scripting with build.zig.
  • It would make it easier for people to add interoperability for even more languages than just C/C++ and to experiment with features in general
  • Could result in some federation. While a second implementation of the Zig compiler is a long time away, having a plugin system would allow the community to provide different implementations of individual parts, e.g. clang and arocc

Cons:

  • Duck tons of work
  • Even more stuff breaking when a new compiler version releases

PhilippWendel avatar Jun 30 '23 05:06 PhilippWendel

At least create a new external project to maintain the goodness of "zig cc c++ bpf...", the built-in platform SDK .h files, and cross-compilation!!

ducktype avatar Jun 30 '23 06:06 ducktype

There are a lot of people worried that losing C++ support would be a problem.

I am curious how many of those problems would remain if Aro (or something else) somehow gained C++ parsing support over the years. Would that help 50% of the worried people? 100%? 0%? There is a difference between parsing C++ and producing libraries that are binary-compatible with it, so it might be harder than just parsing it, which is why I am curious!

An alternative idea here might be to try to remove LLVM/clang over time and also add C++ parsing support to Aro (or something else), such that when the switch actually happens, nothing goes away (or at least C++ doesn't go away).

Zig has ended up in a weird situation where on one hand it is a language but then maybe people like it because of how good it is at cross-compiling and mixing languages.

breakin avatar Jun 30 '23 06:06 breakin

With my limited insight into the project, the fundamental reason for removing LLVM integration doesn't quite make sense to me. All the reasons for LLVM replacements listed here sound like good, valuable improvements that can be implemented without removing LLVM integration. I see that LLVM is "annoying because it's large, slow, and has bugs" - but why remove the option to use it?

An alternative to full removal would be to make LLVM integration less of a priority:

  • Shift LLVM maintenance to dedicated team members / dedicated resource and time budget.
  • Reduce the scope of expectations - "LLVM is only tested/verified for these targets" - and declare bug fixing for anything beyond that out of scope.
    • Maybe people would come together to set up their own independent funding around this. I'm sure there are enough well-intentioned people in the community that would be willing to maintain it while trying not to slow down core Zig.

Moreover, the current Zig compiler already has a fully modular structure. LLVM is (as far as I understand it) already completely optional. (The fully-self-hosted path isn't finished yet - but nothing seems to prevent it from working besides time left in the oven.)

  • Is the current structure not working or suboptimal in some regard?
  • Do we have some rough sketch of time investment with LLVM vs without to reach some goal post result X?
    • Zig as a whole already looks so mature, and the integration already works so well for so many use cases, that it's a bit surprising to consider this part of the project (as opposed to self-hosted components) to require that much additional time investment going forward.

rohlem avatar Jun 30 '23 07:06 rohlem

Too many hours of work have been put into LLVM. The optimization passes, the IR and everything are really the best on the market right now and are constantly being improved. While I'm sure that zig's own passes could get to an incredibly good state, in the long term, improving LLVM instead is probably still what would provide optimal executable performance, and would also benefit the whole ecosystem.

If dropping LLVM dramatically improves compilation speed, then it could be used during development, for projects that don't require C++ or Objective-C, in order to keep a fast development feedback cycle.

ecstrema avatar Jun 30 '23 07:06 ecstrema