
API to report Halide/LLVM build info (version, commit hash, date, etc)

Open slomp opened this issue 3 months ago • 5 comments

A handy feature would be the ability to embed, in the build itself, information about the conditions under which Halide/LLVM was built.

(There's HALIDE_VERSION_* and int get_llvm_version(); but that's too janky/coarse/abstract.)

I think the easiest way to do it without over-engineering would be an API in Halide.h that returns a const char* to a string we generate internally, containing the version information (in a format that is also trivial to parse). For example:

"Halide 20.0.2 (commit: abcdefabcdef from 2025/08/20) | LLVM 20.1.1 (commit: badc0ffee from 2025/08/20)"

Whether or not this API should also exist in HalideRuntime.h is unclear.

slomp, Aug 29 '25 17:08

Is this a runtime API or only for use while compiling/JITing?

Avoiding string typing is not "overengineering." If it is routinely going to be parsed, it should be a struct to start with. If it is only going to be printed, it can be a string, but no promises should be made on parsing it.
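
To make the struct-first option concrete, here is a rough sketch in which the typed record is the source of truth and the printable string is derived from it, so nothing ever has to parse the string. `BuildInfo` and `to_display_string` are made-up names for illustration, not an existing or agreed-upon Halide API.

```cpp
#include <string>

// Hypothetical typed record; the field set is an assumption based on the
// example string in the issue description.
struct BuildInfo {
    const char *component;  // e.g. "Halide" or "LLVM"
    const char *version;    // e.g. "20.0.2"
    const char *commit;     // commit hash, possibly abbreviated
    const char *date;       // date associated with the commit or build
};

// The display form is produced from the struct. It is meant for printing
// only; callers that need individual fields read them from BuildInfo.
inline std::string to_display_string(const BuildInfo &info) {
    return std::string(info.component) + " " + info.version +
           " (commit: " + info.commit + " from " + info.date + ")";
}
```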

zvookin, Aug 29 '25 19:08

(Aside: Ultimately, Halide will likely need to support software bill of materials (SBOM). This issue addresses a minimal use case for that general need. However, looking over SBOM mechanisms such as GitBOM, I don't think they really address the use case, in addition to being more work. The hope was work in that area would make it really easy to get LLVM's info into Halide. It sort of does in that one adds a flag to clang and the linker and it all gets collected, but the info doesn't look to be human readable without running hashes against an external database.)

zvookin, Aug 29 '25 22:08

The initial description doesn't really illustrate the use case but it was elaborated in the team meeting today.

The goal is to be able to determine, ideally at commit-hash granularity, which versions of Halide and LLVM are being used in a given software development context. One may have a prebuilt Halide and not know exactly which LLVM was used, or be trying to find a bug in Halide code in a large application without access to the entire stack needed to build it. Being able to easily tell which versions of what are being used improves productivity.

Specific needs are to identify this information from:

  • Halide.h in a text editor.
  • Code running in a generator or generator-like framework, where it will likely be surfaced via a command-line flag in the generator framework.
  • The Halide library -- libHalide.a or the .so/.dylib/.dll.
  • An object file or archive generated by Halide.

The proposed data is a list of tuples: component name, component version number, component commit hash, build date, and, for completeness, an optional free-form comment. I propose recording this in CSV or JSON format, with the first two entries being the info for Halide itself and LLVM. This should be a good compromise between readability, flexibility, and parseability if necessary.
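
As a sketch of the CSV variant, the code below stores the tuples as records and serializes them with a header row. The type and function names (`ComponentInfo`, `csv_field`, `to_csv`) and the column names are assumptions for illustration; the build field is treated as a date, and quoting follows the usual CSV convention of wrapping a field in double quotes when it contains a comma, quote, or newline.

```cpp
#include <string>
#include <vector>

// Hypothetical record matching the proposed tuple layout.
struct ComponentInfo {
    std::string name;
    std::string version;
    std::string commit;
    std::string build_date;
    std::string comment;  // optional free-form text; may be empty
};

// Quote a field only when it contains a comma, quote, or line break.
// Embedded double quotes are doubled.
inline std::string csv_field(const std::string &s) {
    if (s.find_first_of(",\"\r\n") == std::string::npos) {
        return s;
    }
    std::string out = "\"";
    for (char c : s) {
        if (c == '"') {
            out += '"';
        }
        out += c;
    }
    out += '"';
    return out;
}

// Emits a header row followed by one row per component. By the proposed
// convention, the first two entries are Halide and LLVM.
inline std::string to_csv(const std::vector<ComponentInfo> &components) {
    std::string out = "name,version,commit,build_date,comment\n";
    for (const ComponentInfo &c : components) {
        out += csv_field(c.name) + "," + csv_field(c.version) + "," +
               csv_field(c.commit) + "," + csv_field(c.build_date) + "," +
               csv_field(c.comment) + "\n";
    }
    return out;
}
```

Only the free-form comment is likely to need quoting in practice, but quoting every field through the same helper keeps the output well-formed if a version string ever contains a comma.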

For Halide.h I suggest a constexpr inline routine in a new header that gets concatenated into Halide.h by build system logic. In any event, the work is in figuring out how to get the info needed, not in how it is represented in the API. (Edit: this "just append it to Halide.h" approach probably doesn't work.)

GenGen gets a new flag to print this via a change to Generator.cpp.

When HL_DEBUG_CODEGEN is set to any value above zero, this info should be printed by the compiler. (Edit: Note this implies the info is available at Halide compiler compilation time. So it isn't just a concat to Halide.h after the fact, it is a generated header in the build or possibly a generated .cpp file.)
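
A standalone sketch of the "print when HL_DEBUG_CODEGEN is above zero" rule. It reads the environment variable directly with `std::getenv` rather than going through Halide's own debug machinery, and the function names are hypothetical; the real change would presumably hook into whatever the compiler already uses for debug output.

```cpp
#include <cstdlib>
#include <iostream>

// True when the given HL_DEBUG_CODEGEN value parses to a number above zero.
// An unset variable (nullptr) or non-numeric text counts as disabled, since
// strtol returns 0 when no conversion can be performed.
inline bool debug_codegen_enabled(const char *value) {
    return value != nullptr && std::strtol(value, nullptr, 10) > 0;
}

// Prints the build info to the given stream if HL_DEBUG_CODEGEN asks for it.
// build_info is whatever string the generated header or .cpp provides.
inline void maybe_print_build_info(const char *build_info,
                                   std::ostream &out = std::cerr) {
    if (debug_codegen_enabled(std::getenv("HL_DEBUG_CODEGEN"))) {
        out << build_info << "\n";
    }
}
```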

The above is a good cut point for an initial PR. Beyond this, because it is more work:

The compiler needs a way to embed this info into an object file, e.g. in a .note,halide_build_info section. This is probably best controlled by a new feature flag. Halide's use of LLVM Module flags came up in the meeting, or at least I think that is what was being referred to, but I don't think it helps in getting all the way to the final .o. LLVM's MCObjectFileInfo class has support for this sort of thing. cf. Adding Sections to ELF Binaries.

The same .note section should be added to libHalide.a itself.

ELF note sections can be read using llvm-readelf on the command line.
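
For the library side, a rough sketch of placing a note in a named section from C++ using the GCC/Clang `section` attribute. Several things here are assumptions: the dotted section name `.note.halide_build_info`, the owner name "Halide", and the note type value 1 are all made up for illustration, and the description string is the placeholder from earlier in the thread. The layout follows the standard ELF note format (name size, description size, type, then the name and description each padded to 4 bytes). Whether the toolchain marks the section as a note section is worth verifying on the actual targets.

```cpp
#include <cstdint>

// Placeholder values; a real build would generate the description string.
#define HL_BUILD_NOTE_NAME "Halide"
#define HL_BUILD_NOTE_DESC "Halide 20.0.2 (commit: abcdefabcdef from 2025/08/20)"

// Standard ELF note layout: three 32-bit words, then the owner name and the
// description, each padded up to a multiple of 4 bytes.
struct alignas(4) HalideBuildNote {
    std::uint32_t namesz;  // size of name, including the terminating NUL
    std::uint32_t descsz;  // size of desc, excluding padding
    std::uint32_t type;    // owner-defined note type
    char name[(sizeof(HL_BUILD_NOTE_NAME) + 3) / 4 * 4];
    char desc[(sizeof(HL_BUILD_NOTE_DESC) + 3) / 4 * 4];
};

// The section attribute is only applied on ELF targets; other object formats
// use different section naming rules. "used" keeps the variable from being
// dropped when nothing references it.
#if defined(__ELF__)
#define HL_NOTE_SECTION __attribute__((section(".note.halide_build_info"), used))
#else
#define HL_NOTE_SECTION
#endif

static const HalideBuildNote halide_build_note HL_NOTE_SECTION = {
    static_cast<std::uint32_t>(sizeof(HL_BUILD_NOTE_NAME)),
    static_cast<std::uint32_t>(sizeof(HL_BUILD_NOTE_DESC)),
    1,  // hypothetical type value
    HL_BUILD_NOTE_NAME,
    HL_BUILD_NOTE_DESC,
};
```

If this works as intended, running `llvm-readelf --notes` on the resulting object should list the note. This only covers code compiled by the host compiler; objects generated by Halide would need the equivalent emitted through LLVM.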

zvookin, Aug 29 '25 23:08

Does anyone have any idea how to get the commit hash information? I don't see any way to do it reliably short of writing code to find the .git dirs and to not include it if the build is being done from a precompiled library.

zvookin, Sep 12 '25 23:09

We use setuptools-scm to compute the version number for the Python releases already. Maybe it has solved this problem enough for our purposes.

https://setuptools-scm.readthedocs.io/en/latest/usage/#builtin-mechanisms-for-obtaining-version-numbers


Currently, if you download a ZIP file from GitHub of the Halide source, the .git_archival.txt file will contain contents akin to the following:

node: f974be53197f1df87fcbb61878ab7036d775715a
node-date: 2025-09-12T14:57:31-04:00
describe-name: v20.0.0.dev0-126-gf974be53197f
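
If the build went this route, reading the file is simple. Below is a minimal sketch of a parser for the `key: value` lines shown above; `parse_git_archival` is a hypothetical helper, not something Halide or setuptools-scm provides. One thing to watch for: in a regular git checkout (as opposed to an archive) the placeholders are not substituted, so the values would still look like `$Format:...$` and should be treated as missing.

```cpp
#include <map>
#include <string>
#include <sstream>

// Parses "key: value" lines into a map. Lines without a ": " separator are
// skipped. Splitting on the first ": " keeps values that themselves contain
// colons (such as the timestamp) intact.
inline std::map<std::string, std::string>
parse_git_archival(const std::string &text) {
    std::map<std::string, std::string> fields;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate CRLF line endings.
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        const std::string::size_type sep = line.find(": ");
        if (sep == std::string::npos) {
            continue;
        }
        fields[line.substr(0, sep)] = line.substr(sep + 2);
    }
    return fields;
}

// True when the value was substituted by git archive rather than left as an
// unexpanded "$Format:...$" placeholder.
inline bool is_expanded(const std::string &value) {
    return !value.empty() && value.compare(0, 8, "$Format:") != 0;
}
```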

alexreinking, Sep 13 '25 14:09