How to handle symlinks in SPDX documents?
Hello!
I’m working on a project to build SBOMs for packages using the SPDX specification (v2.2.1). Many of our packages contain symlinks to other files within the same package, and we were wondering exactly how these should be described in the SPDX documents we generate.
For example, our zlib package contains the following symlink: libz.so -> libz.so.1.2.11.
We have a tool, which uses tools-python, that builds SPDX documents for us. It has a function that analyzes each file contained in a package (e.g. to generate checksums for each file). When this function opens a symlink file, it de-references the symlink (e.g. libz.so) and just gets the checksums of the target file (e.g. libz.so.1.2.11). This doesn’t seem like the best thing to do because it feels like we’re saying those two files are the same thing, but they’re not. For now, we are planning to stick with this function as it is and add a comment indicating that "this file is a symlink to <some target>" and that the checksums represent those of the target.
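To make the current behavior concrete, here is a minimal sketch of that kind of per-file analysis. It uses only the Python standard library, not the tools-python API, and the function name `describe_file` and the shape of the returned record are hypothetical, chosen just for illustration.

```python
import hashlib
import os


def describe_file(path):
    """Return a record for one file, flagging symlinks instead of hiding them.

    Hypothetical helper: the field names are illustrative, not SPDX fields.
    """
    record = {"path": path, "comment": None, "sha1": None, "sha256": None}

    if os.path.islink(path):
        # os.readlink returns the link's target exactly as stored in the link.
        target = os.readlink(path)
        record["comment"] = (
            f"This file is a symlink to {target}; "
            "the checksums are those of the target."
        )
        if not os.path.exists(path):
            # Dangling symlink: there is nothing to hash.
            return record

    # open() follows symlinks, so for a link this hashes the target's bytes.
    with open(path, "rb") as fh:
        data = fh.read()
    record["sha1"] = hashlib.sha1(data).hexdigest()
    record["sha256"] = hashlib.sha256(data).hexdigest()
    return record
```

For `libz.so` this would produce the same checksums as `libz.so.1.2.11`, plus a comment naming the target, which is exactly the ambiguity we are asking about.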
Could you please advise on whether there is a prescribed or better way to handle symlinks that conforms with the SPDX specifications?
Thank you.
Hi @pseudoyim, it's a great question and unfortunately I don't have an answer for you on a prescribed approach here.
The SPDX Golang tools disregard symlinks altogether when analyzing a package's files: they just ignore the symlink and don't include it in the list of Files for that Package. I'm not convinced that's the right approach either, but it didn't feel right to include the target file as being "contained" by the Package, or to describe its hashes / licenses / etc. as part of the Package's contents.
Either way, it's a good question and probably something that the project should align on, one way or the other :)
This would make a good topic for any upcoming Docfests.
The Java tools do not check for symlinks when calculating the verification code, so they likely include them. Like @swinslow, I'm not sure whether this is the right approach, but we have already discovered one inconsistency between the tools.
One suggestion is to skip the symlinked files AND add their file paths to the excluded files list when generating the Package Verification Code. That way anyone validating the verification code would also skip the symlinked files, and the verification codes should match.
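As a rough sketch of that suggestion, assuming the Package Verification Code algorithm as I understand it from the SPDX 2.2 spec (SHA1 each included file, sort the hex digests, concatenate them, and SHA1 the result): the function name `verification_code` and the `./`-relative path style for excluded files are my own choices here, not taken from any of the SPDX tools.

```python
import hashlib
import os


def verification_code(package_root):
    """Return (code, excluded_paths), skipping every symlink under package_root.

    Hypothetical helper; it does not reflect what any SPDX tool does today.
    """
    sha1s = []
    excluded = []

    def rel(full):
        # Report paths relative to the package root, with forward slashes.
        return "./" + os.path.relpath(full, package_root).replace(os.sep, "/")

    # os.walk does not descend into symlinked directories by default,
    # but it still lists them in dirnames, so record those as excluded too.
    for dirpath, dirnames, filenames in os.walk(package_root):
        for name in dirnames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                excluded.append(rel(full))
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                excluded.append(rel(full))
                continue
            with open(full, "rb") as fh:
                sha1s.append(hashlib.sha1(fh.read()).hexdigest())

    # Sort the per-file digests, join them with no separator, and hash again.
    sha1s.sort()
    code = hashlib.sha1("".join(sha1s).encode("ascii")).hexdigest()
    return code, sorted(excluded)
```

The returned `excluded` list is what would go into the excluded files part of the verification code field, so that a validator skipping the same paths arrives at the same code.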
Linking relationships could be a really useful feature to add in SPDX 3.0 for use-cases like @pseudoyim's. I've made a comment in https://github.com/spdx/spdx-3-model/issues/5 so that this can be tracked :)
I apologize for the extremely late follow up on this thread! We greatly appreciate you all taking up this issue and addressing it in this Punch List.
Thanks for replying on this thread, @pseudoyim! I'd forgotten we started this discussion :)
This is timely, as I just realized this morning that the Golang tools might be handling Packages with symlinks differently on Windows than they do on Linux. The thread at https://github.com/spdx/tools-golang/issues/117 has some details about (I'm guessing) differences in symlink handling leading to different Package Verification Codes on Windows vs. on Linux. I need to poke around more to confirm that's what's going on, but just noting that this is less settled in the Golang tools than I'd hoped...
Some symlinks are aliases for libraries included in the same package; others point outside the package, e.g. at the system's default timezone.
It looks like this is still an open question - moving to 3.1 since it likely would not involve a breaking change