feat(spdx): enable deterministic output
- feat(spdx): accept SOURCE_DATE_EPOCH
- feat(spdx): add an option to use UUIDv5
- test(spdx/to_format_model): validate deterministic UUID
Description
Honor SOURCE_DATE_EPOCH, add an option to use UUIDv5
- Fixes #3931
Type of change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (please discuss with the team first; Syft is 1.0 software and we won't accept breaking changes without going to 2.0)
- [ ] Documentation (updates the documentation)
- [ ] Chore (improve the developer experience, fix a test flake, etc, without changing the visible behavior of Syft)
- [ ] Performance (make Syft run faster or use less memory, without changing visible behavior much)
Checklist:
- [x] I have added unit tests that cover changed behavior
- [x] I have tested my code in common scenarios and confirmed there are no regressions
- [ ] I have added comments to my code, particularly in hard-to-understand sections
Ping. It seems I have no outstanding code review tasks here, please review when possible. Thanks!
Thanks for your patience here @dsseng. I've added needs-discussion here for us to talk about it this week during our weekly livestream to get more eyes from the team involved.
After reading the PR there are a couple things that stick out to me that I think the core team can help clean up to make this ready for syft main.
- I think we need to add proper environment variable binding in the configuration setup. I expect these new options to exist under a specific section in the syft config. If the environment variables are something like
SYFT_FORMAT_SPDX_JSON_DETERMINISTIC_UUIDandSYFT_FORMAT_SPDX_JSON_CREATED_TIMEthen I would expect them to be available under thesyft config- Let me see if I can help get that added
EDIT: looks like I was holding this wrong and everything is showing up in the config correctly and being read correctly!
-
I think this is important enough that we should add a few CLI integration tests for the new options. I want to have some recording in test fixtures that we're confident in that show the difference between the default output and the output when we enable determinism and the injected timestamp. Edit: These have been added
-
After finishing one I think we should get some documentation for free in the help text, but I'll verify that. EDIT: yep docs in the config LGTM
I'll get a 👍 from the team this Thursday and we'll get this one over the line! Thanks again.
@spiffcs Following up on the livestream discussion. If it helps make a case for something like this across the board, I am currently using the CycloneDX format because it was recommended by the Director of Cybersecurity at Innolitics when preparing filings for the FDA (not where I work). I work in Healthcare and it is helpful to ensure that we can trust but verify vendor compliance with security and detect differences in builds provided by them. Syft can help here simplify FDA compliance (no pressure :) ). Actually, that is one reason we came across Syft. We wanted to see if we could use it as another layer of tracking when things have changed and if these changes ultimately solves some of the security vulnerabilities and concerns we had from vendor supplied software.
I am now curious to look at the source code and the cyclone spec to see if how to unify the behavior with cyclonedx.
Cheers and awesome discussion!
I think all we're looking at here post livestream is the following. The core team will complete these and then get this merged.
- Getting a new issue on the board that makes this symmetrical with cyclonedx (no default behavior just the option)
- Validate that the new UUID being set via sha1 IS unique and that there is now way that the set we're calculating for could be replicated across unequal documents
- Change the option name on the ENV variable to something that communicates that
DETERMINISTIC_UUIDis being set for the document namespace and NOT for packages or otherwise.
I have not lost the thread on this one - just have a couple other things in front of the queue before we merge this in.
@spiffcs @kzantow could you please let me know what is to be done to get this merged? Or do I just have to wait until this PR is handled? Thank you!