
Scan import overwriting user-specified data (eg licences)

Open msymons opened this issue 7 years ago • 8 comments

Issue Type: Defect

Current Behavior:

When a project is created in Dependency-Track v3.3.1 and a CycloneDX BOM is uploaded, then the project will be populated with components and the components will have licenses assigned where the BOM provides such info.

When a fresh BOM is uploaded then Dependency-Track is overwriting all existing license information. Specifically:

  • Components where the license had been unknown and I had performed research and manually assigned a license. After the fresh import, the license in DT is now "unknown" again if that is what is in the new BOM.
  • Variant: components where the license had been displayed as text (no link) and I had performed research to double-check and manually assigned a license (ie, license now has link). After the fresh import, the license in DT is now reverted from link to text.
  • Variant (not a problem): components where the license was absent from the cyclonedx-maven-plugin v1.2.0 BOM but is included in the v1.3.0 BOM (the version which now "...takes into consideration the entire inheritance tree of every single direct and transitive dependency") do get their license updated in DT (displayed as text).

Steps to Reproduce (if defect):

  1. Create a project manually
  2. Generate a BOM using cyclonedx-maven-plugin v1.2.0
  3. Generate a BOM using cyclonedx-maven-plugin v1.3.0
  4. Compare the two BOMs to ensure that the license count increases between the 1.2.0 and 1.3.0 BOMs (even if the increase is just 1 or 2).
  5. Upload the cyclonedx-maven-plugin v1.2.0 BOM.
  6. Within DT, assign a couple of licences where DT is displaying "-"
  7. Within DT, assign a couple of licenses where DT is displaying text (eg such that "The Apache Software License, Version 2.0" becomes "Apache-2.0" (with link)).
  8. Upload the cyclonedx-maven-plugin v1.3.0 BOM.
  9. Check what has happened to licenses tweaked in steps 6 and 7.

Expected Behavior:

BOM processing should not overwrite licenses where "license" has been transformed to "resolvedLicense", at least not while the current system has no formal audit process.

msymons avatar Dec 05 '18 18:12 msymons

@stevespringett, have you yet had any thoughts on the above?

I have done additional testing and noted that scan import of XML from the dependency-check plugin has the same behaviour. In this context, I think this is exactly what I would expect to see happening.

I also saw that user-edited fields such as "Description" are overwritten. Which, in my case, mostly means "wiped".

I'll thus edit issue title from "CycloneDX BOM import overwriting user-specified licences" to "Scan import overwriting user-specified data (eg licences)".

I have gone through all my existing components and extracted all remaining "resolved" licences (those edited by me) that are of interest (ie, any that are not Liberal). This should allow me to "hook up" a whole pile of additional projects. I know that doing so will revert my own edits done via the UI. I also know that getting Continuous Delivery implemented via pipelines means that I cannot make any additional edits without immediately losing them.

I'll just have to track my research on unknown licences outside of Dependency-Track. Ditto for dual-licence components - although I see that v3.4 will have some changes there.

In discussing things with colleagues, I was told that this is something that "we can live with for now as long as a fix is on the roadmap".

Is there any extra info that you need that might help?

msymons avatar Dec 10 '18 16:12 msymons

@msymons This isn't a defect, rather the way the system is designed. The way components are resolved and updated has existed prior to the ability to manually add/edit components.

There were two simple use-cases:

  1. BOMs are ingested during CI/CD
  2. Components were added/updated manually (intended for ecosystems without commonly implemented dependency-management; i.e. C/C++)

Implementing both of these use-cases together is not something that is currently supported. Descriptions, licenses, etc will always be overwritten by a BOM since a BOM, by definition, is a statement of fact.

I see a few possible ways to achieve what you want:

  1. Implement a global configurable option that would prevent updates to an existing component from a BOM or scan upload. This would allow you to make manual changes and not have those changes wiped out due to BOMs overwriting them. It would also prevent future versions of CycloneDX implementations (Maven, npm, python, etc) from making corrections or enhancements to existing data, as was the case with the recent license enhancement.

  2. Implement per-field sticky bits where a BOM will continue to update all fields, but on a per-field/per-component basis, be able to manually specify an alternative value. This will require changes in the data model, REST API, and UI in multiple places and is not as easy to implement.

Option 1 will likely be easily achievable without much effort. Option 2 will require substantially more effort and as a result, would be lower in priority due to the fact that the amount of effort required doesn't align with the primary goal of the project; reducing risk. If you're ok with option 1, I'll mark as enhancement and get started on it after the release of v3.4 next week. If option 2, then I'll also mark as 'help wanted' and be grateful for a pull request.
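
To make the idea behind option 2 concrete, here is a minimal sketch of per-field sticky behaviour. It is an illustration only, not Dependency-Track's data model or code: the `Component` class, the `sticky` set, and the helpers `set_manually` and `merge_from_bom` are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Component:
    """Hypothetical component record; not Dependency-Track's real data model."""
    name: str
    license: Optional[str] = None
    description: Optional[str] = None
    # Names of fields a user has edited by hand and wants to keep.
    sticky: Set[str] = field(default_factory=set)


def set_manually(component: Component, field_name: str, value: str) -> None:
    """Record a manual edit and mark that field as sticky."""
    setattr(component, field_name, value)
    component.sticky.add(field_name)


def merge_from_bom(existing: Component, incoming: dict) -> Component:
    """Update `existing` from BOM data, skipping any field marked sticky."""
    for field_name in ("license", "description"):
        if field_name in existing.sticky:
            continue  # the manual value wins over the BOM
        setattr(existing, field_name, incoming.get(field_name))
    return existing


component = Component("commons-io", description="IO utils")
set_manually(component, "license", "Apache-2.0")
merge_from_bom(component, {"license": None, "description": "Updated description"})
print(component.license, "|", component.description)
# prints: Apache-2.0 | Updated description
```

The BOM still updates every field that has not been touched by hand, so later corrections from CycloneDX tooling keep flowing in. The cost is the extra per-field state, which is where the data model, REST API, and UI changes would come from.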

stevespringett avatar Dec 10 '18 18:12 stevespringett

I bit the bullet, creating another 13 projects in Dependency-Track and uploading BOM files for both these 13 and 13 of my original projects, such that all are now based on BOM from CycloneDX Maven Plugin v1.3.0. ie, per #248:

...much improved metadata extraction. This takes into consideration the entire inheritance tree of every single direct and transitive dependency.

After having done this, I have a lot of new license info for components that were previously blank ("-"). I do still have a lot of gaps. But I think it might be possible to hold fire on deciding how to enhance behaviour (per above). ie, wait for #170, etc, and then have another think.

msymons avatar Dec 11 '18 17:12 msymons

I plan on making some enhancements here and will likely include append or sticky behavior in v3.6. This will not be specific to license, but more broad in its approach.

stevespringett avatar Jul 18 '19 19:07 stevespringett

This will end up being a much bigger change and likely take an entire release to do just this one thing. It needs to get done. Just cannot do it in v3.6. Tagging it for 3.7 in the hope I can do it next.

stevespringett avatar Sep 08 '19 03:09 stevespringett

I'm punting on this. The license support will eventually be completely refactored.

stevespringett avatar Mar 15 '21 03:03 stevespringett

This is also a topic that is concerning me right now, and I made some tests here. Did something change in the handling since this issue was created? Are there plans to improve this in the near future?

Current observations: I uploaded a BOM with several components, some having valid licenses, some unknown, some no licenses at all, some having mixed license texts etc. Then on the components in the project:

  1. I changed a known (MIT) license in UI to something else
  2. I set a license for a component with empty license in BOM
  3. I set a license for a component with an unrecognized license id
  4. I changed the license for a component which showed a dual-license text without link (MIT OR CC0-1.0)
  5. I added a new component manually in UI
  6. I suppressed a policy violation

Then I made some irrelevant edits to the BOM XML to ensure it's parsed again (not sure if hashes/timestamps are checked to avoid re-parsing) and uploaded it again to the same project, with the following results for the above cases:

  1. The manually changed license got reverted to MIT
  2. The manually set license was not overridden when the license field in BOM was empty
  3. The manually set license was not overridden when the license field in BOM contained an unknown value
  4. The manually changed license for the dual-license component was not overridden, so still showing what I selected before instead of the dual license text
  5. The newly added component was removed
  6. The suppressed policy violation was kept and remembered correctly

So it seems some changes are preserved while others are not. I see that you considered BOMs as fact, which is right in theory, but in practice those BOMs get autogenerated and uploaded in CI/CD pipelines, and then need a manual review and adjustments, for which the UI would be great. Otherwise those UI features don't make much sense. Also, it's hard to know which changes you can make without losing them and which you cannot. Did anyone find a good workaround for this?

rkg-mm avatar Feb 03 '22 05:02 rkg-mm

In the interim, is there any way to preserve the license data in DepTrack? If I preprocess the BOMs, would it be sufficient to filter out any license data, or would it be necessary to fetch the current license data and inject that back into the BOM?
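
For the second variant (injecting known license data back into the BOM before upload), the preprocessing might look roughly like the sketch below. Assumptions: the BOM is in CycloneDX JSON form (the BOMs discussed earlier in this thread are XML), components are matched by their `purl`, and `overrides` is a hypothetical purl-to-SPDX-id mapping kept outside Dependency-Track. The function name `apply_license_overrides` is made up for the example, and nested components are not handled. Whether this is sufficient in practice is not confirmed anywhere in this thread.

```python
import copy


def apply_license_overrides(bom: dict, overrides: dict) -> dict:
    """Return a copy of a CycloneDX JSON BOM in which components whose purl
    appears in `overrides` (purl -> SPDX license id) get that license."""
    patched = copy.deepcopy(bom)  # leave the generated BOM untouched
    for component in patched.get("components", []):
        spdx_id = overrides.get(component.get("purl"))
        if spdx_id is not None:
            component["licenses"] = [{"license": {"id": spdx_id}}]
    return patched


# Hypothetical data: one component without a license, one with a text license.
bom = {
    "bomFormat": "CycloneDX",
    "components": [
        {"name": "a", "purl": "pkg:maven/example/a@1.0"},
        {
            "name": "b",
            "purl": "pkg:maven/example/b@2.0",
            "licenses": [{"license": {"name": "MIT License"}}],
        },
    ],
}
overrides = {"pkg:maven/example/a@1.0": "Apache-2.0"}

patched = apply_license_overrides(bom, overrides)
for component in patched["components"]:
    print(component["name"], component.get("licenses"))
```

Only components listed in `overrides` are changed, so license data that the BOM generator already provides is passed through as-is.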

kiwiz avatar Sep 05 '24 18:09 kiwiz

Any news on this? We are still struggling with how to add missing licenses without them disappearing when the projects are reimported.

amlundohm avatar Jan 22 '25 10:01 amlundohm