Incorrect License Information for Java Packages in SBOM Enrichment
Hello Team,
We recently used Parlay to enrich the SBOM generated by Trivy on a Cassandra container image and observed discrepancies in the license information provided for Java packages. After cross-checking with GitHub and Maven repositories, here are a few examples of the inconsistencies we found:
- ST4: The package is licensed under BSD, but Parlay reports it as DSDP.
- javassist: This package has three licenses: Apache 2.0, LGPL-2.1, and MPL-1.1, but Parlay outputs SSLP-1.0.
- bcpkix-jdk15on: Licensed under MIT, but Parlay shows MirOS.
It would be very helpful if Parlay could include the URL or source of the license information. This would allow users to verify and trace back the data more easily.
Could you please look into this issue?
Thank you!
Hi @pooja0805, thanks for sending this info over! Quick question: what version of Parlay are you seeing these issues on? We recently released a new version with some updates to the way we handle licenses, so it would be good to confirm whether that change is responsible.
Hey @pooja0805, thanks for flagging this issue. I assume that this is about parlay ecosystems enrich? It would be great if you could share your input and how you invoke parlay.
Do you have more detailed package identifiers, i.e. what are their fully qualified purls?
In general, parlay merely looks up package information from ecosyste.ms. If there's an inaccuracy in license information, it's likely that this is an issue with their data set.
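For anyone who wants to check the upstream data directly, here is a minimal Python sketch of querying ecosyste.ms for a purl. The endpoint path, the `purl` query parameter, and the `licenses` field in the response are assumptions about the public packages.ecosyste.ms API, not something taken from parlay's code, and the function names are made up for illustration. The version-less purl is used only as an example.

```python
import json
import urllib.parse
import urllib.request
from typing import List

# Assumption: the public ecosyste.ms packages API offers a purl lookup here.
LOOKUP_ENDPOINT = "https://packages.ecosyste.ms/api/v1/packages/lookup"


def lookup_url(purl: str) -> str:
    """Build the lookup URL for a purl (hypothetical helper)."""
    return LOOKUP_ENDPOINT + "?" + urllib.parse.urlencode({"purl": purl})


def fetch_licenses(purl: str) -> List[object]:
    """Fetch the raw license value of each matching package record.

    Assumption: the response is a JSON array of objects that carry a
    "licenses" field. Nothing is normalized here.
    """
    with urllib.request.urlopen(lookup_url(purl), timeout=30) as response:
        records = json.load(response)
    return [record.get("licenses") for record in records]
```

Calling `fetch_licenses("pkg:maven/org.antlr/ST4")` needs network access. If the value returned there is already wrong, the inaccuracy comes from the data set rather than from parlay.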
Thanks for the quick response @thomasschafer. I am seeing this on Parlay version 0.6.5. I can try the latest version of Parlay and check whether the issue persists.
Thanks for the quick response @mcombuechen.
Yes, this is related to parlay ecosystems enrich. Here are the two ways I usually invoke Parlay:
trivy image <image>:<tag> -f cyclonedx | parlay ecosystems enrich - | jq . > <package>-parlay.json
OR
parlay ecosystems enrich <package>-trivy.json | jq . > <package>-parlay.json
I am also attaching the input file and Parlay's output file here:
Input File: cassandra-trivy.json
Output File: cassandra-parlay.json
Regarding package details, here are the fully qualified PURLs for the affected packages:
ST4: pkg:maven/org.antlr/[email protected]
javassist: pkg:maven/org.javassist/[email protected]
bcpkix-jdk15on: pkg:maven/org.bouncycastle/[email protected]
Please review the above input and output files and let me know if you need any further inputs from my end.
Thanks!
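To compare the two attached files without reading them by hand, a small script can list every component whose license data differs between the Trivy input and the enriched output. This is only a sketch: the function names are hypothetical, it assumes the CycloneDX JSON layout (`components[].licenses[]` with either `license.id`, `license.name`, or `expression`), and it looks only at top-level components.

```python
import json
from typing import Dict, List, Tuple


def load_bom(path: str) -> dict:
    """Read a CycloneDX JSON document from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def component_licenses(bom: dict) -> Dict[str, List[str]]:
    """Map each component purl to the license strings recorded for it."""
    result: Dict[str, List[str]] = {}
    for component in bom.get("components", []):
        purl = component.get("purl")
        if not purl:
            continue  # components without a purl cannot be matched up
        labels: List[str] = []
        for entry in component.get("licenses", []):
            if "expression" in entry:
                labels.append(entry["expression"])
            else:
                lic = entry.get("license", {})
                label = lic.get("id") or lic.get("name")
                if label:
                    labels.append(label)
        result[purl] = labels
    return result


def license_changes(before: dict, after: dict) -> Dict[str, Tuple[List[str], List[str]]]:
    """Return purl -> (old, new) for components whose licenses differ."""
    old = component_licenses(before)
    new = component_licenses(after)
    return {
        purl: (old.get(purl, []), labels)
        for purl, labels in new.items()
        if old.get(purl, []) != labels
    }
```

With the attachments above, `license_changes(load_bom("cassandra-trivy.json"), load_bom("cassandra-parlay.json"))` would show which entries the enrichment step changed, so each one can be checked against Maven or GitHub.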
Thanks @pooja0805 this is helpful.
I could not reproduce the behaviour you described; given the Cassandra SBOM you provided, I get different results:
pkg:maven/org.javassist/[email protected] -> (MPL 1.1 OR LGPL 2.1 OR Apache License 2.0)
pkg:maven/org.antlr/[email protected] -> (BSD licence)
pkg:maven/org.bouncycastle/[email protected] -> (Bouncy Castle Licence)
(the last one due to the license info on the 1.68 release)
Please report back after testing with the latest parlay release 😄
@mcombuechen Thanks for checking this!
I tested with Parlay v0.8.0 on the Cassandra image and observed that the previously reported issue is no longer occurring. However, I did notice a few other inconsistencies in the license information:
- npn-boot: The correct license is Apache-2.0 & GPLv2 with Classpath Exception, but Parlay outputs (Apache-2.0 OR EPL-1.0 OR SSPL-1.0).
- istack-commons-runtime: The correct license is CDDL-1.1 or GPL-1.1, but Parlay does not provide any license information.
Additionally, some licenses are inconsistently formatted:
- License names are sometimes displayed in different variations, such as:
  --> (The Apache Software License OR Version 2.0)
  --> (---- Apache-2.0)
  --> (Apache License OR Version 2.0)
  --> (---- The Apache Software License OR Version 2.0)
  --> Expected format: Apache-2.0
- Some entries contain unclear placeholders, such as (--- []) and (---- Public Domain).
- Parlay does not follow a consistent pattern for certain licenses. For example:
  --> Some entries use LGPL-2.1+, while others display a long-form description: (GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License OR (at your option) any later version.)
Could Parlay standardize the license format to ensure consistency? Let me know if you need any additional details!
Thanks.
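To illustrate the kind of standardization requested above, here is a rough Python sketch that maps the Apache variants listed in this comment onto a single SPDX identifier. It is not how Parlay works internally. The alias table and the function names are hypothetical, and the table covers only the strings quoted above.

```python
import re
from typing import Optional

# Hypothetical alias table, keyed by the cleaned, lower-cased license string.
# It only covers the Apache variants quoted in this thread.
ALIASES = {
    "apache-2.0": "Apache-2.0",
    "apache license or version 2.0": "Apache-2.0",
    "the apache software license or version 2.0": "Apache-2.0",
}


def clean(raw: str) -> str:
    """Strip one pair of enclosing parentheses, leading dashes, and extra spaces."""
    text = raw.strip()
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    text = re.sub(r"^-+\s*", "", text.strip())
    return re.sub(r"\s+", " ", text).strip().lower()


def normalize(raw: str) -> Optional[str]:
    """Return an SPDX identifier for a known variant, or None if unrecognized."""
    return ALIASES.get(clean(raw))
```

Placeholders such as (--- []) or (---- Public Domain) come back as `None` rather than being guessed at, so they can be flagged for manual review.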
@mcombuechen Could you please review the information provided above?
@mcombuechen Just following up on this—have you had a chance to review the inconsistencies mentioned above?
Closing this issue, as the previously reported problem is resolved in the latest release. I created a new issue for the problems seen with the latest release of Parlay here: https://github.com/snyk/parlay/issues/116