Uploaded SBOM or VEX silently ignored if the first line is empty
Current Behavior
It seems that if I upload a BOM JSON document whose first line is empty, the contents are ignored.
Steps to Reproduce
- Create a new project
- Upload a bom.json whose first line is empty
- Observe that no components are added
Expected Behavior
The contents are not ignored.
Ignoring the contents silently instead of returning an error makes this issue much worse. It's rather difficult to understand what happened if you don't know.
Dependency-Track Version
4.13.1
Dependency-Track Distribution
Container Image
Database Server
PostgreSQL
Database Server Version
No response
Browser
N/A
Checklist
- [x] I have read and understand the contributing guidelines
- [x] I have checked the existing issues for whether this defect was already reported
If I remove the empty first line, the upload works as expected.
I suspect the problem is that the underlying library inspects the first few bytes of the file to decide which format the file is in.
We should instead do something similar to what we're doing for schema validation to make this more bullet-proof.
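For illustration, here is a minimal sketch of format detection that skips a UTF-8 byte order mark and any leading whitespace before looking at the first meaningful byte. `BomFormatSniffer` and its `Format` enum are hypothetical names, not part of Dependency-Track or the underlying library, and the actual fix may well look different.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not Dependency-Track code.
final class BomFormatSniffer {

    enum Format { JSON, XML, UNKNOWN }

    // Consumes bytes from the stream until the first non-whitespace byte.
    // The caller has to re-open or reset the stream before parsing.
    static Format detect(InputStream in) throws IOException {
        int b = in.read();
        if (b == 0xEF) {
            // Possible UTF-8 byte order mark (EF BB BF).
            if (in.read() != 0xBB || in.read() != 0xBF) {
                return Format.UNKNOWN;
            }
            b = in.read();
        }
        // Skip spaces, tabs and line breaks, e.g. an empty first line.
        while (b == ' ' || b == '\t' || b == '\r' || b == '\n') {
            b = in.read();
        }
        if (b == '{') {
            return Format.JSON;
        }
        if (b == '<') {
            return Format.XML;
        }
        return Format.UNKNOWN; // includes end of stream (-1)
    }

    // Convenience overload for small inputs such as tests.
    static Format detect(String text) {
        try {
            return detect(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this approach, `BomFormatSniffer.detect("\n{\"bomFormat\": \"CycloneDX\"}")` would report JSON even though the first line is empty, and only a bounded number of bytes is read regardless of file size.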
@nscuro if it's deciding based on the first byte then that would explain what I observe. Yet I don't understand why the upload completes successfully instead of returning an error?
FWIW Jackson can also work with XML, I think the code is using JAXB instead for XML. Maybe using Jackson for everything could simplify the code.
> Yet I don't understand why the upload completes successfully instead of returning an error?
Only the schema validation happens synchronously with the request. The schema validation logic is able to correctly determine the file format because it's "streaming" through the file until it finds what it's looking for. As far as it's concerned, your file is valid.
Processing of the file happens asynchronously (doing it synchronously would have huge potential to time out and block server threads). For processing, the code still uses the wonky library code I linked above.
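To make the split concrete, here is a rough sketch of a request path that validates synchronously and then hands processing to a background worker. Everything in it (`UploadSketch`, `Status`, the blank check standing in for schema validation, the in-memory status map) is hypothetical and only meant to show why the request can succeed while processing fails later; it is not how Dependency-Track implements this.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical illustration only; names and structure are assumptions.
final class UploadSketch {

    enum Status { PROCESSING, COMPLETED, FAILED }

    private final Map<UUID, Status> statusByToken = new ConcurrentHashMap<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Runs on the request thread: cheap validation only, then returns a token.
    UUID upload(String bom, Consumer<String> processor) {
        if (bom == null || bom.isBlank()) {
            // Stand-in for schema validation; this is the only failure the caller sees directly.
            throw new IllegalArgumentException("BOM is empty");
        }
        UUID token = UUID.randomUUID();
        statusByToken.put(token, Status.PROCESSING);
        worker.submit(() -> {
            try {
                processor.accept(bom); // potentially slow, runs after the response was sent
                statusByToken.put(token, Status.COMPLETED);
            } catch (RuntimeException e) {
                // The request has already succeeded; the failure is only visible via the token.
                statusByToken.put(token, Status.FAILED);
            }
        });
        return token;
    }

    Status status(UUID token) {
        return statusByToken.get(token);
    }

    // Stops accepting work and waits for queued tasks; used here to make the example deterministic.
    void awaitIdle() {
        worker.shutdown();
        try {
            if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch `upload` returns normally even when the processor later throws, which mirrors the behavior described above: the caller only learns about the failure if the status behind the token exposes it.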
> FWIW Jackson can also work with XML, I think the code is using JAXB instead for XML. Maybe using Jackson for everything could simplify the code.
jackson-databind uses Woodstox behind the scenes for XML and does not expose a streaming API on its own. We use Woodstox directly here because we want to avoid expensive copying or even deserialization of the BOM file.
There are cases where uploaded files will be hundreds of megabytes in size. What we want to avoid is spending lots of resources on files that aren't even valid. Even supposedly simple actions like this cause the entire file content to be copied.
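As a small illustration of the streaming idea, the sketch below uses the JDK's StAX API (`javax.xml.stream`, which Woodstox also implements) to read only as far as the root element and return its namespace, without building a document tree. `RootElementPeek` and `rootNamespace` are hypothetical names, and this is not the validation code referred to above.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration only; not Dependency-Track code.
final class RootElementPeek {

    // Returns the namespace URI of the root element, or null if no element is found.
    static String rootNamespace(Reader source) {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        // Uploaded files are untrusted input, so DTDs and external entities are disabled.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(source);
            try {
                while (reader.hasNext()) {
                    // Stop at the first start tag; the rest of the file is never read.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return reader.getNamespaceURI();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Convenience overload for small inputs such as tests.
    static String rootNamespace(String xml) {
        return rootNamespace(new StringReader(xml));
    }
}
```

Because the reader pulls events on demand, the cost of this check depends on how far into the file the root element starts, not on the total size of the upload.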
> Processing of the file happens asynchronously (doing it synchronously would have huge potential to time out and block server threads).
Maybe the /token API could be extended to report processing errors?
> We use Woodstox directly here because we want to avoid expensive copying or even deserialization of the BOM file.
Makes sense.
> Maybe the /token API could be extended to report processing errors?
This will come in v5, and in fact it will come for many more asynchronous processes, not just BOM uploads.
FTR, the same issue affects VEX uploads.