Uploaded SBOM or VEX silently ignored if the first line is empty

Open jakub-bochenski opened this issue 7 months ago • 7 comments

Current Behavior

It seems that if I upload a BOM JSON document whose first line is empty, the contents are silently ignored.

Steps to Reproduce

  1. Create a new project
  2. Upload bom.json
  3. Observe that no components are added

Expected Behavior

The contents are not ignored.

Silently ignoring the contents instead of returning an error makes this issue much worse. It's rather difficult to work out what happened unless you already know about this behavior.

Dependency-Track Version

4.13.1

Dependency-Track Distribution

Container Image

Database Server

PostgreSQL

Database Server Version

No response

Browser

N/A

jakub-bochenski, May 22 '25 14:05

If I remove the first empty line the upload works as expected

jakub-bochenski, May 22 '25 14:05

I suspect the problem is that the underlying library inspects the first few bytes of the file to decide which format the file is in.

We should instead do something similar to what we're doing for schema validation to make this more bullet-proof.
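The suspected failure mode can be sketched as follows. This is a minimal illustration, not the actual library code: `BomFormatSniffer`, `detectByFirstByte`, and `detectSkippingWhitespace` are hypothetical names, and the assumption that detection keys off the very first byte is only the suspicion stated above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BomFormatSniffer {

    enum Format { JSON, XML, UNKNOWN }

    // Naive detection (assumed behavior): only the very first byte is inspected,
    // so a leading newline makes an otherwise valid document look unknown.
    static Format detectByFirstByte(byte[] content) {
        return content.length == 0 ? Format.UNKNOWN : classify(content[0] & 0xFF);
    }

    // More tolerant detection: skip an optional UTF-8 byte order mark and any
    // leading whitespace, then classify the first significant byte.
    static Format detectSkippingWhitespace(InputStream in) {
        try {
            int b = in.read();
            if (b == 0xEF) { // possible UTF-8 byte order mark: EF BB BF
                if (in.read() != 0xBB || in.read() != 0xBF) {
                    return Format.UNKNOWN;
                }
                b = in.read();
            }
            while (b == ' ' || b == '\t' || b == '\r' || b == '\n') {
                b = in.read();
            }
            return b == -1 ? Format.UNKNOWN : classify(b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Format classify(int b) {
        if (b == '{') {
            return Format.JSON;
        }
        if (b == '<') {
            return Format.XML;
        }
        return Format.UNKNOWN;
    }
}
```

For a document that starts with an empty line, the first method reports `UNKNOWN` while the second still recognizes JSON, and it reads only as many bytes as it needs to decide.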

nscuro, May 22 '25 14:05

@nscuro if it's deciding based on the first byte then that would explain what I observe. Yet I don't understand why the upload completes successfully instead of returning an error?

FWIW Jackson can also work with XML, I think the code is using JAXB instead for XML. Maybe using Jackson for everything could simplify the code.

jakub-bochenski, May 22 '25 14:05

> Yet I don't understand why the upload completes successfully instead of returning an error?

Only the schema validation happens synchronously with the request. The schema validation logic is able to correctly determine the file format because it's "streaming" through the file until it finds what it's looking for. As far as it's concerned, your file is valid.

Processing of the file happens asynchronously (doing it synchronously would have huge potential to time out and block server threads). For processing, the code still uses the wonky library code I linked above.
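The split between synchronous validation and asynchronous processing can be sketched roughly like this. Everything here is hypothetical and heavily simplified: `AsyncUploadSketch`, `validate`, `process`, `upload`, and `outcome` are illustrative names, not Dependency-Track APIs, and the two checks merely stand in for a lenient validator and a stricter parser.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ConcurrentHashMap;

public class AsyncUploadSketch {

    // Background tasks keyed by the token handed back to the uploader.
    static final Map<String, CompletableFuture<Integer>> TASKS = new ConcurrentHashMap<>();

    // Synchronous step: lenient check that tolerates leading whitespace.
    static boolean validate(String bom) {
        return bom.trim().startsWith("{");
    }

    // Asynchronous step: stricter stand-in parser that only looks at the first character.
    static int process(String bom) {
        if (!bom.startsWith("{")) {
            throw new IllegalArgumentException("unrecognized BOM format");
        }
        return 1; // pretend one component was imported
    }

    // The request succeeds as soon as validation passes; processing runs later.
    static String upload(String bom) {
        if (!validate(bom)) {
            throw new IllegalArgumentException("invalid BOM");
        }
        String token = UUID.randomUUID().toString();
        TASKS.put(token, CompletableFuture.supplyAsync(() -> process(bom)));
        return token;
    }

    // A failure becomes visible only if someone asks for the outcome by token.
    static String outcome(String token) {
        try {
            return "imported " + TASKS.get(token).join() + " component(s)";
        } catch (CompletionException e) {
            return "failed: " + e.getCause().getMessage();
        }
    }
}
```

In this sketch, `upload` returns a token for a document with a leading empty line, and the failure only shows up when `outcome` is called with that token. If nothing exposes that outcome, the upload looks successful from the caller's side.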

> FWIW Jackson can also work with XML, I think the code is using JAXB instead for XML. Maybe using Jackson for everything could simplify the code.

jackson-databind uses Woodstox behind the scenes for XML and does not expose a streaming API on its own. We use Woodstox directly here because we want to avoid expensive copying or even deserialization of the BOM file.

There are cases where uploaded files will be hundreds of megabytes in size. What we want to avoid is spending lots of resources on files that aren't even valid. Even supposedly simple actions like this cause the entire file content to be copied.
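To illustrate the streaming idea, here is a minimal sketch that reads just far enough into an XML document to find the root element's namespace. It uses the JDK's standard `javax.xml.stream` (StAX) API rather than Woodstox-specific classes, and `RootElementPeek` and `rootNamespace` are hypothetical names, so treat it as an approximation of the approach rather than the project's code.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class RootElementPeek {

    // Returns the namespace URI of the root element, or null if none is found.
    // Pulls events from the stream and stops at the first start element, so the
    // document is never loaded or deserialized as a whole.
    static String rootNamespace(InputStream in) {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        // Uploaded content is untrusted: disable DTDs and external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return reader.getNamespaceURI();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }
}
```

Because the parser consumes whitespace in the prolog as part of normal XML processing, a blank line before the root element does not affect the result.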

nscuro, May 22 '25 15:05

> Processing of the file happens asynchronously (doing it synchronously would have huge potential to time out and block server threads).

Maybe the /token API could be extended to report processing errors?

> We use Woodstox directly here because we want to avoid expensive copying or even deserialization of the BOM file.

Makes sense

jakub-bochenski, May 22 '25 15:05

> Maybe the /token API could be extended to report processing errors?

This will come in v5, and in fact it will come for many more asynchronous processes, not just BOM uploads.

nscuro, May 22 '25 15:05

FTR, the same issue affects VEX uploads.

jakub-bochenski, May 23 '25 10:05