dicomweb-client
dicomweb-client copied to clipboard
fix(dicomweb-client):Make the retrieveBulkData handle either single or multipart responses.
The DICOMweb standard can be interpreted in a few ways, one of which is that BulkDataURI always returns single part bulk data when requested as a full URL. This interpretation doesn't work for the existing retrieveBulkDataURI, so the library has been updated to allow retrieving as either single or multipart, letting the server decide how to return the data.
The "Value" field of a DICOM JSON representation of a Data Element is an array (see F.2.2 DICOM JSON Model Object Structure and F.2.4 DICOM JSON Value Multiplicity). If the object contains the "BulkDataURI" instead of the "Value" field, then it would also make sense for the response message to the bulkdata retrieve request to be multipart.
While the standard may also support singlepart messages, an origin server would still have to support multipart messages. Therefore, I don't see why the behavior of the client has to change.
CP 2204 available at https://docs.google.com/document/d/1UbAZFAyTF8p7Vf_q_lR9mex82Zkr7zB-/edit?usp=sharing&ouid=106394421000961430700&rtpof=true&sd=true specifies the single part retrieve that is setup here.
CP 2204 available at https://docs.google.com/document/d/1UbAZFAyTF8p7Vf_q_lR9mex82Zkr7zB-/edit?usp=sharing&ouid=106394421000961430700&rtpof=true&sd=true specifies the single part retrieve that is setup here.
Thanks @wayfarer3130 for sharing the draft of the CP.
Section 10.4.3.3.5 states:
The payload for a Bulkdata Resource (see Section 10.4.1.1.5) shall contain all the bulkdata for the resource. When the resource is a single Bulkdata URI, the payload will contain the single corresponding element. When the resource is a Study, Series or Instance Bulkdata resource, the payload will contain all the bulkdata of the corresponding instance(s). Bulkdata in a multipart response shall have a Content-Location header field that corresponds to the URI contained in the corresponding Element in the Metadata.
The wording is indeed confusing and should be improved in my opinion. I think the confusion is primarily due to the fact that the term "bulkdata" is used to refer to two different things:
- Bulkdata resources that can be accessed via the DICOMweb API at the
/bulkdataendpoints. In this case, a resource represents a collection of DICOM Data Elements. - Bulkdata values that can be accessed via a Bulkdata URI that's included in a JSON metadata resource. In this case, a resource represents the value of an individual DICOM Data Element.
Table 10.4.4-1 explicitly requires the media type of any bulkdata resource to be multipart/related; type="application/octet-stream".
For (1), it has to be multi-part because a study, series, and instance may contain more than one bulkdata element. Since an element may have a value multiplicity greater than one and values in JSON metadata are arrays, I tend argue that the payload of the response messages shall also be multi-part for (2). In general, the client may not know whether the value is single or multi-part.
@hackermd - the problem with things like: Table 10.4.4-1 explicitly requires the media type of any bulkdata resource to be multipart/related; type="application/octet-stream". is that they aren't definitive because there are other tables that allow it like: https://dicom.nema.org/medical/dicom/current/output/html/part18.html#table_8.7.3-4 as well as other places that suggest that it is required to support single part bulkdatauri requests.
The general approach of the DICOM standards committee to things that have been allowed by the standard, such as both the relative URI and single part bulkdata is to try to clarify how it is used, but not restrict things that were previously allowed under a reasonable reading of the standard.
Returning single part is extremely useful for anything that does a direct render like PDF, video, audio etc, and having the dcmjs library able to deal with either type is useful for when one sometimes wants to retrieve things for direct render and sometimes does background retrieves. In theory, the implementation on the DICOMweb side has to handle both, but implementations such as the static wado that stores the data in the expected final format, compressed, are far faster when they don't need to wrap and uncompress the data.
Table 10.4.4-1 explicitly requires the media type of any bulkdata resource to be multipart/related; type="application/octet-stream". is that they aren't definitive because there are other tables that allow it like: https://dicom.nema.org/medical/dicom/current/output/html/part18.html#table_8.7.3-4
I think there is a misunderstanding. The "Media Type" and "Transfer Syntax" columns of the tables in this section of the standard may refer to individual message parts, i.e., multipart/related; type=${MediaType}; transfer-syntax=${TransferSyntax}.
Section 8.7.3 DICOM Media Type Sets states:
Depending on the service, the media types may be single part or multipart, and may have required or optional Transfer Syntax and/or character set parameters.
Returning single part is extremely useful for anything that does a direct render like PDF, video, audio etc
Rendered resources (endpoint /rendered) have always been single-part!
The
"Value"field of a DICOM JSON representation of a Data Element is an array (see F.2.2 DICOM JSON Model Object Structure and F.2.4 DICOM JSON Value Multiplicity). If the object contains the"BulkDataURI"instead of the"Value"field, then it would also make sense for the response message to the bulkdata retrieve request to be multipart.
That is a misunderstanding of the multiplicity and the binary representation - the BulkDataURI refers to a single object that contains the entire Value, all of the various parts in however many bytes plus multiplicity of values it contains. That is, if one has the value: CT\MR Then the literal bulk data URI value would be "CT\MR " (zero character at end for padding), without multiple values etc.
That is a misunderstanding of the multiplicity and the binary representation - the BulkDataURI refers to a single object that contains the entire Value, all of the various parts in however many bytes plus multiplicity of values it contains.
The BulkDataURI does not point to a single object, but to a single resource, which is defined to be a multi-part message.
I think you are right about the fact that values of a data element with value multiplicity greater than one would in principle be encoded in one message part: https://dicom.nema.org/medical/dicom/current/output/chtml/part19/chapter_A.html#para_2b996d78-0885-405e-a398-275654546c4b
However, data elements with a binary value representation (or large text) generally have a value multiplicity of 1 anyways.
The main reason the messages are multi-part is probably because the Pixel Data element may have multiple frames (and frame fragments). Each frame would need to be encoded as a separate message part.
That is a misunderstanding of the multiplicity and the binary representation - the BulkDataURI refers to a single object that contains the entire Value, all of the various parts in however many bytes plus multiplicity of values it contains.
The BulkDataURI does not point to a single object, but to a single resource, which is defined to be a multi-part message.
I think you are right about the fact that values of a data element with value multiplicity greater than one would in principle be encoded in one message part: https://dicom.nema.org/medical/dicom/current/output/chtml/part19/chapter_A.html#para_2b996d78-0885-405e-a398-275654546c4b
However, data elements with a binary value representation (or large text) generally have a value multiplicity of 1 anyways.
The main reason the messages are multi-part is probably because the Pixel Data element may have multiple frames (and frame fragments). Each frame would need to be encoded as a separate message part.
@hackermd Now that there is an actual change to the DICOM standard to support being able to fetch using either method, what's left to merge this change? I'll also need this change for a different project and I want to help to get this merged.
That is a misunderstanding of the multiplicity and the binary representation - the BulkDataURI refers to a single object that contains the entire Value, all of the various parts in however many bytes plus multiplicity of values it contains.
The BulkDataURI does not point to a single object, but to a single resource, which is defined to be a multi-part message. I think you are right about the fact that values of a data element with value multiplicity greater than one would in principle be encoded in one message part: https://dicom.nema.org/medical/dicom/current/output/chtml/part19/chapter_A.html#para_2b996d78-0885-405e-a398-275654546c4b However, data elements with a binary value representation (or large text) generally have a value multiplicity of 1 anyways. The main reason the messages are multi-part is probably because the Pixel Data element may have multiple frames (and frame fragments). Each frame would need to be encoded as a separate message part.
@hackermd Now that there is an actual change to the DICOM standard to support being able to fetch using either method, what's left to merge this change? I'll also need this change for a different project and I want to help to get this merged.
I updated the build to current. At this point it is mostly cleanup so that the handling of multipart and singlepart is based on the actual response type rather than having a decode step multiple places.
@pieper @swederik are you guys able to help review/approve this PR? not sure about Markus's availability nowadays.
@pieper @swederik are you guys able to help review/approve this PR? not sure about Markus's availability nowadays.
I read through this but did not test it. If you are okay with the change and nobody else objects I'm fine with merging.
LGTM 🏅 If possible I would like to know if @dclunie has any objections based on previous discussions otherwise I would like to have this merged.
I have no objections.
The final text of CP 2204 which amended the section describing the Media Types, including those for the Bulk Data Resources, should now make it clear that the origin server can return either therefore the client has to support both possibilities, making this PR necessary.
Having said that, I would have been OK with making the client more robust anyway, in the sense of the "Postel principle" (write perfect, read anything).
I have no objections.
The final text of CP 2204 which amended the section describing the Media Types, including those for the Bulk Data Resources, should now make it clear that the origin server can return either therefore the client has to support both possibilities, making this PR necessary.
Having said that, I would have been OK with making the client more robust anyway, in the sense of the "Postel principle" (write perfect, read anything).
@pieper based on this reply + my testing I think we can merge this one.
@hackermd still has one pending request that's possibly been addressed. Since it sounds like this is an improvement I'll merge it and we can open new issues for anything not yet resolved.
@pieper do we have to manually publish the new version or it's should be triggered by a semantic commit?
I think @dannyrb only set up the automatic publishing for dcmjs but not this repo. Maybe somebody who is good with these things could help automate?
I think @dannyrb only set up the automatic publishing for dcmjs but not this repo. Maybe somebody who is good with these things could help automate?
Hi Steve :wave: looks like that dicomweb PR failed to publish a new npm version: https://github.com/dcmjs-org/dicomweb-client/actions/runs/5683626006
[5:28:47 PM] [semantic-release] [@semantic-release/commit-analyzer] › ℹ The commit should not trigger a release
[5:28:47 PM] [semantic-release] [@semantic-release/commit-analyzer] › ℹ Analysis of 1 commits complete: no release
[5:28:47 PM] [semantic-release] › ✔ Completed step "analyzeCommits" of plugin "@semantic-release/commit-analyzer"
[5:28:47 PM] [semantic-release] › ℹ There are no relevant changes, so no new version is released.
There are no relevant changes, so no new version is released.
I just noticed that the last commits from that PR were not in the correct format, that might be the cause
Hmm, I'm not sure either. To my quick review all the commit messages look good: https://github.com/dcmjs-org/dicomweb-client/commits/master
Hmm, I'm not sure either. To my quick review all the commit messages look good: https://github.com/dcmjs-org/dicomweb-client/commits/master
You are right, the squashed commit looks fine. The last published package is from April 24 though. I don't have enough permissions to take a look myself but maybe there's something wrong with the npm account? or the commit-analyzer? (the publish last action said that there were no relevant changes, skipping the version bump)
:tada: This PR is included in version 0.10.3 :tada:
The release is available on:
Your semantic-release bot :package::rocket:
@igoroctaviano Hmm, I guess whatever it was got fixed since that doc ts change resulted in a new release. 🤷