highdicom
highdicom copied to clipboard
LegacyConvertedEnhancedMRImage does not copy Private Tag Creator
Thanks for providing a converted for 'CLASSIC' MR Image Storage instances.
I am confused with the documentation for using this class, here is what I did (debian/buster):
$ pip3 install highdicom
$ mkdir /tmp/mr
$ cp gdcmData/*FileSeq* /tmp/mr
$ python3 conv.py
/home/mathieu/.local/lib/python3.7/site-packages/pydicom/dataset.py:1981: UserWarning: Camel case attribute 'DICOMPrefix' used which is not in the element keyword data dictionary
warnings.warn(msg)
/home/mathieu/.local/lib/python3.7/site-packages/pydicom/dataset.py:1981: UserWarning: Camel case attribute 'FilePreamble' used which is not in the element keyword data dictionary
warnings.warn(msg)
/home/mathieu/.local/lib/python3.7/site-packages/pydicom/dataset.py:1981: UserWarning: Camel case attribute 'FrameVolumeBasedCalculationTechnique' used which is not in the element keyword data dictionary
warnings.warn(msg)
Which resulted in (truncated):
(0020,9171) SQ (Sequence with explicit length #=1) # 568, 1 UnassignedPerFrameConvertedAttributesSequence
(fffe,e000) na (Item with explicit length #=9) # 560, 1 Item
(0009,1015) ?? 30\33\33\53\36\39\4d\52\30\31\32\30\30\32\30\36\31\39\31\38\32\33... # 26, 1 Unknown Tag & Data
(0019,1212) ?? 30\30\31\2e\31\38\32\36\39\36\45\2d\34\32 # 14, 1 Unknown Tag & Data
(0020,0030) DS [-01.190625E+02\-1.190625E+02\-3.553830E+01] # 42, 3 RETIRED_ImagePosition
(0020,0050) DS [003.553830E+01] # 14, 1 RETIRED_Location
(0020,1041) DS [003.553830E+01] # 14, 1 SliceLocation
(0021,1160) ?? 30\30\30\2e\30\30\30\30\30\30\45\2b\30\30\5c\30\30\2e\30\30\30\30... # 42, 1 Unknown Tag & Data
(0021,1163) ?? 30\30\33\2e\35\35\33\38\33\30\45\2b\30\31 # 14, 1 Unknown Tag & Data
(0021,1342) ?? 20\20\20\20\31\35 # 6, 1 Unknown Tag & Data
(0051,1010) ?? 30\30\33\37\32\37\36\5c\46\20\37\39\59\5c\48\2d\53\50\2d\43\52\5c... # 316, 1 Unknown Tag & Data
(fffe,e00d) na (ItemDelimitationItem for re-encoding) # 0, 0 ItemDelimitationItem
The output file is missing Private Creator for nested DataSet.
If using ExplicitVRLittleEndian, here is what I get:
$ dcmdump enh.dcm
W: DcmItem: Non-standard VR ' ' (0a\00) encountered while parsing element (0008,0005), assuming 2 byte length field
W: DcmItem: Non-standard VR 'IR' (49\52) encountered while parsing element (5349,5f4f), assuming 4 byte length field
E: DcmElement: Unknown Tag & Data (5349,5f4f) larger (536624) than remaining bytes in file
E: dcmdump: I/O suspension or premature end of stream: reading file: enh.dcm
Could you please add a section in the documentation on how to use LegacyConvertedEnhancedMRImage class ? Thanks much.
For reference:
$ cat conv.py
from pathlib import Path
from pydicom.filereader import dcmread
from pydicom import uid
from highdicom.legacy.sop import LegacyConvertedEnhancedMRImage
series_dir = Path('/tmp/mr')
image_files = series_dir.glob('*.dcm')
image_datasets = [dcmread(str(f)) for f in image_files]
enh = LegacyConvertedEnhancedMRImage(image_datasets, "1.2.3", "1", "4.5.6", "2")
# enh.file_meta.TransferSyntaxUID = uid.ExplicitVRLittleEndian
enh.save_as("enh.dcm")
@malaterre we have been working on the improvements to the legacy enhanced MF conversion functionality in #34. It is still not finished, but we are actively using and refining that PR as part of the effort to convert a subset of TCIA content into MF representation. Maybe that PR could be of interest to you as well.
@malaterre thanks for testing and for your feedback.
@fedorov I was wondering whether we should create a smaller PR for a subset of the changes included to #34 to address some of the issues with the LegacyConvertedEnhancedMRImage
?
Markus, @afshinmessiah is working on revisiting the PR mentioned above. We should be able to discuss with you soon the response to your comments, and see if we can include a subset of changes. I am meeting with Afshin today, and will will discuss this topic.
@fedorov @malaterre by looking at the provided output, I am not sure what the problem is and whether it's a problem in highdicom or pydicom. @malaterre what happens when you just read the individual image files into pydicom datasets and write them back to disk? Does this work?
@hackermd I believe this is related to the pydicom version I was using. Maybe highdicom
depends on a minimal pydicom version. That may solve the issue with the Explicit TS I was using (wild guess, I do not use pydicom).
As for the Implicit TS report, I believe there is a simple fix to include in the nested sequence (maybe solved in #34 ?) where Private Creator needs to be copied as well.
I can reproduce the issue with 0.7.0 today:
(0020,9171) SQ (Sequence with explicit length #=1) # 568, 1 UnassignedPerFrameConvertedAttributesSequence
(fffe,e000) na (Item with explicit length #=9) # 560, 1 Item
(0009,1015) ?? 30\33\33\53\36\39\4d\52\30\31\32\30\30\32\30\36\31\39\31\38\32\33... # 26, 1 Unknown Tag & Data
(0019,1212) ?? 30\30\36\2e\32\32\31\37\36\35\45\2d\34\33 # 14, 1 Unknown Tag & Data
(0020,0030) DS [-01.190625E+02\-1.190625E+02\-3.003830E+01] # 42, 3 RETIRED_ImagePosition
(0020,0050) DS [003.003830E+01] # 14, 1 RETIRED_Location
(0020,1041) DS [003.003830E+01] # 14, 1 SliceLocation
(0021,1160) ?? 30\30\30\2e\30\30\30\30\30\30\45\2b\30\30\5c\30\30\2e\30\30\30\30... # 42, 1 Unknown Tag & Data
(0021,1163) ?? 30\30\33\2e\30\30\33\38\33\30\45\2b\30\31 # 14, 1 Unknown Tag & Data
(0021,1342) ?? 20\20\20\20\20\35 # 6, 1 Unknown Tag & Data
(0051,1010) ?? 30\30\33\37\32\37\36\5c\46\20\37\39\59\5c\48\2d\53\50\2d\43\52\5c... # 316, 1 Unknown Tag & Data
(fffe,e00d) na (ItemDelimitationItem for re-encoding) # 0, 0 ItemDelimitationItem
@malaterre sorry for my delayed reply. We are currently focusing on the decoding API and will get back to legacy conversion once https://github.com/MGHComputationalPathology/highdicom/pull/93 is completed.
Please note that with the next release the package will depend on pydicom>=2.2
. The latest pydicom release included several bug fixes and features that we now rely on, including better checks for value representations.
@hackermd The issue is right here:
for tag, dataelements in unassigned_dataelements.items():
values = [str(da.value) for da in dataelements]
unique_values = set(values)
if len(unique_values) == 1:
unassigned_shared_ca_item.add(dataelements[0])
else:
for i, da in enumerate(dataelements):
unassigned_perframe_ca_items[i].add(da)
So typical Private Creator (eg. (0009,0012) LO [SIEMENS CM VA0 CMS] # 20, 1 PrivateCreator
), only end up in unassigned_shared_ca_item
(obviously). I suspect changing it to the following should work (pseudo code):
if len(unique_values) == 1:
unassigned_shared_ca_item.add(dataelements[0])
else:
for i, da in enumerate(dataelements):
if( da.is_private ):
unassigned_perframe_ca_items[i].add( get_private_creator_for(unassigned_shared_ca_item, da) )
unassigned_perframe_ca_items[i].add(da)
Documented at:
- http://dicom.nema.org/medical/dicom/current/output/chtml/part03/sect_C.7.6.16.2.html#sect_C.7.6.16.2.25
The necessary Private Creator Data Element within each Sequence Item accompanies Private Data Elements. There is no requirement to preserve the private block of Data Elements used in the Classic images to be converted. Nor is there a requirement that a Private Data Element use the same private block in all of the Classic images.