dicom-rs
dicom-rs copied to clipboard
PS3.5 H3.2 compliant
H3 of PS3.5 indicates that multiple encoding formats are applied within the Patient Name. https://dicom.nema.org/medical/dicom/current/output/chtml/part05/sect_H.3.html https://dicom.nema.org/medical/dicom/current/output/chtml/part05/sect_H.3.2.html
Currently, although it is possible to obtain multiple Specification Character Sets as Strs, decoding in the above case seems to decode at the first element of the Specification Character Set.
I would like to be able to decode the Patient Name in this PR in the correct way when it is encoded in multiple encoding formats.
It is likely that PS3.5 will be able to promptly support PS3.5 Chapter I Korean and Chapter J Chinese by utilizing this support.
@Enet4 Hello. I would like to do what we could not do in PR #444 here.
Let me ask you two questions.
- I have a function called read_value_pns. I want to execute this function when
VR::PN. If Patient Name does not contain delimiter =,read_value_pnsexecutesread_value_strsas before. Is this an acceptable method of implementation for you? If this is a way to deviate from the design concept, I would be glad to get better advice.- ~~If Patient Name does not contain delimiter =, read_value_pns executes read_value_str as before.~~ I found that
read_value_pnsreads the value from the buffer, so it would be a bug ifread_value_strswas executed afterwards. I will come up with a fix for this.
- ~~If Patient Name does not contain delimiter =, read_value_pns executes read_value_str as before.~~ I found that
- (If the answer to the first question is yes) Is there any way to get all the values of a
Specification Character Setin aread_value_pns? The current implementation ofStatefulDecoderkeeps only the first one, even if there are multipleSpecific Character Sets, as shown here. Since changingself.textfromSpecificCharacterSettoVec<SpecificCharacterSet>is a big change, if there is another way to get the value of theSpecification Character SetI would prefer to use that method. I would prefer to use that method if I can get theSpecification Character Setvalue another way.
~~If Patient Name does not contain delimiter =, read_value_pns executes read_value_str as before.~~ I found that read_value_pns reads the value from the buffer, so it would be a bug if read_value_strs was executed afterwards. I will come up with a fix for this.
The process after reading information from the buffer of read_value_strs is split into separate processes named read_value_strs_impl, which can be called from both read_value_strs and read_value_pns.
https://github.com/Enet4/dicom-rs/pull/445/files#diff-65231185718d602fd3a63db21c6076af076fd9afb6c3d4303c6baeaffa2e1e13R385
(If the answer to the first question is yes) Is there any way to get all the values of a Specification Character Set in a read_value_pns? The current implementation of StatefulDecoder keeps only the first one, even if there are multiple Specific Character Sets, as shown here. Since changing self.text from SpecificCharacterSet to Vec<SpecificCharacterSet> is a big change, if there is another way to get the value of the Specification Character Set I would prefer to use that method. I would prefer to use that method if I can get the Specification Character Set value another way.
- I added a member of type
Vec<TC>calledcsto StatefulDecoder, whereSpecification Character Setis added each timeset_character_setis executed. And modifiedread_value_pnsto decode the character using the value ofcs. - I understand that
set_character_setis executed byread_value_cs. Fixedset_character_setto run on all character encoding information present in theSpecification Character Set, sinceread_value_cswas implemented in such a way that only the first element was added to thetext. Fixed.
@Enet4 Implementation of this PR is complete. I would appreciate a review when you have time.
Hello, @9enki! I'll be looking into DICOM-rs matters the upcoming weeks after a small hiatus.
- I have a function called read_value_pns. I want to execute this function when
VR::PN. If Patient Name does not contain delimiter =,read_value_pnsexecutesread_value_strsas before. Is this an acceptable method of implementation for you? If this is a way to deviate from the design concept, I would be glad to get better advice.
The way I see it, the concept of Person Name should include all component groups, rather than just one (alphabetic, ideographic, and phonetic), so they would all sit in a single PrimitiveValue. So, once we find out that we have component groups other than the alphabetical one, we decode each group and merge them back together. From the little I saw of the code so far, that seems to be the intended way to go. In any case, I am likely to give it another look later.
- Is there any way to get all the values of a
Specification Character Setin aread_value_pns? The current implementation ofStatefulDecoderkeeps only the first one, even if there are multipleSpecific Character Sets, as shown here. Since changingself.textfromSpecificCharacterSettoVec<SpecificCharacterSet>is a big change, if there is another way to get the value of theSpecification Character SetI would prefer to use that method. I would prefer to use that method if I can get theSpecification Character Setvalue another way.
The main problem is that SpecificCharacterSet is too simple right now. The way forward is to adjust this data type to support more than one character set value. The steps would be something like this:
- The existing idea of a
SpecificCharacterSetwould become aCharacterSetCode(a single encoding). SpecificCharacterSetis recreated as an opaque structure that can hold up to n codes (depending on which they are; inISO_IR 192and some others, only 1 value would be acceptable).
Aside from the fact that one would no longer be able to introspect into the variants available, I think that most of the API would stay intact.
#[derive(Debug)]
pub struct SpecificCharacterSet(SmallVec<[CharacterSetCode; 2]>);
As we are now gathering enough material for a major version (0.7), we can proceed with breaking changes on SpecificCharacterSet and PersonName to fulfill these needs.
If any further guidance is needed, feel free to let me know.
Hello again. This contribution is a candidate for inclusion in 0.7.0, but then we need to make the proposed change of changing SpecificCharacterSet so that the type becomes opaque and a single value encompasses the possibility of multiple character set codings. Please let me know if you would be able to fulfill this or you would prefer someone else to take over.
Just letting you know that #489 was merged. It made SpecificCharacterSet opaque, so further contributions on text encoding can be brought in without breaking changes to the API.