nibabel icon indicating copy to clipboard operation
nibabel copied to clipboard

FIX: Change reader to also support tck files from mrtrix3-dev tckedit

Open mattcieslak opened this issue 4 years ago • 8 comments

The mrtrix3-dev and 3Tissue versions added a new "END" string to the header of tck files that come out of tckedit. This was causing an error in the tck reader. This fix will be compatible with new and old versions of the tck file format.

mattcieslak avatar Sep 14 '20 13:09 mattcieslak

Codecov Report

Merging #957 into master will decrease coverage by 0.04%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #957      +/-   ##
==========================================
- Coverage   91.87%   91.83%   -0.05%     
==========================================
  Files          98       98              
  Lines       12451    12451              
  Branches     2193     2193              
==========================================
- Hits        11439    11434       -5     
- Misses        678      681       +3     
- Partials      334      336       +2     
Impacted Files Coverage Δ
nibabel/streamlines/tck.py 99.46% <100.00%> (ø)
nibabel/nifti1.py 92.10% <0.00%> (-0.46%) :arrow_down:
nibabel/volumeutils.py 84.13% <0.00%> (-0.39%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2f918b1...1c967dd. Read the comment docs.

codecov[bot] avatar Sep 14 '20 13:09 codecov[bot]

Attaching an example tck file (3Tissue-tckedit-streamlines-ex.tck) that generates an error with the tck reader. This file was generated with 3Tissue's tckedit @thijsdhollander

And here is the error:

In [6]: nibabel.streamlines.load('3Tissue-tckedit-streamlines-ex.tck')

ValueError Traceback (most recent call last) in ----> 1 nibabel.streamlines.load('3Tissue-tckedit-streamlines-ex.tck')

~/miniconda3/lib/python3.7/site-packages/nibabel/streamlines/init.py in load(fileobj, lazy_load) 93 raise ValueError("Unknown format for 'fileobj': {}".format(fileobj)) 94 ---> 95 return tractogram_file.load(fileobj, lazy_load=lazy_load) 96 97

~/miniconda3/lib/python3.7/site-packages/nibabel/streamlines/tck.py in load(cls, fileobj, lazy_load) 138 have referred to a corner. 139 """ --> 140 hdr = cls._read_header(fileobj) 141 142 if lazy_load:

~/miniconda3/lib/python3.7/site-packages/nibabel/streamlines/tck.py in _read_header(fileobj) 320 321 # Build header dictionary from the buffer. --> 322 hdr = dict(item.split(': ') for item in buf.rstrip().split('\n')[:-1]) 323 hdr[Field.MAGIC_NUMBER] = magic_number 324

ValueError: dictionary update sequence element #1 has length 1; 2 is required

3Tissue-tckedit-streamlines-ex.tck.zip

valeriejill avatar Sep 14 '20 13:09 valeriejill

The END is already accounted for with the [:-1]. The failing line is 'tckedit ZAPR01-FOD-Template-AmygdalaTargetMask-LHAmygdala-tracts.tck -number 5 tckedit-streamlines.tck (version=3Tissue_v5.2.8)'.

I would imagine what you want is:

>>> print(hdr['command_history'])
tckgen -angle 22.5 -maxlen 250 -minlen 10 -power 1.0 /mnt/jux/cnds/ZAPR01/mrtrix_ss3t_csd/FOD_PopulationTemplate/ZAPR01_FOD_Template.mif -seed_image /mnt/jux/cnds/ZAPR01/mrtrix_ss3t_csd/FOD_PopulationTemplate/ZAPR01_FOD_Template_Mask.mif -mask /mnt/jux/cnds/ZAPR01/mrtrix_ss3t_csd/FOD_PopulationTemplate/ZAPR01_FOD_Template_Mask.mif -select 2000000 -cutoff 0.10 /mnt/jux/cnds/ZAPR01/mrtrix_ss3t_csd/Tractography_PopulationTemplate/ZAPR01-FOD-Template-tractography-newparams.tck  (version=3Tissue_v5.2.8)
tckedit ZAPR01-FOD-Template-AmygdalaTargetMask-LHAmygdala-tracts.tck -number 5 tckedit-streamlines.tck  (version=3Tissue_v5.2.8)

The current approach will lose that second line. I think we probably need to treat command_history as a special case (unless any field is permitted to have newlines?) and append lines to a list until another key: value line is found.

effigies avatar Sep 14 '20 15:09 effigies

Since command history isn't a standard part of streamlines data, I think it's reasonable to not fully support it. There definitely isn't anything comparable that will come out of trk headers

mattcieslak avatar Sep 14 '20 16:09 mattcieslak

In general we do try to preserve format-specific metadata. I would rather not drop it when the fix is pretty easy:

        hdr = {}
        key = None
        for line in buf.rstrip().split("\n")[:-1]:
            fields = line.split(": ", 1)
            if len(fields) == 2:
                key = fields[0]
                hdr[key] = fields[1]
            else:
                hdr[key] += f"\n{fields[0]}"

(As an example. It might be doable with a little more readability.)

effigies avatar Sep 14 '20 19:09 effigies

Matt, WDYT about supporting multi-line header fields? I can submit a PR against your branch if you would prefer.

effigies avatar Sep 18 '20 12:09 effigies

I agree with @effigies, we should preserve format-specific metadata as much as possible. I think what you propose is the right direction.

MarcCote avatar Dec 03 '20 04:12 MarcCote

I completely overlooked this thread, even though I was tagged early on (my bad)! In any case, the command_history thing was indeed something that was a temporary reality in mrtrix3-dev, at the point where mrtrix3tissue branched off it. Eventually, mrtrix3tissue will catch up again when I get round to add/improve some other stuff; but I've in the meantime had the same issue / request reported by others who were urgently looking to make these .tck files play well with NiBabel. To patch for the time being, I've cherry picked the relevant existing fixes from mrtrix3 in a patch, and marked it with a 5.2.9 release to document this (https://github.com/3Tissue/MRtrix3Tissue/releases/tag/3Tissue_v5.2.9).

Note that this solution also didn't play 100% perfectly with NiBabel, but at least files weren't marked as corrupted anymore. Before the fix, the issue was indeed a newline character within the value of a specific key/field, leading to it being marked as corrupt. After the fix, the new reality now is multiple lines with the same key ('command_history'), which currently is not marked as corrupt by NiBabel, but all but the first line are ignored by NiBabel, I believe.

So to be "ultimately" compatible or robust, either case (multi-line header fields as well as duplicate keys) would best be detected and essentially be treated in the same way, conceptually as a "multi-line field".

thijsdhollander avatar Dec 03 '20 05:12 thijsdhollander