nibabel
nibabel copied to clipboard
FIX: Change reader to also support tck files from mrtrix3-dev tckedit
The mrtrix3-dev and 3Tissue versions added a new "END"
string to the header of tck files that come out of tckedit
. This was causing an error in the tck reader. This fix will be compatible with new and old versions of the tck file format.
Codecov Report
Merging #957 into master will decrease coverage by
0.04%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #957 +/- ##
==========================================
- Coverage 91.87% 91.83% -0.05%
==========================================
Files 98 98
Lines 12451 12451
Branches 2193 2193
==========================================
- Hits 11439 11434 -5
- Misses 678 681 +3
- Partials 334 336 +2
Impacted Files | Coverage Δ | |
---|---|---|
nibabel/streamlines/tck.py | 99.46% <100.00%> (ø) |
|
nibabel/nifti1.py | 92.10% <0.00%> (-0.46%) |
:arrow_down: |
nibabel/volumeutils.py | 84.13% <0.00%> (-0.39%) |
:arrow_down: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 2f918b1...1c967dd. Read the comment docs.
Attaching an example tck file (3Tissue-tckedit-streamlines-ex.tck) that generates an error with the tck reader. This file was generated with 3Tissue's tckedit @thijsdhollander
And here is the error:
In [6]: nibabel.streamlines.load('3Tissue-tckedit-streamlines-ex.tck')
ValueError Traceback (most recent call last)
~/miniconda3/lib/python3.7/site-packages/nibabel/streamlines/init.py in load(fileobj, lazy_load) 93 raise ValueError("Unknown format for 'fileobj': {}".format(fileobj)) 94 ---> 95 return tractogram_file.load(fileobj, lazy_load=lazy_load) 96 97
~/miniconda3/lib/python3.7/site-packages/nibabel/streamlines/tck.py in load(cls, fileobj, lazy_load) 138 have referred to a corner. 139 """ --> 140 hdr = cls._read_header(fileobj) 141 142 if lazy_load:
~/miniconda3/lib/python3.7/site-packages/nibabel/streamlines/tck.py in _read_header(fileobj) 320 321 # Build header dictionary from the buffer. --> 322 hdr = dict(item.split(': ') for item in buf.rstrip().split('\n')[:-1]) 323 hdr[Field.MAGIC_NUMBER] = magic_number 324
ValueError: dictionary update sequence element #1 has length 1; 2 is required
The END
is already accounted for with the [:-1]
. The failing line is 'tckedit ZAPR01-FOD-Template-AmygdalaTargetMask-LHAmygdala-tracts.tck -number 5 tckedit-streamlines.tck (version=3Tissue_v5.2.8)'.
I would imagine what you want is:
>>> print(hdr['command_history'])
tckgen -angle 22.5 -maxlen 250 -minlen 10 -power 1.0 /mnt/jux/cnds/ZAPR01/mrtrix_ss3t_csd/FOD_PopulationTemplate/ZAPR01_FOD_Template.mif -seed_image /mnt/jux/cnds/ZAPR01/mrtrix_ss3t_csd/FOD_PopulationTemplate/ZAPR01_FOD_Template_Mask.mif -mask /mnt/jux/cnds/ZAPR01/mrtrix_ss3t_csd/FOD_PopulationTemplate/ZAPR01_FOD_Template_Mask.mif -select 2000000 -cutoff 0.10 /mnt/jux/cnds/ZAPR01/mrtrix_ss3t_csd/Tractography_PopulationTemplate/ZAPR01-FOD-Template-tractography-newparams.tck (version=3Tissue_v5.2.8)
tckedit ZAPR01-FOD-Template-AmygdalaTargetMask-LHAmygdala-tracts.tck -number 5 tckedit-streamlines.tck (version=3Tissue_v5.2.8)
The current approach will lose that second line. I think we probably need to treat command_history
as a special case (unless any field is permitted to have newlines?) and append lines to a list until another key: value
line is found.
Since command history isn't a standard part of streamlines data, I think it's reasonable to not fully support it. There definitely isn't anything comparable that will come out of trk headers
In general we do try to preserve format-specific metadata. I would rather not drop it when the fix is pretty easy:
hdr = {}
key = None
for line in buf.rstrip().split("\n")[:-1]:
fields = line.split(": ", 1)
if len(fields) == 2:
key = fields[0]
hdr[key] = fields[1]
else:
hdr[key] += f"\n{fields[0]}"
(As an example. It might be doable with a little more readability.)
Matt, WDYT about supporting multi-line header fields? I can submit a PR against your branch if you would prefer.
I agree with @effigies, we should preserve format-specific metadata as much as possible. I think what you propose is the right direction.
I completely overlooked this thread, even though I was tagged early on (my bad)! In any case, the command_history
thing was indeed something that was a temporary reality in mrtrix3-dev, at the point where mrtrix3tissue branched off it. Eventually, mrtrix3tissue will catch up again when I get round to add/improve some other stuff; but I've in the meantime had the same issue / request reported by others who were urgently looking to make these .tck
files play well with NiBabel. To patch for the time being, I've cherry picked the relevant existing fixes from mrtrix3 in a patch, and marked it with a 5.2.9 release to document this (https://github.com/3Tissue/MRtrix3Tissue/releases/tag/3Tissue_v5.2.9).
Note that this solution also didn't play 100% perfectly with NiBabel, but at least files weren't marked as corrupted anymore. Before the fix, the issue was indeed a newline character within the value of a specific key/field, leading to it being marked as corrupt. After the fix, the new reality now is multiple lines with the same key ('command_history'), which currently is not marked as corrupt by NiBabel, but all but the first line are ignored by NiBabel, I believe.
So to be "ultimately" compatible or robust, either case (multi-line header fields as well as duplicate keys) would best be detected and essentially be treated in the same way, conceptually as a "multi-line field".