
unequal number of control and label

Open a3sha2 opened this issue 4 years ago • 8 comments

@caugolm

a3sha2 avatar Aug 27 '20 13:08 a3sha2

Usually this is because an M0 was added to the image series, a scan was broken off prematurely, or not all DICOMs were copied correctly. It should be easy to find out which of these was the issue.

HenkMutsaerts avatar Sep 23 '20 20:09 HenkMutsaerts

Following up on this (belatedly): @a3sha2, thanks for the new release (I just built 0.2.6.1)! I've been testing ASLPREP with my "unequal" label/control dataset again. It looks like:

  1. If I include only an even number of "label" and "control" lines (i.e., "number of volumes - 1") in my aslcontext.tsv, ASLPREP will (quietly) ignore the final (un-paired) volume; this, in my opinion, is the desired behavior.
  2. If I specify `--dummy-vols 1` and specify "number of volumes - 1" in the aslcontext.tsv, ASLPREP seems to remove the first 3 volumes, which is a little confusing to me.
  3. If I specify `--dummy-vols 1` and specify all volumes in the aslcontext.tsv, ASLPREP crashes due to an unequal number of label and control values: `File "/usr/local/miniconda/lib/python3.7/site-packages/aslprep/interfaces/cbf_computation.py", line 162, in _run_interface cbf_data = np.subtract(control_img, label_img) ValueError: operands could not be broadcast together with shapes (64,64,34,14) (64,64,34,15)`.
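For what it's worth, the broadcast error can be reproduced (and avoided) with a small NumPy sketch; the shapes and the truncation step here are illustrative, not ASLPREP's actual code:

```python
import numpy as np

# Toy 4D ASL series with an odd number of volumes (64 x 64 x 34 x 29),
# alternating label/control with label first -> one extra, unpaired label.
asl = np.zeros((64, 64, 34, 29))
label_img = asl[..., 0::2]    # 15 label volumes
control_img = asl[..., 1::2]  # 14 control volumes

# np.subtract(control_img, label_img) would raise the ValueError quoted above,
# since (64,64,34,14) and (64,64,34,15) cannot be broadcast together.

# Truncating both to the shorter series restores a valid pairwise subtraction.
n_pairs = min(control_img.shape[-1], label_img.shape[-1])
cbf_data = np.subtract(control_img[..., :n_pairs], label_img[..., :n_pairs])
```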

In any case, I'm content to specify just the desired volumes and let SCORE remove any outlying label/control pairs (as opposed to hacking off the beginning of the time series), but if `--dummy-vols N` is intended to remove N pairs, as opposed to N single volumes, that might be worth specifying in the documentation.

caugolm avatar Mar 05 '21 17:03 caugolm

Thank you @caugolm. We have released a new version (0.2.7) that forces BIDS validation unless the user wants to skip it. bids-validator 1.6.2 works perfectly with ASL; many thanks to @HenkMutsaerts for pushing this through. I suggest validating your data with the BIDS validator first: sometimes the _aslcontext.tsv may not be proper, and the validator will identify any issues.

Yes, `--dummy-vols` removes pairs of control and label volumes. We will introduce `--dummy-scans`, like in fMRIPrep, to remove the first x single volumes instead of pairs, and both flags can be used together. `--dummy-vols` will be applied before CBF computation, but `--dummy-scans` will be applied before any preprocessing.
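To illustrate the distinction, here is a toy sketch of how the two flags would compose; the helper names and list-of-indices representation are hypothetical, not ASLPREP's API:

```python
# Toy 10-volume series, represented by its volume indices 0..9.
vols = list(range(10))

def apply_dummy_scans(volumes, n):
    """Sketch of --dummy-scans: drop the first n single volumes,
    before any preprocessing."""
    return volumes[n:]

def apply_dummy_vols(volumes, n_pairs):
    """Sketch of --dummy-vols: drop the first n_pairs control/label
    pairs (2 * n_pairs volumes), before CBF computation."""
    return volumes[2 * n_pairs:]

# --dummy-scans 1 followed by --dummy-vols 1 would leave volumes 3..9.
remaining = apply_dummy_vols(apply_dummy_scans(vols, 1), 1)
```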

a3sha2 avatar Mar 07 '21 05:03 a3sha2

Automatically removing any single control or label volume doesn't sound good; you should indeed know whether a dummy scan or M0 is included.

As ASL's label is created fresh and the static tissue signal is subtracted out, you don't need dummy scans like you do with fMRI. However, early Siemens ASL sequences were built as adaptations of an fMRI sequence, incorrectly including dummy scans and other fMRI preprocessing options.

Hope this helps!

HenkMutsaerts avatar Mar 07 '21 07:03 HenkMutsaerts

@a3sha2 Thanks, using the BIDS validator should help!

@HenkMutsaerts Thanks so much for the info! In this particular case, the person programming the sequence asked how much time we wanted for ASL and filled that time with as many volumes as possible, naive to the fact that the sequence alternated between label and control volumes. This resulted in an extra label volume collected for a number of datasets. Once the "error" was noticed, the extra volume was dropped from the sequence (instead of adding an extra, I believe to keep the same number of usable pairs between the two versions of the sequence). So, it's not a "dummy volume" in the traditional sense, but more of an odd-volume-out that we figured would be worth sacrificing in order to keep consistent with more datasets and to be able to do data cleaning with SCORE (which uses the control-label difference images). The M0 for this acquisition is a separate image.

caugolm avatar Mar 07 '21 18:03 caugolm

I see, well you could do both.

The data cleaning may create an additional imbalance, and SNR always differs between volunteers, probably more so than the extra label image; I would always use as much SNR as you can get. I don't have experience with Sudipto's SCORE, but automatic removal of artifacts is tricky, as outlier volumes can also contain physiological information.

Sounds like you got it under control! Let me know if I can be of any help.

HenkMutsaerts avatar Mar 07 '21 19:03 HenkMutsaerts

Thanks so much for your thoughts on this! We have scans from some patients with dementia who tend to move somewhat erratically, and some preliminary data indicates we're doing slightly better after removing some of the "noisy" pairs, but I'd absolutely agree that there's a concern of losing some real signal by censoring anything.

caugolm avatar Mar 08 '21 17:03 caugolm

(I preemptively apologize for how long this ended up)

I’ve been thinking about the dummy scans/volumes issue a bit more, and I think it would be beneficial to have the additional flexibility to censor specified volumes, not just the first ones, for a few reasons:

  1. In the sequence with the uneven number of labels and controls, we’ve tended to expect the final volume would be worse than the first volume. With our patient population, it’s more likely a patient is moving towards the end of the acquisition, as opposed to the beginning, so being able to keep the first volume and drop the last one is preferable to dropping the first and keeping the last.
  2. We have an older pulse sequence with a few hundred scans collected. The pulse sequence had a cyclic lipid-shift artifact so to make things “balance” we need to crop off the final 10 volumes/5 pairs. In other words, the artifact cycle is about 17-18 pairs and we collected 40 pairs total, but later realized analyzing only 35 pairs resulted in better images. When we were first processing this data, we found that the artifact tended to be worse towards the end of the acquisition, so we chose to crop off the final 10 volumes, instead of the first 10. I intend to use ASLPREP to process this data, so allowing for this flexibility would be great for this data, too. For what it’s worth, I know Ted has worked with a similar sequence in the past and used a similar solution, though I’m not sure it’s data he’s continued to work with.
  3. I could also see this as a way for people to potentially use the ASLPREP carpet plots to help decide what data to use: ie, if someone sees a “funny” column or two in the html, they would have a relatively straightforward way to re-process the dataset with the bad one removed.

As to how to actually cope with these issues, I see maybe five ways forward, though I’d love to hear any other ideas:

  1. Add something like a more flexible `--dummy-vols` argument to ASLPREP. Maybe phrasing along the lines of `--volumes-to-censor` would be more accurate. The argument could take an individual value or a list of values specifying the volume number(s) in the time series to remove. I would see this as a relatively straightforward pre-preprocessing step, coded similarly to the current dummy-vols/skip_vols, perhaps with one exception: it could potentially lead to the subtraction order of control-label pairs “flipping” partway through the time series, potentially multiple times. E.g., if someone has 5 volumes, “label control label control label”, and wants to remove the middle “label”, then the subtraction order for the pairs would change. I’d be happy to help write the functions to achieve this if you could provide a little guidance about where to start.
  2. Add some type of volumes-to-censor argument into an fwheudiconv framework/heuristic so the BIDS data ends up with only the volumes we are interested in. I’m not so familiar with fwheudiconv, as it hasn’t supported ASL data (until recently, presumably?), but again, I’d be happy to help make this work. That said, this feels like it violates the BIDS philosophy, because “curation” seems to apply to which acquisitions to use, but not to individual parts of acquisitions.
  3. Write up a sort of “ASL-BIDS cropper” gear that takes the BIDS-compliant output of fwheudiconv and turns it into the BIDS-compliant subset of the data we actually want to analyze. This is a relatively easy thing to do on the local cluster I’m using to run ASLPREP (though, if ASLPREP is destined for FlyWheel this may end up a headache to get run there)
  4. I can hack apart DICOMs offline and basically pretend the data we don’t want to analyze was never acquired. I don’t really like this as it limits flexibility in the future, but on the plus side it’s relatively quick, and then I don’t need to futz with the NIfTIs/JSONs/BIDS or ASLPREP for either of my pulse sequences.
  5. The short-term solution seems to be to turn on the `--skip-bids-validation` flag, specify just the first N volumes I want to use in the aslcontext.tsv, and hope for the best. This is quick and easy at the moment, but I would assume this is the sort of loophole that ends up closed at some point, and it also lacks any “provenance” regarding the decision to do this.
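The pair-flipping concern from option 1 can be sketched as follows; this is illustrative Python, not ASLPREP's implementation, and `pairwise_differences` is a hypothetical helper that uses per-volume scalars as stand-ins for 3D volumes:

```python
def pairwise_differences(context, values, censor=()):
    """Re-pair the volumes remaining after censoring and subtract each
    pair as control - label, regardless of which comes first.

    context: list of 'control'/'label' strings (as in aslcontext.tsv)
    values:  per-volume scalars standing in for 3D volumes
    censor:  volume indices to drop before pairing
    """
    kept = [(c, v) for i, (c, v) in enumerate(zip(context, values))
            if i not in censor]
    diffs = []
    for (c1, v1), (c2, v2) in zip(kept[0::2], kept[1::2]):
        if {c1, c2} != {"control", "label"}:
            raise ValueError("unpaired volumes after censoring")
        # control - label, whichever order the pair happens to be in
        diffs.append(v1 - v2 if c1 == "control" else v2 - v1)
    return diffs

ctx = ["label", "control", "label", "control", "label"]
vals = [10, 12, 9, 11, 8]
# Censoring the middle 'label' (index 2) flips the pair order midway:
# the pairs become (label, control) then (control, label).
diffs = pairwise_differences(ctx, vals, censor={2})
```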

I’d really appreciate any thoughts on this! And again, I’m happy to help get any code for a generalizable solution written up.

caugolm avatar Mar 09 '21 17:03 caugolm