Solution for exercise 2 in basic_data_input.ipynb does not demonstrate how to use SelectFiles

Open Shotgunosine opened this issue 7 years ago • 1 comments

The provided solution for exercise 2 in basic_data_input.ipynb is as follows:

from nipype import SelectFiles, Node

# String template with {}-based strings
templates = {'anat': 'sub-01/ses-*/anat/sub-01_ses-*_T1w.nii.gz'}
             
# Create SelectFiles node
sf = Node(SelectFiles(templates),
          name='selectfiles')

# Location of the dataset folder
sf.inputs.base_directory = '/data/ds000114'

#sf.inputs.ses_name = 

sf.run().outputs

In particular, the template line does not use the string formatting feature of SelectFiles at all. It seems that the SelectFiles interface differs from the DataGrabber interface in that it doesn't automatically expand lists in its inputs. It may be that SelectFiles is just busted, but the only way I could get SelectFiles to accept an iterable input was to wrap it in a MapNode like this:

from nipype import MapNode, SelectFiles
template = {'anat': 'sub-{subject_id:02d}/ses-{ses_name}/anat/*_T1w.nii.gz'}

sf = MapNode(SelectFiles(template, 
                 force_lists=True,
                 base_directory='/data/ds000114/'),
             iterfield=['subject_id', 'ses_name'],
             name='select_files')
sf.inputs.subject_id = [1, 1]
sf.inputs.ses_name = ['test', 'retest']

sf_res = sf.run()
sf_res.outputs

This seems like kind of a crappy solution though, because if you want to get more than one subject you're typing out:

sf.inputs.subject_id = [1, 1, 2, 2]
sf.inputs.ses_name = ['test', 'retest', 'test', 'retest']
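
One way to avoid typing the paired lists by hand is to generate them. The following is a minimal sketch in plain Python, using the subject and session values from the lines above; the names `subjects`, `sessions`, `pairs`, `subject_ids`, `ses_names` and `expanded` are introduced here purely for illustration. It also shows what the template would expand to for each pair, on the assumption that SelectFiles fills its templates with `str.format`-style substitution before globbing.

```python
from itertools import product

# Values taken from the example above; the variable names are illustrative.
subjects = [1, 2]
sessions = ['test', 'retest']

# Every (subject, session) combination, in subject-major order.
pairs = list(product(subjects, sessions))

# Split the pairs into the two parallel lists that a MapNode with
# iterfield=['subject_id', 'ses_name'] expects.
subject_ids = [subj for subj, _ in pairs]
ses_names = [ses for _, ses in pairs]

# Template from the MapNode example. str.format stands in for the
# expansion SelectFiles is assumed to perform before globbing.
template = 'sub-{subject_id:02d}/ses-{ses_name}/anat/*_T1w.nii.gz'
expanded = [template.format(subject_id=subj, ses_name=ses)
            for subj, ses in pairs]

print(subject_ids)
print(ses_names)
for path in expanded:
    print(path)
```

The generated `subject_ids` and `ses_names` lists could then be assigned to `sf.inputs.subject_id` and `sf.inputs.ses_name` in place of the hand-written ones.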

Alternatively, there is a solution using iterables and a JoinNode, but it's pretty ugly and completely specific to the case of just grabbing anats:

# write your solution here

from nipype import JoinNode, Node, Workflow, Function, SelectFiles
template = {'anat': 'sub-{subject_id:02d}/ses-{ses_name}/anat/*_T1w.nii.gz'}

sf = Node(SelectFiles(template, 
                 force_lists=True,
                 base_directory='/data/ds000114/'),
             name='select_files')
sf.iterables = [('subject_id', (1,2)),
                ('ses_name', ('test', 'retest'))]

combine = lambda anat:list(anat)
jn = JoinNode(Function(input_names=['anat'],
                       output_names=['anat'],
                       function=combine),
              name='join_anat',
              joinsource=sf,
              joinfield='anat'
             )

def unpack_list(x):
    out_list = []
    for xx in x:
        out_list.extend(xx)
    return out_list

un = Node(Function(input_names=['x'],
                   output_names=['anat'],
                   function=unpack_list),
         name='unpack_anat')
wfsf = Workflow('sf_iterable', base_dir='/output/working_dir/')
wfsf.connect([(sf, jn, [('anat', 'anat')]),
              (jn, un, [('anat', 'x')])])
sf_res = wfsf.run()
print([nn.result.outputs for nn in list(sf_res.nodes) if nn.name == 'unpack_anat'])
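
The flattening done by unpack_list is ordinary list-of-lists flattening, so it can be checked on its own without running the workflow. Here is a sketch of an equivalent version using `itertools.chain.from_iterable`; the file names in `joined` are placeholders made up for illustration, not actual outputs from the dataset. The import sits inside the function body on the assumption that the function may later be wrapped in a nipype Function interface, which runs the function in its own namespace.

```python
def unpack_list(x):
    # Import inside the function body so the function stays self-contained
    # if it is wrapped in a nipype Function interface later.
    from itertools import chain
    # Flatten exactly one level of nesting: [[a], [b, c]] -> [a, b, c].
    return list(chain.from_iterable(x))


# Placeholder file names standing in for the joined SelectFiles outputs
# (one inner list per iteration, because force_lists=True).
joined = [
    ['sub-01_ses-test_T1w.nii.gz'],
    ['sub-01_ses-retest_T1w.nii.gz'],
    ['sub-02_ses-test_T1w.nii.gz'],
    ['sub-02_ses-retest_T1w.nii.gz'],
]

flat = unpack_list(joined)
print(flat)
```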

So my question is, what's going on here? I feel like I must be misunderstanding how to use SelectFiles.

Also, the tutorial should probably point out the potential mistake of specifying base_dir in the SelectFiles definition instead of base_directory.

Oct 03 '18 21:10 Shotgunosine

Hi @Shotgunosine - That's a very good question! My intuition was to either use a MapNode or to have a node with iterables that feeds the inputs from a list to the node one at a time.

But I think there's an even simpler solution to this: I think SelectFiles already comes with iterables functionality, without using MapNode. An example of this can be found here. But to be honest, I wasn't able to recreate this behavior in a standalone script. Did you have any luck?

There's also another approach shown here but I'm not sure if this is from an older version, as it doesn't work for me anymore.

Oct 04 '18 17:10 miykael