brainiak
FCMA feature selection excludes the best-performing voxel
Hi,
I found a potential issue in the FCMA feature selection step that may lead to excluding the best-performing voxel when selecting the top k voxels.
At the end of fcma_voxel_selection_cv.py:
with open(file_str + 'result_list.txt', 'w') as fp:
    for idx, tuple in enumerate(results):
        fp.write(str(tuple[0]) + ' ' + str(tuple[1]) + '\n')
        # Store the score for each voxel
        score[tuple[0]] = tuple[1]
        seq[tuple[0]] = idx
results is an iterator of tuples: tuple[0] is the voxel ID, which indexes the voxel, and tuple[1] is that voxel's score. The tuples are ranked so that the best-performing voxel comes first; when enumerated, it therefore gets idx = 0. As a result, seq[tuple[0]] = idx assigns the best-performing voxel a rank of 0.
Then fslmaths is used to select the top k voxels, as in make_top_voxel_mask.sh:
for file in ${input_dir}/*_seq.nii.gz
do
    # Preprocess the file name
    fbase=$(basename "$file")
    pref="${fbase%%.*}"
    # Create the voxel mask
    fslmaths $file -uthr $voxel_number -bin ${output_dir}/${pref}_top${voxel_number}.nii.gz
done
-uthr upper-thresholds the input file at voxel_number, zeroing every value above it. For example, if k = 3000, -uthr keeps the voxels with ranks from 0 to 3000, which covers the top-ranked voxels and also all non-brain voxels, since those have the value 0 as well. Then -bin binarizes the file into a mask, excluding every voxel whose value is 0. That removes the non-brain voxels, but it also removes the best-performing voxel, whose value is 0 because its rank is 0. In this way, I believe FCMA feature selection excludes the top-performing voxel.
If I am correct about this issue, the fix should be simple (just add +1 to idx):
with open(file_str + 'result_list.txt', 'w') as fp:
    for idx, tuple in enumerate(results):
        fp.write(str(tuple[0]) + ' ' + str(tuple[1]) + '\n')
        # Store the score for each voxel
        score[tuple[0]] = tuple[1]
        seq[tuple[0]] = idx + 1
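As a quick sanity check of the 1-based ranks, the sketch below builds a toy rank volume with idx + 1 and applies the same threshold-then-binarize logic. It again assumes that -uthr zeroes values above the threshold and -bin keeps values greater than 0; top_k_mask and the toy ranking are hypothetical names and data used only for illustration.

```python
import numpy as np


def top_k_mask(seq, k):
    """Mimic `fslmaths seq -uthr k -bin` under the assumed semantics."""
    kept = np.where(seq > k, 0, seq)   # -uthr: zero anything above k
    return (kept > 0).astype(np.uint8)  # -bin: 1 where value > 0


# Hypothetical toy volume: 10 positions, 6 of which are brain voxels.
n_positions = 10
ranked_voxel_ids = [4, 7, 2, 9, 5, 1]  # best voxel first (made-up order)

seq = np.zeros(n_positions, dtype=np.int32)
for idx, voxel_id in enumerate(ranked_voxel_ids):
    seq[voxel_id] = idx + 1  # 1-based rank, as in the proposed fix

k = 3
selected = sorted(np.flatnonzero(top_k_mask(seq, k)).tolist())
expected = sorted(ranked_voxel_ids[:k])
print(selected, expected)  # [2, 4, 7] [2, 4, 7]
```

With 1-based ranks, the value 0 is reserved for non-brain voxels, so the mask contains exactly the k best voxels, including the top one.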
Please let me know whether this makes sense, or if I misunderstood the script and this is not actually an issue. Thank you all very much!
@yidawang are you able to look at this?
The description makes sense to me. I can't remember all the details of how fslmaths works, but if it behaves as described above, I am fine with the proposed fix using a 1-based index. Please submit a PR to fix it. Thanks!