05-Optimization: Enhancements and fixes
The following items can be improved in this notebook:
- [ ] The early sections "Recap" and "Dataset" are almost identical, and therefore redundant
- [ ] Exercise 1: Presumably the expectation is to separate the train/test sets both for the classifier and for the voxel selection. It might be worth emphasizing that using all of the data for voxel selection is a common but subtle error; there are probably quite a few examples in the literature that got past less technical reviewers (a minimal sketch of keeping the selection inside the cross-validation loop is given after this list). In this example I consistently get slightly below-chance performance, which I believe is driven by the cross-validation; see "Classification based hypothesis testing in neuroscience: Below-chance level classification rates and overlooked statistical properties of linear parametric classifiers", HBM 2016. Another subtle example of bias is given by Watts et al. 😊 in "Potholes and Molehills: Bias in the Diagnostic Performance of Diffusion-Tensor Imaging in Concussion", Radiology 2014.
- [ ] In 3.1 Grid search: strictly, the dependence of the number of combinations on the granularity of the grid search is not exponential. With p hyperparameters sampled at g values each, the grid has g^p combinations, which is polynomial in the granularity g and exponential only in the number of parameters p (a quick check with ParameterGrid is sketched after this list).
- [x] 3.2 Regularization Example: L2 vs L1: L1 regularization now requires solver='saga' in the LogisticRegression call for the L1 penalty. This is probably a change in the default behavior of scikit-learn (a minimal example is sketched after this list).
- [x] 4. Build a Pipeline: as with 3.1, there seem to be a lot of parameter settings that give perfect accuracy. Maybe classifying by blocks is too easy, and since the number of blocks is relatively low, accuracy moves in big steps.
- [ ] c_steps = [10e-1, 10e0, 10e1, 10e2] is confusing notation for the exponents; these literals evaluate to 1.0, 10.0, 100.0 and 1000.0 (clearer alternatives are sketched after this list).
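
For the Exercise 1 item: a minimal sketch, assuming placeholder `bold` (samples × voxels) and `labels` arrays rather than the notebook's actual variables, of how voxel selection can be kept inside the cross-validation loop by wrapping it in a scikit-learn Pipeline so the selection is refit on each training fold only.

```python
# Sketch: voxel selection fit only on the training fold of each split.
# `bold` (n_samples x n_voxels) and `labels` are placeholders for the
# notebook's own data; a run-based splitter could replace StratifiedKFold.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

bold = np.random.randn(80, 1000)      # placeholder BOLD data
labels = np.repeat([0, 1], 40)        # placeholder condition labels

pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=100)),  # voxel selection
    ("clf", SVC(kernel="linear")),              # classifier
])

# Because selection sits inside the pipeline, each fold selects voxels
# from its own training data only, avoiding the double-dipping bias.
scores = cross_val_score(pipe, bold, labels, cv=StratifiedKFold(n_splits=5))
print(scores.mean())
```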
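
For the 3.1 Grid search item: a quick check of how the number of combinations scales, using scikit-learn's ParameterGrid; the parameter names and ranges here are illustrative, not the notebook's.

```python
# Sketch: combinations grow as the product of the grid sizes, i.e. g**p
# for p parameters with g values each -- polynomial in the granularity g,
# exponential only in the number of parameters p.
import numpy as np
from sklearn.model_selection import ParameterGrid

for g in (2, 4, 8):
    grid = {"C": np.logspace(-3, 3, g),
            "gamma": np.logspace(-3, 3, g)}   # p = 2 parameters
    print(g, len(ParameterGrid(grid)))        # prints g**2: 4, 16, 64
```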
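
For the 3.2 regularization item: a minimal illustration with placeholder data. In recent scikit-learn versions the default solver does not support the L1 penalty, so an L1-capable solver such as 'saga' has to be requested explicitly.

```python
# Sketch: L1-penalised logistic regression needs an L1-capable solver.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.randn(100, 20)                       # placeholder features
y = (X[:, 0] + 0.1 * np.random.randn(100)) > 0     # placeholder labels

# With the current default solver, penalty="l1" raises an error;
# "saga" (or "liblinear") supports the L1 penalty.
clf = LogisticRegression(penalty="l1", solver="saga", max_iter=5000)
clf.fit(X, y)
print(np.sum(clf.coef_ != 0), "non-zero coefficients")
```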
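
For the c_steps item: the 10eN literals evaluate to one power of ten higher than the written exponent suggests; spelling them as 1eN or generating them with np.logspace makes the intended values explicit.

```python
# Sketch: what the original literals evaluate to, and two clearer spellings.
import numpy as np

c_steps = [10e-1, 10e0, 10e1, 10e2]
print(c_steps)                                   # [1.0, 10.0, 100.0, 1000.0]

c_steps_explicit = [1e0, 1e1, 1e2, 1e3]          # same values, clearer exponents
c_steps_logspace = np.logspace(0, 3, num=4)      # array([1., 10., 100., 1000.])
print(c_steps_explicit, c_steps_logspace)
```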