
05-Optimization: Enhancements and fixes

Open · manojneuro opened this issue 4 years ago · 0 comments

The following items can be improved in this notebook:

  • [ ] The early sections "Recap" and "Dataset" are almost identical, so one of them is redundant

  • [ ] Exercise 1: Presumably the expectation is to separate the train/test sets both for the classifier and for the voxel selection. It might be worth emphasizing that using all of the data for voxel selection is a common but subtle error; there are probably quite a few examples in the literature that got past less technical reviewers. In this exercise I consistently get slightly below-chance performance, which I believe is driven by the cross-validation; see "Classification based hypothesis testing in neuroscience: Below-chance level classification rates and overlooked statistical properties of linear parametric classifiers" (HBM 2016). Another subtle example of bias is given by Watts et al. 😊 in "Potholes and Molehills: Bias in the Diagnostic Performance of Diffusion-Tensor Imaging in Concussion" (Radiology 2014). A sketch of keeping voxel selection inside the cross-validation folds is given after this list.

  • [ ] In 3.1 Grid search: strictly speaking, the dependence of the number of combinations on the granularity of the grid search is not exponential. The count is the product of the per-parameter grid sizes, so it grows linearly with the granularity of each parameter (and exponentially only in the number of parameters); see the example after this list.

  • [x] 3.2 Regularization Example: L2 vs L1: the L1 penalty now requires an explicit solver (e.g. solver='saga') in the LogisticRegression call. This is probably a change in the default solver behavior of scikit-learn; a sketch is given after this list.

  • [x] 4. Build a Pipeline: as with 3.1, there seem to be a lot of parameter settings that give perfect accuracy. Maybe classifying by blocks is too easy, and because the number of blocks is relatively low, accuracy changes in big steps.

  • [ ] c_steps = [10e-1, 10e0, 10e1, 10e2] is confusing notation for the exponents (these values are 1, 10, 100, and 1000); clearer alternatives are sketched after this list.
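
Sketch for the Exercise 1 point: a minimal example of keeping voxel selection inside the cross-validation folds, assuming ANOVA-based SelectKBest selection and a linear SVM. The data shapes and variable names here are hypothetical stand-ins, not the notebook's own.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical stand-in for the fMRI data: X is (n_samples, n_voxels), y is block labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 1000))
y = np.repeat([0, 1], 30)

# Because voxel selection is a step inside the Pipeline, SelectKBest is refit on the
# training portion of each fold only, so no test-fold information leaks into selection.
pipe = Pipeline([
    ('select', SelectKBest(f_classif, k=100)),
    ('clf', SVC(kernel='linear')),
])
scores = cross_val_score(pipe, X, y, cv=StratifiedKFold(n_splits=5))
print(scores.mean())
```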
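
For the 3.1 point, a quick way to see that the combination count is the product of the per-parameter grid sizes (the grid below is a made-up example, not the notebook's):

```python
from sklearn.model_selection import ParameterGrid

# 4 values of C x 3 values of gamma -> 12 combinations; doubling the granularity of
# one parameter doubles the count, it does not square it.
grid = {'C': [0.1, 1, 10, 100], 'gamma': [1e-3, 1e-2, 1e-1]}
print(len(ParameterGrid(grid)))  # 12
```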
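
For the 3.2 point, a hedged sketch of requesting the L1 penalty with a compatible solver; 'saga' is what the checklist item mentions, and 'liblinear' is another solver that supports L1:

```python
from sklearn.linear_model import LogisticRegression

# The current default solver ('lbfgs') only supports the L2 penalty, so L1 needs an
# explicit solver choice.
clf_l1 = LogisticRegression(penalty='l1', solver='saga', max_iter=5000)
# 'liblinear' also accepts penalty='l1' and can be faster on small problems.
clf_l1_alt = LogisticRegression(penalty='l1', solver='liblinear')
```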
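
For the c_steps point, the same four values written less ambiguously (10e-1 is 1.0, 10e0 is 10.0, and so on):

```python
import numpy as np

# Explicit values:
c_steps = [1.0, 10.0, 100.0, 1000.0]
# or generated on a log scale:
c_steps = np.logspace(0, 3, num=4)  # array([1., 10., 100., 1000.])
```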

manojneuro · Jun 16 '20 03:06