neuroCombat
Cross-validated ComBat
Thank you so much for this Python implementation! :)
I want to try out ComBat for my machine learning classification study, in which I try to separate patients from controls using cortical features of >4000 subjects coming from 46 unique sites around the world. My model optimization, training, and testing are performed in separate (inner and outer) cross-validation loops. The problem is that ComBat seems to be a "one-shot" approach, in the sense that it is run only once on the entire data set, instead of estimating the ComBat model's parameters on the training data only and then applying those estimated parameters to both the training and the test data.
I tried to implement a cross-validated ComBat approach myself, but haven't figured out how yet. Do you have any recommendations for this? Or can you recommend any other methods for removing site-specific effects from multisite data sets in a cross-validated manner?
Yes, I see what you're trying to do, but that isn't supported right now, sorry. The code could be modified to return the relevant transformation parameters after each inner CV loop, and a second function could then apply those parameters to the held-out data.
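To make the idea concrete, here is a minimal sketch of the fit-on-train / apply-on-test pattern described above. This is not the neuroCombat API; it implements only a simplified per-site location-and-scale adjustment (no covariate preservation, no empirical Bayes shrinkage), and the function names `fit_site_params` / `apply_site_params` are hypothetical:

```python
import numpy as np

def fit_site_params(X, sites):
    """Estimate harmonization parameters on TRAINING data only.

    X     : (n_subjects, n_features) feature matrix
    sites : (n_subjects,) array of site labels

    Returns the grand mean, pooled std, and per-site (mean, std).
    NOTE: simplified location/scale adjustment, not full ComBat
    (no covariates, no empirical Bayes shrinkage).
    """
    grand_mean = X.mean(axis=0)
    pooled_std = X.std(axis=0, ddof=1)
    site_params = {}
    for s in np.unique(sites):
        Xs = X[sites == s]
        site_params[s] = (Xs.mean(axis=0), Xs.std(axis=0, ddof=1))
    return grand_mean, pooled_std, site_params

def apply_site_params(X, sites, grand_mean, pooled_std, site_params):
    """Apply training-derived parameters to new (held-out) data.

    Each site's features are standardized with that site's TRAINING
    mean/std, then rescaled to the training grand mean / pooled std.
    """
    X_adj = X.astype(float).copy()
    for s in np.unique(sites):
        mask = sites == s
        mu, sd = site_params[s]
        X_adj[mask] = (X[mask] - mu) / sd * pooled_std + grand_mean
    return X_adj
```

Inside each CV loop you would call `fit_site_params` on the training fold and `apply_site_params` on both folds, so no information from the test subjects leaks into the harmonization step. Note this assumes every site present in the test fold was also seen during training.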
@WillemB2104 I've got some hacky rpy2 code that applies the parameters learned on the training data to held-out test data.
Be careful with it, though: in my experience, ComBat can actually induce site effects in the test data, because the site effects estimated from the training data don't exactly match those in the test data.
Hi @WillemB2104 ,
Could you please share with us what you did to apply Combat in a cross-validation?
Cheers, Walter