bio_corex How to apply CorEx on diffusion MRI data

Hello, I would like to better understand how to use CorEx to model my data. I have 14 measures derived from diffusion MRI, each stored as a 69 (subjects) by 2286 (voxels) matrix. I want to inspect correlations among these measures, which may be related to each other. I have read the CorEx papers and looked at the Python code. My specific questions are:
• How should the X matrix be built in my case?
• How should I choose the number of hidden factors?
• How should I choose the dimension of each hidden factor?
• I guess marginal_description must be 'gaussian', since my data is continuous?
• Should I set smooth_marginals = True (turns on Bayesian smoothing)?

Thank you in advance, Rosella

Feb 27 '24 10:02 rosella1234

• X matrix: 69 rows by 2286 columns for the data matrix.
• Number of hidden factors: start with 20-50. If you get lots of good clusters, you can try more.
• Dimension of each hidden factor: 2 or 3 is usually sufficient. Since you have only 69 subjects, 2 may be better.
• marginal_description: yes, 'gaussian' should be best.
• smooth_marginals: this will be good for a small number of samples.

Feb 27 '24 23:02 gregversteeg
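
The settings suggested in this reply could be wired up roughly as follows. This is a minimal sketch, not a confirmed recipe: the random `X` is placeholder data standing in for one real measure, the helper `fit_corex` is a hypothetical name, and `n_hidden=20` and `dim_hidden=2` are simply the low end of the ranges mentioned above. The `ce.Corex` keyword arguments are the same ones used in the code posted later in this thread.

```python
import numpy as np

# Settings taken from the reply above; the exact values are starting points.
N_SUBJECTS, N_VOXELS = 69, 2286
params = dict(
    n_hidden=20,                      # "start with 20-50"
    dim_hidden=2,                     # "2 or 3 is usually sufficient"
    marginal_description='gaussian',  # continuous data
    smooth_marginals=True,            # small number of samples
)

# Placeholder data standing in for one real measure (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(N_SUBJECTS, N_VOXELS))  # rows = subjects, columns = voxels


def fit_corex(X, params):
    """Fit bio_corex on X, where rows are samples and columns are variables."""
    import corex as ce  # bio_corex module; imported here so the rest runs without it
    model = ce.Corex(**params)
    model.fit(X)
    return model


# Uncomment once bio_corex is importable:
# model = fit_corex(X, params)
# print(model.tcs)       # total correlation per hidden factor
# print(model.clusters)  # hidden factor assigned to each column
```

If the resulting clusters look good, `n_hidden` can be raised as suggested above.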

Hi, thank you very much for your quick response! Everything is clear, I just have one question: I would like to inspect correlations among all 14 measures together, not within a single measure. How should I concatenate my 14 [69 by 2286] matrices to create the X to feed into CorEx? Thanks in advance, Rosella

Feb 28 '24 11:02 rosella1234
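
The thread does not record an answer to this question. One possible layout, following the "rows are subjects" convention from the earlier reply, is to place the 14 matrices side by side so that each subject stays one row. The sketch below is an assumption, not the maintainer's recommendation; the random `measures` list is placeholder data and `column_origin` is a hypothetical helper for mapping a column of X back to its measure and voxel.

```python
import numpy as np

N_SUBJECTS, N_VOXELS, N_MEASURES = 69, 2286, 14

# Placeholder for the 14 real matrices, each of shape (69, 2286).
rng = np.random.default_rng(0)
measures = [rng.normal(size=(N_SUBJECTS, N_VOXELS)) for _ in range(N_MEASURES)]

# Side-by-side concatenation: one row per subject, one column per
# (measure, voxel) pair, giving a 69 x (14 * 2286) matrix.
X = np.concatenate(measures, axis=1)


def column_origin(col, n_voxels=N_VOXELS):
    """Map a column index of X back to (measure index, voxel index)."""
    return divmod(col, n_voxels)


print(X.shape)
print(column_origin(2286))  # first voxel of the second measure
```

With this layout, the cluster assigned to each column can be traced back to a specific measure and voxel through `column_origin`.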

Good afternoon, I have tried running CorEx on one subject at a time, using a 14 by 2286 matrix per subject as input, that is, all 14 diffusion measures stacked together, each flattened along its 2286 voxels. I was advised to use number of hidden factors = 10 and dimension of each hidden factor = 1 to get one joint representation per subject. However, when running CorEx this way (see code below), all total correlations and all clusters come out equal to 0. Am I doing something wrong in my implementation? Thanks a lot in advance,

Rosella

import numpy as np
import corex as ce  # bio_corex module

# data_matrix (num_subjects x 14 x 2286) and num_subjects are defined earlier
# Set bioCorEx parameters
num_hidden_factors = 10
dim_hidden_factor = 1
marginal_description = 'gaussian'
smooth_marginals = True
# Initialize an empty NumPy array to store the total correlations per subject
output_matrix = np.empty((num_subjects, num_hidden_factors))
for i in range(num_subjects):  # start at 0 so the first subject is not skipped
    # Initialize bioCorEx object
    corex = ce.Corex(n_hidden=num_hidden_factors, dim_hidden=dim_hidden_factor,
                     marginal_description=marginal_description,
                     smooth_marginals=smooth_marginals)
    # Fit bioCorEx model on this subject's 14 x 2286 matrix
    corex.fit(data_matrix[i, :, :])
    print(corex.tcs)
    print(corex.clusters)
    output_matrix[i, :] = corex.tcs
np.save('output_matrix.npy', output_matrix)

May 17 '24 15:05 rosella1234
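
One thing that may be worth checking here is the orientation of the input. According to the first reply, rows of the data matrix are subjects, so a per-subject slice of shape (14, 2286) gives CorEx only 14 rows to treat as samples. The sketch below contrasts that with reshaping the whole array into one row per subject. It is a hedged illustration only: the random `data_matrix` is placeholder data with the shape described in the post, and whether this explains the all-zero output is not confirmed in the thread.

```python
import numpy as np

N_SUBJECTS, N_MEASURES, N_VOXELS = 69, 14, 2286

# Placeholder with the shape described in the post (hypothetical values).
rng = np.random.default_rng(0)
data_matrix = rng.normal(size=(N_SUBJECTS, N_MEASURES, N_VOXELS))

# Per-subject slice: 14 rows (measures) by 2286 columns (voxels).
per_subject = data_matrix[0, :, :]
print(per_subject.shape)

# Alternative: a single matrix with one row per subject, where each row
# holds the subject's 14 measures flattened measure by measure.
X = data_matrix.reshape(N_SUBJECTS, -1)
print(X.shape)
```

Fitting once on `X` would yield one set of hidden factors shared across subjects, rather than a separate model per subject.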