Imaging-transcriptomics
Imaging-transcriptomics copied to clipboard
Question about the MATLAB style correlation function in genes.py
https://github.com/alegiac95/Imaging-transcriptomics/blob/46adb1df85c37123c77226d023b6d763edae8aca/imaging_transcriptomics/genes.py#L116C1-L125C36
Hi, I am wondering if there is some thing wrong with the correlation here.
def correlate(c1, c2):
"""Return the MATLAB style correlation between two vectors."""
return np.corrcoef(np.hstack((c1, c2)), rowvar=False)[0, 1:]
_res = pls_regression(gene_exp, imaging_data.reshape(
imaging_data.shape[0], 1),
n_components=self.n_components,
n_boot=0, n_perm=0)
r1 = correlate(_res.get("x_scores"), scan_data.reshape(
scan_data.shape[0], 1))
In this code: "r1 = correlate(_res.get("x_scores"), scan_data.reshape(scan_data.shape[0], 1))
". The _res.get("x_scores")
is c1 which is the components from the PLS results, and the scan_data.reshape(scan_data.shape[0], 1))
is c2 which is a imaging data column. And then combined the component columns and the single imaging data column into a combined matrix using np.hstack
.
So the last column in this combined matrix here is imaging data and the rest columns are components. And then using np.corrcoef
with the rowvar=False
we can get a correlation matrix with the element [m, n] represents the correlation coefficient between the column m and column n.
As far as I understand, the r1
represents to the correlation coefficients between the imaging data and different PLS components. However, here in the combined matrix, the imaging data is the last column. And when the code slicing the correlation matrix with the [0, :]
, it returns the correlation coefficients between the first component (the first column from c1) and the rest of the components as well as the imaging data.
For example, if there are 2 components, the r1
here is [coeffience between component 1 and 2, coeffience between component 1 and imaging data ].
Is that correct?