GLUE
Why does `integration_consistency` require raw counts?
See https://github.com/gao-lab/GLUE/blob/e20518b0ae7b39d47244087b825511e5c84f9b7e/scglue/data.py#L609 for the line where `integration_consistency` imposes this requirement. Why is this the case? I have floating-point data that I am trying to integrate.
Well, this situation is a bit awkward. `integration_consistency` involves computing feature-feature correlations. For the most common case of RNA and ATAC data, we need to preprocess the count data in a particular way (total count normalization followed by log transformation) to ensure that the correlations make sense.
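To make the preprocessing concrete, here is a minimal NumPy sketch of total count normalization, log transformation, and a feature-feature Pearson correlation. It is not the scglue implementation: the function names, the `target_sum` scaling factor, and the simulated Poisson counts are all assumptions chosen for illustration.

```python
import numpy as np


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell (row) to `target_sum` total counts, then apply log1p.

    `target_sum` is an illustrative choice, not a value taken from scglue.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave all-zero cells as zeros
    return np.log1p(counts / totals * target_sum)


def feature_correlation(x, y):
    """Pearson correlation between every column of x and every column of y."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    x_norm = np.sqrt((xc ** 2).sum(axis=0))
    y_norm = np.sqrt((yc ** 2).sum(axis=0))
    return (xc.T @ yc) / np.outer(x_norm, y_norm)


# Simulated raw counts for the same 50 cells (rows) in two modalities.
rng = np.random.default_rng(0)
rna = rng.poisson(5.0, size=(50, 4))
atac = rng.poisson(1.0, size=(50, 6))

# Correlations are computed on normalized, log-transformed values.
corr = feature_correlation(normalize_log1p(rna), normalize_log1p(atac))
print(corr.shape)
```

Without the normalization step, differences in sequencing depth between cells would dominate the correlations, which is why floating-point inputs of unknown provenance are harder to handle generically.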
For other data types it is not as clear what preprocessing is necessary. I guess the preprocessing part should ideally be modularized so users can also insert their own preprocessing function. I'll try to add that in the next release!
Sorry for the late response!
Starting from v0.3.0, `scglue.models.integration_consistency` now works with data other than raw counts. Nevertheless, you may need to specify appropriate "metacell" aggregation and preprocessing functions via the arguments `agg_fns` and `prep_fns` (passed along to `scglue.data.metacell_corr`).
By default, the function retains the previous behavior of raw-count normalization if the datasets were configured with "NB" or "ZINB" models.
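For floating-point data, the kind of aggregation and preprocessing one might plug in could look like the sketch below. It uses plain NumPy and does not call scglue: the exact types that `agg_fns` and `prep_fns` expect are not stated in this thread, so check the scglue documentation before wiring anything in. The helper names, the mean aggregation, and the z-scoring step are assumptions for illustration only.

```python
import numpy as np


def aggregate_metacells(values, labels, agg="mean"):
    """Collapse cells (rows) into metacells by averaging or summing per label.

    Hypothetical helper: mean is a natural choice for already-normalized
    floating-point data, while sum is the usual choice for raw counts.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    reducer = np.mean if agg == "mean" else np.sum
    return np.stack([reducer(values[labels == g], axis=0) for g in np.unique(labels)])


def zscore_features(values):
    """Center and scale each feature (column); constant features become zeros."""
    values = np.asarray(values, dtype=float)
    sd = values.std(axis=0)
    sd[sd == 0] = 1.0
    return (values - values.mean(axis=0)) / sd


# 12 cells with 3 floating-point features, grouped into 4 metacells of 3 cells.
rng = np.random.default_rng(0)
values = rng.normal(size=(12, 3))
metacell_of_cell = np.repeat(np.arange(4), 3)

meta = zscore_features(aggregate_metacells(values, metacell_of_cell, agg="mean"))
print(meta.shape)
```

Whether any preprocessing is needed at all depends on how the floating-point values were produced; data that is already normalized may only need the aggregation step.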