mnnpy
input data type
What kind of data can be used as mnnpy's input: log(counts), log(CPM), or something else?
Log counts; regressed data is better. The output should only be used for PCA, clustering, and t-SNE. Don't use it for differential expression (DE).
Does log(counts) mean ln(counts), log2(counts), or log10(counts) by default? And does "regressed is better" mean that I should regress out variables such as mitochondrial gene percentage, total UMI count, sex, age, and so on?
Excuse me, can I use log(CPM) instead? It is consistent with scanpy's standard pipeline [sc.pp.normalize_per_cell(adata, counts_per_cell_after=1e4) followed by sc.pp.log1p(adata)], which is essentially equivalent to log(CPM).
No problem, as long as the data is in log space
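For readers following along, here is a minimal sketch of the arithmetic behind the scanpy steps mentioned above, written in plain Python so it runs without scanpy installed. The function name `log_normalize`, the `target_sum` parameter, and the toy matrix are assumptions made for illustration; they are not part of mnnpy or scanpy. By default, `sc.pp.log1p` applies the natural logarithm to 1 + x, and scaling to 1e4 yields counts per 10,000 rather than per million, so the values differ from CPM by a constant factor of 100 before the log is taken.

```python
import math


def log_normalize(counts, target_sum=1e4):
    """Scale each cell (row) to `target_sum` total counts, then apply ln(1 + x).

    Hypothetical helper that mirrors the normalize-then-log1p idea;
    it is not the scanpy implementation.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        # Leave all-zero cells at zero to avoid dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(value * scale) for value in cell])
    return normalized


# Toy matrix: 2 cells x 3 genes of raw counts (made-up values).
raw_counts = [[10, 0, 90], [5, 5, 0]]
log_data = log_normalize(raw_counts)
```

Under these assumptions, `log_data` is in log space, which is the only requirement stated in the reply above for data passed to mnnpy.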