group.marker selection criteria

Open davisidarta opened this issue 5 years ago • 7 comments

Hello Xuran! Thanks for your great package!

I've been able to run MuSiC cell type estimation analysis on my data of interest (brain). However, several cell types are transcriptionally very closely related to one another, yet functionally quite distinct. Because of this, I want to run music_prop.clusters on my data in order to obtain more reliable results.

However, as noted in #15, how to select differentially expressed genes among these groups using the output of music_basis is explained neither in the vignettes nor in the paper itself. So how does one properly build a group.marker list from the music_basis output?

From your experience as the package creator, what cutoff should be used to select genes from the design matrix, as an example?

davisidarta avatar May 10 '19 22:05 davisidarta

Hi davisidarta,

Thanks for using MuSiC, and sorry for not replying sooner. From the music_basis output, my first choice is to build a hierarchical tree from the design matrix. In that tree, some cell types stem from the same branch. The cut-off is determined by the user. Take the kidney data as an example: https://xuranw.github.io/MuSiC/articles/MuSiC.html#estimation-of-cell-type-proportions-with-pre-grouping-of-cell-types There are multiple choices for where to cut the tree. But from observation and prior knowledge, Fib, Macro, NK, B lymph, and T lymph are immune or immune-like cell types, so we put them in one cluster. Similarly, we know that Endo, CD-PC, CD-IC, LOH, DCT, and PT are all kidney-related cells, so they are put in another cluster.

Hope this helps!

Best, Xuran

xuranw avatar Jul 09 '19 21:07 xuranw
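
A minimal sketch of the tree-building step described in the reply above, using only base R. The matrix here is a made-up stand-in for the design matrix from music_basis (assumed to have genes in rows and cell types in columns), and the choice of k = 2 is arbitrary. As the reply says, where to cut is a user decision informed by prior knowledge.

```r
# Toy stand-in for the music_basis design matrix: genes in rows, cell types in columns.
# All values are simulated for illustration only.
set.seed(1)
cell.types <- c("Fib", "Macro", "NK", "Endo", "PT", "DCT")
design.mtx <- matrix(rexp(60 * length(cell.types)), nrow = 60,
                     dimnames = list(paste0("gene", 1:60), cell.types))

# Cluster cell types on the log scale; the small offset avoids log(0).
d  <- dist(t(log(design.mtx + 1e-6)), method = "euclidean")
hc <- hclust(d, method = "complete")
plot(hc, cex = 0.6, hang = -1, main = "Cluster log(Design Matrix)")

# Cut the tree into k groups (k = 2 is an arbitrary choice here) and
# collect the cell types that fall into each group.
groups        <- cutree(hc, k = 2)
clusters.type <- split(names(groups), paste0("C", groups))
print(clusters.type)
```

With real data, the grouping suggested by the tree would normally be checked against known biology before it is used for pre-grouped estimation.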

Hi Xuran, I have a similar question to the above. For my data, I was able to generate a hierarchical tree using the code you provided, and it somewhat matched the groups I was hoping they would form. However, my question is how to select, from music output, the marker genes that are best for giving the differentially expressed genes within the group? I already have been able to create my groups of cell types, but now I am curious if there is an agreed-upon method for selecting the differentially expressed genes that would be used in the second round of MuSiC to determine within-group proportions? Thanks, T.J.

tjbutler003 avatar Sep 05 '19 19:09 tjbutler003

Were either of you able to figure out how to select the differentially expressed genes, and if so, could you please share the code? It seems multiple users have been having this issue, so if @xuranw could please respond, that would be great. Thank you

satvirsaggi avatar Sep 17 '20 06:09 satvirsaggi

Were either of you able to figure out how to generate the group.marker list from the music_basis output? @xuranw could you please provide some guidance on this? Thanks

sanikapatki avatar Oct 21 '20 18:10 sanikapatki

@satvirsaggi Were you able to figure this out? Thanks.

sanikapatki avatar Oct 21 '20 18:10 sanikapatki

Hi, did you solve the problem?

kangxige avatar Feb 21 '21 14:02 kangxige

I did this and it gave reasonable results:

# Pull metadata and counts out of the ExpressionSet
metadata = Biobase::pData(Mousesub.eset)
counts = Biobase::exprs(Mousesub.eset)
library(Seurat)
library(patchwork)
mouse_seurat = Seurat::CreateSeuratObject(counts, meta.data = metadata)
# Mouse mitochondrial genes are usually named "mt-..."; adjust the pattern to your gene names
mouse_seurat[["percent.mt"]] = PercentageFeatureSet(mouse_seurat, pattern = "^MT-")

# Basic QC filter, then normalize
mouse_seurat = subset(mouse_seurat, subset = nFeature_RNA > 200 & percent.mt < 5)
mouse_seurat = Seurat::NormalizeData(mouse_seurat)
# Compare the two multi-cell-type clusters (C3 vs C4)
Idents(mouse_seurat) = "clusterType"
diffExpr = Seurat::FindMarkers(mouse_seurat, ident.1 = "C3", ident.2 = "C4", min.pct = 0.25, test.use = "wilcox", verbose = FALSE)

# Positive log fold change = higher in C3; negative = higher in C4
Epith.marker = diffExpr[diffExpr$p_val_adj < 0.01 & diffExpr$avg_log2FC > 0, ]
Immune.marker = diffExpr[diffExpr$p_val_adj < 0.01 & diffExpr$avg_log2FC < 0, ]

I hope it helps.

mperalc avatar Sep 24 '21 18:09 mperalc
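
A possible follow-up to the snippet above, offered as a sketch rather than a confirmed recipe. FindMarkers returns a data frame with gene names as row names, so the filtered rows still need to be reduced to gene-name vectors. The list shape used here (one entry per cluster, NULL for clusters that need no within-cluster markers) is an assumption based on the pre-grouping example in the MuSiC vignette. The diffExpr data frame and gene names below are made up so that the example runs on its own.

```r
# Toy stand-in for Seurat::FindMarkers output (rows named by gene); values are made up.
diffExpr <- data.frame(
  p_val_adj  = c(0.001, 0.2, 0.0005, 0.003),
  avg_log2FC = c(1.5, 0.8, -2.1, -0.4),
  row.names  = c("geneA", "geneB", "geneC", "geneD")
)

# Keep the gene names, not the data-frame rows.
Epith.marker  <- rownames(diffExpr[diffExpr$p_val_adj < 0.01 & diffExpr$avg_log2FC > 0, ])
Immune.marker <- rownames(diffExpr[diffExpr$p_val_adj < 0.01 & diffExpr$avg_log2FC < 0, ])

# One entry per cluster; NULL where no within-cluster markers are needed
# (assumed layout, with C1 and C2 as hypothetical single-cell-type clusters).
group.markers <- list(C1 = NULL, C2 = NULL, C3 = Epith.marker, C4 = Immune.marker)
str(group.markers)
```

The adjusted p-value cutoff of 0.01 is carried over from the snippet above. It is a user choice, not a value recommended by the package author in this thread.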