CONICS

getGenePositions returns the wrong location information

Open sunshine1126 opened this issue 3 years ago • 3 comments

Hello, the location information appears to be wrong when I use the getGenePositions function to obtain the chromosomal positions of genes in the expression matrix. See the example below.

library(beanplot)
library(mixtools)
library(pheatmap)
library(zoo)
library(squash)
library(biomaRt)
library(CONICSmat)
tt = getGenePositions(gene_names=c("ABCF1","ABHD16A","AGER","AGPAT1","AIF1","APOM"))
tt

image image

sunshine1126 Aug 24 '21 08:08

Hey,

Thanks for reaching out. These are genes for which alternative positions have been reported (see attached). For those (~3% of Ensembl genes), CONICSmat sets the chromosome to 0 to avoid using potentially incorrect genomic loci for CNV inference.

Screen Shot 2021-08-24 at 9 54 17 AM

soerenmueller Aug 24 '21 16:08

@soerenmueller Thanks for your reply. In humans, each cell normally contains 23 pairs of chromosomes, for a total of 46. Twenty-two of these pairs, called autosomes, look the same in both males and females. The 23rd pair, the sex chromosomes, differs between males and females. So what do the chromosome values "23" and "24" in the output mean?

image

sunshine1126 Aug 25 '21 02:08

@sunshine1126 Hi, you can find the meaning of chr 0, chr 23, and chr 24 in the source code, in GetPositions.R:

gene_positions[which(gene_positions[,3]=="X"),3]=23
gene_positions[which(gene_positions[,3]=="Y"),3]=24
gene_positions[which(gene_positions[,3]=="MT"),3]=0
gene_positions[which(nchar(gene_positions[,3])>2),3]=0

The third column of the gene_positions data frame is the chromosome name. The code replaces chromosome X with 23, chromosome Y with 24, and MT with 0. Any chromosome name longer than two characters is also replaced with 0, as soerenmueller mentioned above:
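To make the recoding concrete, here is a minimal sketch that applies the same four assignments to a toy data frame. The data frame, its column names, the gene labels, and the long chromosome label are made up for illustration and are not taken from CONICSmat or from the output in this issue. Only the four replacement lines mirror the snippet quoted above.

```r
# Toy stand-in for the gene_positions data frame (hypothetical values).
# Column 3 holds the chromosome name as a character string.
gene_positions <- data.frame(
  gene_id = c("g1", "g2", "g3", "g4", "g5"),
  symbol = c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"),
  chromosome_name = c("6", "X", "Y", "MT", "CHR_HSCHR6_MHC_COX_CTG1"),
  stringsAsFactors = FALSE
)

# Same recoding as in the quoted snippet:
gene_positions[which(gene_positions[, 3] == "X"), 3] <- 23   # X  -> 23
gene_positions[which(gene_positions[, 3] == "Y"), 3] <- 24   # Y  -> 24
gene_positions[which(gene_positions[, 3] == "MT"), 3] <- 0   # MT -> 0
# Any name longer than two characters (here the made-up long label) -> 0
gene_positions[which(nchar(gene_positions[, 3]) > 2), 3] <- 0

print(gene_positions)
```

In this toy example the gene on chromosome 6 keeps its value, the X and Y genes become 23 and 24, and both the MT gene and the gene with the long chromosome label end up on chromosome 0. Because the column starts out as character, the assigned numbers are stored as strings here.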

"These are genes for which alternative positions have been reported (see attached). For those (~3% of Ensembl genes), CONICSmat will set the chromosome to 0 in order to avoid using potentially incorrect genomic loci for CNV inference."

sciencepeak Feb 24 '23 20:02