How to annotate a bedMethyl file with CpG names instead of chromosome name and position, and how can I find the p-value for the association of a CpG with the phenotype?

Open ralanany opened this issue 1 year ago • 4 comments

ralanany avatar Dec 15 '24 10:12 ralanany

Hello @ralanany,

Are you looking to join the chrom name, start, stop coordinates with a table of CpG "names"? Modkit doesn't have that exact functionality, but most dataframe and spreadsheet programs should be able to handle it. We're actively working on "phenotype" scores, but I don't have a solution for you right now.
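As a rough illustration of that kind of join, here is a minimal sketch in plain Python. It assumes you already have a lookup table of (chrom, start, name) records whose coordinates use the same 0-based start convention and the same reference build as the bedMethyl file. The function name `annotate_bedmethyl`, the lookup table, and the truncated toy rows are all hypothetical and not part of modkit.

```python
def annotate_bedmethyl(bedmethyl_lines, name_rows):
    """Append a CpG name to each bedMethyl row by matching (chrom, start).

    name_rows is a hypothetical lookup table of (chrom, start, name) tuples;
    rows without a match get "." as the name.
    """
    lookup = {(chrom, int(start)): name for chrom, start, name in name_rows}
    annotated = []
    for line in bedmethyl_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        key = (fields[0], int(fields[1]))  # chrom, 0-based start
        annotated.append(fields + [lookup.get(key, ".")])
    return annotated


# Toy rows, truncated to the first six bedMethyl columns for illustration.
bedmethyl = [
    "chr1\t15864\t15865\tm\t12\t+\n",
    "chr1\t20000\t20001\tm\t8\t+\n",
]
names = [("chr1", 15864, "cg13869341")]
annotated = annotate_bedmethyl(bedmethyl, names)
```

A dictionary keyed on (chrom, start) keeps the join to a single pass over the bedMethyl file; a dataframe merge on the same two columns would do the same thing.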

ArtRand avatar Dec 19 '24 22:12 ArtRand

Hi @ArtRand, thanks for your reply. Actually, I don't have a table with the CpG names. What I am looking for is how to find the CpG name (cg000...) for each position. I have the bedMethyl file with the chromosome, position, and modification, but the CpG name is not included. I would like help finding the CpG names, as below:

cg13869341 chr1 [15865, 15865] *
cg14008030 chr1 [18827, 18827] *
cg12045430 chr1 [29407, 29407] *

thanks in advance

ralanany avatar Dec 26 '24 11:12 ralanany

Hello @ralanany,

From a quick search on your CpG names, it looks like these identifiers come from an Illumina probe set. If this is the case, I would download the probe tables and transform them into a BED6 file (chrom, start, stop, name, score, strand; you can set score=0 for all of the records). Then use this file to perform a bedtools intersect -a $bedmethyl -b $probes_bed -wb. Your question makes me think it would be convenient if, when using --include-bed in modkit pileup, the name (when present) were carried along in the bedMethyl output. I'll consider adding this enhancement.
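The BED6 conversion could look roughly like the sketch below. It assumes the probe table has already been parsed into (name, chrom, position, strand) records, that the position is 1-based (so start = position - 1), and that any strand other than "+" or "-" should be written as ".". The function name `probes_to_bed6` and the record layout are hypothetical; the three records are the ones quoted earlier in this thread.

```python
def probes_to_bed6(probes):
    """Turn (name, chrom, pos, strand) probe records into sorted BED6 lines."""
    records = []
    for name, chrom, pos, strand in probes:
        pos = int(pos)
        start, end = pos - 1, pos  # assumed 1-based position -> 0-based half-open
        bed_strand = strand if strand in ("+", "-") else "."
        records.append((chrom, start, end, name, 0, bed_strand))  # score=0
    records.sort(key=lambda r: (r[0], r[1]))  # sort by chrom, then start
    return ["\t".join(str(v) for v in r) for r in records]


probes = [
    ("cg14008030", "chr1", 18827, "*"),
    ("cg13869341", "chr1", 15865, "*"),
    ("cg12045430", "chr1", 29407, "*"),
]
bed_lines = probes_to_bed6(probes)
```

Writing `bed_lines` to a file gives the $probes_bed input for the bedtools intersect command above. The probe coordinates need to be on the same reference build as the alignments used to produce the bedMethyl file, otherwise the intersect will return few or no matches.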

ArtRand avatar Dec 30 '24 18:12 ArtRand