
Methylpy in plant

Open zzh4399 opened this issue 2 years ago • 7 comments

Dear yupenghe, I would like to know whether methylpy can be used for methylation analysis of plant genomes, because the results I get with methylpy are quite different from those of Bismark. Thanks.

zzh4399 avatar Mar 21 '22 12:03 zzh4399

Yes, methylpy works on plant genomes. Would you mind describing the difference you are referring to?

yupenghe avatar Mar 22 '22 01:03 yupenghe

I am very glad to receive your reply. Methylpy reports a lower methylation rate than Bismark for all three cytosine contexts (about half as much). I tried changing the alignment parameters, but it did not seem to help.

zzh4399 avatar Mar 22 '22 03:03 zzh4399

That is interesting. It would be helpful to have some examples (e.g. methylated and unmethylated counts for a few Cs from methylpy and Bismark). Also, is the library a typical directional bisulfite sequencing library? Is it PBAT?

yupenghe avatar Mar 22 '22 03:03 yupenghe

It is really interesting. We found that the methylation rates of the CpG, CHG and CHH contexts calculated with Bismark are 40%, 20% and 2% respectively, while those calculated with methylpy are 20%, 10% and 0.5% respectively. We picked a single site at random and found that both the methylation rate and the number of reads covering the site differed between the two tools.

Methylpy seems to be stricter when determining the methylation of individual sites, which may be the reason for its lower methylation rate. For example, for the same site, methylpy reported a methylation rate of 0.8 and Bismark reported a methylation rate of 1.
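To make the two kinds of rate mentioned above concrete, here is a minimal sketch of how a per-site rate and a pooled (genome-wide) rate follow from read counts. The counts are hypothetical and chosen only to reproduce rates of 0.8 and 1; they are not taken from the actual output, and the function names are illustrative rather than part of methylpy or Bismark.

```python
from typing import Iterable, Optional, Tuple


def site_level(methylated: int, covered: int) -> Optional[float]:
    """Per-site rate: methylated reads divided by all reads covering the site."""
    if covered == 0:
        return None  # no coverage, so the rate is undefined
    return methylated / covered


def pooled_level(sites: Iterable[Tuple[int, int]]) -> Optional[float]:
    """Pooled rate over many sites: sum of methylated reads / sum of covering reads."""
    total_mc = 0
    total_cov = 0
    for mc, cov in sites:
        total_mc += mc
        total_cov += cov
    return site_level(total_mc, total_cov)


# Hypothetical counts for one site as seen by two tools.
tool_a = (4, 5)  # 4 methylated reads out of 5
tool_b = (5, 5)  # 5 methylated reads out of 5
print(site_level(*tool_a), site_level(*tool_b))
```

A different rate at the same site therefore means that the methylated count, the coverage, or both differ, which is why the raw counts are more informative than the rates alone.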

zzh4399 avatar Mar 22 '22 04:03 zzh4399

That is interesting. What parameters did you use to run Bismark and methylpy? I did some comparisons a while back, and the results from the two methods were very close. I am wondering whether any specific setting was used.

yupenghe avatar Mar 23 '22 03:03 yupenghe

Methylpy has a significant advantage in running speed and is easy to understand. In addition to comparing our own sequencing files, we also used methylpy to analyze previously published data (all from plants: Oryza sativa). We found that when methylpy was used on plant genomes, the methylation rate was markedly lower, about half. These are the commands we used for the two tools:


methylpy paired-end-pipeline --read1-files M1-D_FDLM220001805-1a_1.clean.fq.gz --read2-files M1-D_FDLM220001805-1a_2.clean.fq.gz --forward-ref ~/bq/methy/db/rice_f --reverse-ref ~/bq/methy/db/rice_r --ref-fasta ~/bq/methy/db/IRGSP-1.0_genome.fasta --path-to-output zzh --num-procs 40 --sample M1_D


bismark --bowtie2 -N 0 -L 20 --quiet --un --ambiguous --sam -o output ~/bq/methy/db/ -1 M1-D_FDLM220001805-1a_1.clean.fq.gz -2 M1-D_FDLM220001805-1a_2.clean.fq.gz # Sequence alignment

deduplicate_bismark M1-D_FDLM220001805-1a_1.clean_bismark_bt2_pe.bam # Removing duplicate reads

bismark_methylation_extractor --no_overlap --paired-end --bedGraph --comprehensive --counts --remove_spaces --cytosine_report --genome_folder ~/bq/methy/db/ --buffer_size 10G --CX ../output/M1-D_FDLM220001805-1a_1.clean_bismark_bt2_pe.deduplicated.bam # Calling methylation

zzh4399 avatar Mar 23 '22 03:03 zzh4399

Nothing stands out. If you manually check a few CG/CHG/CHH sites, what are the counts of total reads and methylated reads you get from methylpy and Bismark? To understand this, I would need some example data to reproduce the difference you found.
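One way to collect such examples is sketched below. It assumes the methylpy allc file is tab-separated with columns chromosome, position, strand, context, methylated count, total count, and that the Bismark `--cytosine_report` output is tab-separated with columns chromosome, position, strand, methylated count, unmethylated count. It also assumes both files use the same chromosome names and coordinates. The function names are illustrative and not part of either tool.

```python
import csv
from typing import Dict, Iterable, List, Tuple

Site = Tuple[str, int, str]   # (chromosome, position, strand)
Counts = Tuple[int, int]      # (methylated reads, total covering reads)


def parse_allc(lines: Iterable[str]) -> Dict[Site, Counts]:
    """Read methylpy allc rows (assumed: chrom, pos, strand, context, mc, cov, ...)."""
    sites: Dict[Site, Counts] = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 6 or not row[1].isdigit():
            continue  # skip blank, short, or header-like lines
        chrom, pos, strand, _context, mc, cov = row[:6]
        sites[(chrom, int(pos), strand)] = (int(mc), int(cov))
    return sites


def parse_cx_report(lines: Iterable[str]) -> Dict[Site, Counts]:
    """Read Bismark cytosine report rows (assumed: chrom, pos, strand, meth, unmeth, ...)."""
    sites: Dict[Site, Counts] = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 5 or not row[1].isdigit():
            continue
        chrom, pos, strand, meth, unmeth = row[:5]
        sites[(chrom, int(pos), strand)] = (int(meth), int(meth) + int(unmeth))
    return sites


def differing_sites(
    a: Dict[Site, Counts], b: Dict[Site, Counts]
) -> List[Tuple[Site, Counts, Counts]]:
    """List sites present in both inputs whose (methylated, total) counts differ."""
    shared = sorted(set(a) & set(b))
    return [(site, a[site], b[site]) for site in shared if a[site] != b[site]]
```

Passing open file handles for the two outputs to `parse_allc` and `parse_cx_report` and then calling `differing_sites` gives a short list of sites with both tools' counts side by side, which can be pasted into the issue.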

yupenghe avatar Mar 23 '22 05:03 yupenghe