Specify additional "genomes" of spike-in controls for downstream calculation of bisulfite conversion efficiency
Description of feature
Library preparation kits come with unmethylated and methylated DNA controls that are added to samples (e.g. NEBNext EM-seq). It would be useful to supply methylseq with the FASTA sequences of the controls and have it perform additional alignment, deduplication, and methylation extraction against those references. This would enable calculation of bisulfite/enzymatic conversion efficiency.
Thank you for providing this excellent pipeline!
At the moment, we're basically including them as part of the main reference genome FASTA and analyzing them outside of methylseq. Having a way to include them separately would also be useful so that additional spike-in-specific analyses could be performed.
Hello everybody,
Since I also need this feature, I am willing to develop it as part of the methylseq pipeline. Please see this conversation in Slack.
I'm new to methylseq analysis. One of the scientists in our lab needs to run the analysis against the positive and negative control sequences (pUC19 and Lambda). I think this is the feature you're trying to add. What's the current status and how can I contribute?
Hello @trum994, I worked on it before Christmas and made some progress. It needs some more work, though, and I will resume it this week (I have been busy with other things in the meantime). I am also just learning Nextflow, so I am not the fastest at implementing it.
Hi, just wondering if there are any tools available for analyzing pUC19 and lambda DNA?
What I did was add the control sequences to the reference genome and then pass the combined file with the --fasta option. You can find the sequences here: https://neb-em-seq-sra.s3.amazonaws.com/grch38_core%2Bbs_controls.fa
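For anyone who wants to build such a combined reference themselves, here is a minimal Python sketch of the concatenation step. The function name `combine_fastas` and the file names in the usage note are hypothetical and not part of methylseq. It only joins the files and refuses duplicate sequence names, since two contigs with the same name would make the control alignments ambiguous.

```python
from pathlib import Path
from typing import Iterable


def combine_fastas(reference: Path, controls: Iterable[Path], out: Path) -> list[str]:
    """Concatenate a reference FASTA and control FASTAs into one file.

    Returns the sequence names in output order and raises ValueError
    if the same sequence name appears more than once.
    """
    seen: set[str] = set()
    names: list[str] = []
    with out.open("w") as dst:
        for fasta in [reference, *controls]:
            with fasta.open() as src:
                for line in src:
                    if line.startswith(">"):
                        # The sequence name is the first token after ">".
                        parts = line[1:].split()
                        name = parts[0] if parts else ""
                        if name in seen:
                            raise ValueError(f"duplicate sequence name: {name}")
                        seen.add(name)
                        names.append(name)
                    # Guard against a file whose last line lacks a newline,
                    # which would otherwise glue it to the next header.
                    dst.write(line if line.endswith("\n") else line + "\n")
    return names
```

Calling it with, say, `combine_fastas(Path("genome.fa"), [Path("controls.fa")], Path("genome_plus_controls.fa"))` (illustrative paths) writes a single file that could then be supplied through `--fasta` as described above.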
In cases where controls are added, we've done what @mpiersonsmela described, but added a post-processing round to extract the reads aligned to the control sequences, then evaluated methylation conversion on each control using Bismark's methylation_consistency. We also perform this during standard analyses using methylKit.
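As a rough illustration of the per-control evaluation (not a reproduction of what methylation_consistency or methylKit do), the sketch below pools counts per control contig from a Bismark coverage file. It assumes the tab-separated coverage layout of chromosome, start, end, methylation percentage, methylated count, unmethylated count. The contig names `lambda` and `pUC19` are assumptions and must match the headers in your FASTA, as is the choice of which control is treated as unmethylated. The function name is hypothetical.

```python
from collections import defaultdict
from typing import Iterable


def methylation_by_contig(cov_lines: Iterable[str], contigs: set[str]) -> dict[str, float]:
    """Return the pooled methylated fraction for each requested contig.

    Each input line is expected to hold six tab-separated fields:
    chrom, start, end, percent methylated, methylated count, unmethylated count.
    Contigs with no covered positions are left out of the result.
    """
    meth: dict[str, int] = defaultdict(int)
    unmeth: dict[str, int] = defaultdict(int)
    for line in cov_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[0] not in contigs:
            continue  # skip malformed lines and non-control contigs
        meth[fields[0]] += int(fields[4])
        unmeth[fields[0]] += int(fields[5])
    return {
        c: meth[c] / (meth[c] + unmeth[c])
        for c in contigs
        if meth[c] + unmeth[c] > 0
    }


# Toy input; real data would be read from the decompressed coverage file.
example = [
    "lambda\t10\t10\t0\t0\t100",
    "lambda\t20\t20\t2\t1\t49",
    "pUC19\t5\t5\t100\t50\t0",
    "chr1\t1\t1\t50\t5\t5",
]
rates = methylation_by_contig(example, {"lambda", "pUC19"})
# Assuming lambda is the unmethylated control, conversion efficiency
# is the fraction of its cytosine calls that read as unmethylated.
lambda_conversion = 1.0 - rates["lambda"]
```

Under the same assumption, the methylated fraction on the methylated control (here `pUC19`) should stay close to 1, which gives a check against over-conversion. The function pools whichever cytosine contexts are present in the coverage file, so the contexts included depend on how the file was produced.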