Bismark
Reverse Sequenced Reads
Hello!
Recently I sequenced some methylation libraries in single-end mode, in the reverse direction (from the i7 side). Is it correct to input the genome.fasta file as is and run the pipeline in non-directional mode, or is it correct to input the reverse complement of the genome.fasta file and run the pipeline in directional mode?
Both runs result in similar mapping efficiency but I wonder if the wrong C>T and G>A conversions are happening for the negative strand when the CX report is being calculated? Here is the code I use post trimming and fastqc:
bismark_genome_preparation /path/to/file/ --verbose
bismark --non_directional --genome /path/to/file/ *_trimmed_trimmed.fq
bismark_methylation_extractor --bedGraph --CX --counts --cytosine_report --genome_folder /path/to/file/ *bt2.bam
Any suggestions would be greatly appreciated. Cheers.
Generally, it is fine to just use the genome.fa file as the input; there is no need to reverse-complement anything. My guess would be that it would work either way, though.
I am not sure you really require non-directional mode. Do you get similar numbers of alignments to all four strands? If most alignments are against the OT/OB strands, the libraries were directional (the default mode); if the CTOT/CTOB strands dominate, you might be looking at a non-directional library. If you could attach the FastQC profiles, I could probably judge that a little better.
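One way to check the strand distribution described above, besides reading the Bismark alignment report, is to tally the per-read strand tags directly. The following is a minimal sketch, assuming the alignments carry Bismark's `XR:Z` (read conversion) and `XG:Z` (genome conversion) tags, where CT/CT corresponds to OT, CT/GA to OB, GA/CT to CTOT and GA/GA to CTOB. The function name `strand_tally` is a hypothetical helper, not part of Bismark.

```shell
# Tally Bismark strand assignments from SAM records read on stdin.
# Assumption: each aligned record carries the XR:Z (read conversion) and
# XG:Z (genome conversion) tags. strand_tally is a hypothetical helper name.
strand_tally() {
  awk -F'\t' '
    /^@/ { next }                      # skip SAM header lines
    {
      xr = ""; xg = ""
      for (i = 12; i <= NF; i++) {     # optional tags start at column 12
        if ($i ~ /^XR:Z:/) xr = substr($i, 6)
        if ($i ~ /^XG:Z:/) xg = substr($i, 6)
      }
      if (xr == "" || xg == "") next   # record has no strand tags
      if      (xr == "CT" && xg == "CT") n["OT"]++
      else if (xr == "CT" && xg == "GA") n["OB"]++
      else if (xr == "GA" && xg == "CT") n["CTOT"]++
      else if (xr == "GA" && xg == "GA") n["CTOB"]++
    }
    END {
      # print the four strands in a fixed order, 0 if never seen
      split("OT OB CTOT CTOB", order, " ")
      for (k = 1; k <= 4; k++) printf "%s\t%d\n", order[k], n[order[k]]
    }
  '
}
```

With samtools available, it could be used as `samtools view sample_bismark_bt2.bam | strand_tally`, where the BAM file name is a placeholder. Counts concentrated in OT/OB would point to a directional library.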
Hello,
Great, thanks a lot for the feedback. I just wanted some confirmation on what seems most logical, as I wasn't sure the algorithm would automatically recognize the reverse sequencing. To summarize, there is no big difference when I run in directional vs. non-directional mode. As you guessed, OT/OB strand alignment dominates in both modes, with some 150 reads aligning to the CTOT/CTOB strands when run in non-directional mode. Please find the fastq file attached and let me know if there is anything additional I should consider.
Thanks a lot once again, this has been super :)
If the data is directional, non-directional alignments typically don't cause much of an issue, but since that mode is computationally much more intensive, I would recommend you simply drop --non_directional.
Duly noted, will definitely drop the --non_directional. Thanks a lot!