Bismark
An error was reported in methylation extraction
Hello Felix
I encountered two errors after running step 3:
Sorting input file CpG_context_wwz-guo2_R1_bismark_bt2_pe.txt.chrChr01.methXtractor.temp by positions (using -S of 1500G)
Died at /data01/masterHome/houtong/bismark/bismark2bedGraph line 487.
Finished BedGraph conversion ...
and
No last chromosome was defined, something must have gone wrong while reading the data in (e.g. specified wrong file path for a gzipped coverage file?). Please check your command!
I tried a lot of things but I couldn't figure it out.
This is the command I used:
nohup bismark_methylation_extractor -p --parallel 40 --comprehensive --no_overlap --bedGraph --cytosine_report --CX_context --split_by_chromosome --counts --buffer_size 1500G --report --samtools_path /data01/masterHome/bshoutong/samtools/samtools --genome_folder /data01/masterHome/bshoutong/jiajihua/Sch.index wwz-guo2_R1_bismark_bt2_pe.bam
OK let's try to dissect this:
Which type of genome are you using? Is it chromosome or scaffold based? If the latter is the case, I would recommend you look at the option --scaffolds.
A rather minimal command could be:
bismark_methylation_extractor --bedGraph wwz-guo2_R1_bismark_bt2_pe.bam
-p: this will be auto-detected. Drop it.
--parallel 40: wow, that's a lot. How about starting with 4?
--no_overlap: is the default for paired-end data. Drop it.
--cytosine_report: can be added once the process is working.
--CX_context: this will make the process very slow. Are you working in a plant species?
--counts: is the default. Drop it.
--buffer_size 1500G: 1.5TB RAM? I doubt you will need more than a few GB.
--report: is the default. Drop it.
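Putting these suggestions together, an intermediate command might look like the sketch below. It only prints the command (a dry run) so it can be inspected first. The 10G buffer size is an assumed stand-in for "a few GB", --CX_context is kept on the assumption that non-CpG contexts are wanted, and options that were not commented on above (--comprehensive, --split_by_chromosome, the samtools and genome paths) are left out for brevity and can be added back as needed.

```shell
# Dry-run sketch: assemble the reduced command and print it for inspection.
# File name taken from the original post.
bam="wwz-guo2_R1_bismark_bt2_pe.bam"

# Dropped: -p, --no_overlap, --counts, --report (auto-detected or default).
# Postponed: --cytosine_report (add once the basic run works).
# Reduced: --parallel 40 -> 4, --buffer_size 1500G -> 10G (assumed value).
cmd="bismark_methylation_extractor --bedGraph --CX_context --parallel 4 --buffer_size 10G $bam"

# Print the command; copy the printed line to actually run it.
echo "$cmd"
```

If this reduced run completes, the postponed options can be reintroduced one at a time to see which one triggers the failure.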
Thank you very much for your reply
I am working with a plant genome, which has 14 major chromosomes and over 2,000 scaffolds. I tried the --scaffolds parameter later, but got the same error. Also, the files being processed are really large, so I set the buffer size to 1500G to be on the safe side.
Hmm. Can you say until which step it worked? Did the methylation extraction work until completion? You should see the CpG, CHH and CHG context files, as well as the M-bias.txt file and a splitting report.
Has the bedGraph/coverage generation worked to completion? Do you see a .cov.gz file, and how many lines does it have (gunzip -c - | wc -l)? If this step has failed, you could start bismark2bedGraph using the C* context files as input. Something like:
bismark2bedGraph --gzip --CX -o outputfile.cov.gz --scaffolds C*
Or has it failed at the coverage2cytosine step?
Could you paste the relevant text you see on screen when it is failing?
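The coverage-file check mentioned above can be wrapped in a small helper. This is only a sketch: the function name count_cov_lines and the example file name are made up for illustration and are not part of Bismark. It reports a missing or empty file instead of silently printing 0, which helps tell "bedGraph step never wrote output" apart from "output exists but is short".

```shell
# count_cov_lines FILE
# Print the number of lines in a gzipped coverage file.
# Returns non-zero if the file is missing or empty.
count_cov_lines() {
  if [ ! -s "$1" ]; then
    echo "missing or empty: $1" >&2
    return 1
  fi
  # tr strips the padding some wc implementations add around the count
  gunzip -c "$1" | wc -l | tr -d ' '
}

# Example usage (hypothetical file name):
# count_cov_lines sample.bismark.cov.gz
```

A missing or empty .cov.gz would point at the bismark2bedGraph step, whereas a populated file would suggest the failure happens later, at coverage2cytosine.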