EDTA
Solving empty '*.mod.EDTA.TEanno.sum' files
Hi shujun!
Some of my genomes produce empty `mod.EDTA.TEanno.sum` files, so I traced part of the EDTA process. I found that the `buildSummary.pl` script terminates at line 560 while reading the `.mod.EDTA.TEanno.out` file if `$type` is missing. I checked the `.mod.EDTA.TEanno.split.bed` file, and the repeat sequence that lacks `$type` is labeled "snRNA" there. I don't know whether other types of repeat sequences may also lack the `$type` tag. What about changing the `die` at line 560 of `buildSummary.pl` to `next`? I am not sure whether this would affect subsequent steps.
perl EDTA-master/util/buildSummary.pl -maxDiv 40 -stats $genome.mod.stats $genome.mod.EDTA.TEanno.out > $genome.mod.EDTA.TEanno.sum 2> out.log
out.log
This out line is the first instance of the change:
10000 0.001 0.001 0.001 scaffold398 121760 121980 NA + TE_00000280_INT LTR/unknown
missing type for TE_00002102 ... <>
.mod.EDTA.TEanno.out
10000 0.001 0.001 0.001 Chr4 13998493 14000579 NA + TE_00001932_INT LTR/unknown
10000 0.001 0.001 0.001 Chr4 14000611 14000691 NA + TE_00002102
10000 0.001 0.001 0.001 Chr4 14000769 14001580 NA + TE_00001932_INT LTR/unknown
.mod.EDTA.TEanno.split.bed
Chr4 13997694 13998492 TE_00000112_INT LTR/Copia homology 0.709 4196 - . ID=TE_homo_82343;sequence_ontology=SO:0002264;ID=TE_homo_89176;sequence_ontology=SO:0002264
Chr4 13998493 14000579 TE_00001932_INT LTR/unknown homology 0.9 9217 + . ID=TE_homo_82342;sequence_ontology=SO:0000186;ID=TE_homo_89175;sequence_ontology=SO:0000186
Chr4 14000611 14000691 TE_00002102 snRNA homology 0.839 458 + . ID=TE_homo_82344;sequence_ontology=SO:0000274;ID=TE_homo_89177;sequence_ontology=SO:0000274
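
To find out whether repeats other than snRNA also lack the type field, one option is to collect every `.out` row without a class and look up those names in the `split.bed` file. The following is only a rough sketch, not part of EDTA. It assumes the `.out` rows follow the layout in the excerpts above (name in the 10th whitespace-separated field, class in the 11th, so a row with exactly 10 fields has no class) and that the `split.bed` rows carry the name in column 4 and the class in column 5. The function names `names_missing_class` and `bed_classes` are hypothetical.

```python
from collections import Counter


def names_missing_class(out_lines):
    """Return repeat names from .out rows that have no class field."""
    missing = []
    for line in out_lines:
        fields = line.split()
        # Skip blank and header lines: data rows start with a numeric score.
        if not fields or not fields[0].isdigit():
            continue
        # Assumed layout (taken from the excerpt): 10 fields means the
        # 11th field (class) is absent; the name is the 10th field.
        if len(fields) == 10:
            missing.append(fields[9])
    return missing


def bed_classes(bed_lines, names):
    """Count the class column of split.bed rows whose name is in `names`."""
    wanted = set(names)
    counts = Counter()
    for line in bed_lines:
        fields = line.split()
        # Assumed layout: column 4 is the name, column 5 is the class.
        if len(fields) >= 5 and fields[3] in wanted:
            counts[fields[4]] += 1
    return counts


# Rows copied from the excerpts above (bed attributes shortened).
out_excerpt = [
    "10000 0.001 0.001 0.001 Chr4 13998493 14000579 NA + TE_00001932_INT LTR/unknown",
    "10000 0.001 0.001 0.001 Chr4 14000611 14000691 NA + TE_00002102",
    "10000 0.001 0.001 0.001 Chr4 14000769 14001580 NA + TE_00001932_INT LTR/unknown",
]
bed_excerpt = [
    "Chr4 13998493 14000579 TE_00001932_INT LTR/unknown homology 0.9 9217 + .",
    "Chr4 14000611 14000691 TE_00002102 snRNA homology 0.839 458 + .",
]

missing = names_missing_class(out_excerpt)
print(missing)
print(dict(bed_classes(bed_excerpt, missing)))
```

For a whole genome, the same two functions could be given the lines of the real `.out` and `split.bed` files (for example via `open(path)`). If only snRNA shows up in the counts, that would at least suggest the missing `$type` is limited to that class.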