LTR_retriever
LTR retriever is not compatible with RepeatModeler2 since v2.9.8?
Dear LTR retriever developers,
I was running RepeatModeler and found that LTR_retriever produces no output (v2.9.8 and v2.9.9, installed from either GitHub or Conda). Other users seem to have reported this before. The v2.9.8 log file reports:
Thu Mar 28 23:13:07 CET 2024 Dependency checking: All passed!
Thu Mar 28 23:13:16 CET 2024 LTR_retriever is starting from the Init step.
Thu Mar 28 23:13:17 CET 2024 Start to convert inputs...
Total candidates: 35905
Total uniq candidates: 35905
Thu Mar 28 23:13:22 CET 2024 Module 1: Start to clean up candidates...
Sequences with 10 missing bp or 0.8 missing data rate will be discarded.
Sequences containing tandem repeats will be discarded.
Usage: perl cleanup.pl -f sample.fa [options] > sample.cln.fa
Options:
-misschar n Define the letter representing unknown sequences; case insensitive; default: n
-Nscreen [0|1] Enable (1) or disable (0) the -nc parameter; default: 1
-nc [int] Ambuguous sequence len cutoff; discard the entire sequence if > this number; default: 0
-nr [0-1] Ambuguous sequence percentage cutoff; discard the entire sequence if > this number; default: 1
-minlen [int] Minimum sequence length filter after clean up; default: 100 (bp)
-cleanN [0|1] Retain (0) or remove (1) the -misschar taget in output sequence; default: 0
-trf [0|1] Enable (1) or disable (0) tandem repeat finder (trf); default: 1
-trf_path path Path to the trf program
Thu Mar 28 23:13:22 CET 2024 0 clean candidates remained
Out of curiosity, I downgraded LTR retriever to v2.9.5 from conda, and this time it passed Module 1:
Thu Apr 11 21:37:41 CEST 2024 Dependency checking: All passed!
Thu Apr 11 21:37:43 CEST 2024 LTR_retriever is starting from the Init step.
Thu Apr 11 21:37:45 CEST 2024 Start to convert inputs...
Total candidates: 35905
Total uniq candidates: 35905
Thu Apr 11 21:37:49 CEST 2024 Module 1: Start to clean up candidates...
Sequences with 10 missing bp or 0.8 missing data rate will be discarded.
Sequences containing tandem repeats will be discarded.
Thu Apr 11 21:37:49 CEST 2024 35905 clean candidates remained
Thu Apr 11 21:37:49 CEST 2024 Modules 2-5: Start to analyze the structure of candidates...
The terminal motif, TSD, boundary, orientation, age, and superfamily will be identified in this step.
It seems that something is wrong with get_range.pl as of v2.9.8, which prevents LTR_retriever from reading the LTR_harvest output. Do you have any suggestions?
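To make the comparison between versions easier to repeat, the clean-candidate count can be pulled out of each log automatically. This is only a minimal sketch in Python, assuming the log contains a line of the form "<N> clean candidates remained" as in the excerpts above. The function name `clean_candidate_count` and the two sample strings are illustrative and not part of LTR_retriever.

```python
import re

# Matches the Module 1 summary line, e.g.
# "Thu Apr 11 21:37:49 CEST 2024 35905 clean candidates remained"
CLEAN_RE = re.compile(r"(\d+) clean candidates remained")


def clean_candidate_count(log_text):
    """Return the count reported after Module 1, or None if the line is absent."""
    match = CLEAN_RE.search(log_text)
    return int(match.group(1)) if match else None


# Summary lines copied from the two logs quoted in this report.
log_v298 = "Thu Mar 28 23:13:22 CET 2024 0 clean candidates remained"
log_v295 = "Thu Apr 11 21:37:49 CEST 2024 35905 clean candidates remained"

for version, log in (("v2.9.8", log_v298), ("v2.9.5", log_v295)):
    count = clean_candidate_count(log)
    status = "no candidates survived Module 1" if count == 0 else "ok"
    print(version, count, status)
```

In practice the strings would be replaced by the contents of the real log files, and a count of 0 (or a missing line) would flag the runs affected by this problem.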