"compressed subject database does not exist" error with custom library

Open jebrosen opened this issue 2 years ago • 2 comments

I ran into the same problem when using a custom library (about 4.3Gb).

RepeatMasker version 4.1.2-p1
Search Engine: NCBI/RMBLAST [ 2.10.0+ ]
NCBIBlastSearchEngine::search: Error...compressed subject database (/public1/BGI/Project/F21FTSECKF2595/PUTjinwD/02.assembly/02.hic/02.contain_hic/cc/RM_23789.FriNov120937482021/merge.TEsorter.non_similar_repeat.for_lib.fa) does not exist!

Is the library too big? The run proceeds when I split the library into 2 files (about 2Gb), but masking repeats with two split libraries is not suitable.

Originally posted by @answer19831020 in https://github.com/rmhubley/RepeatMasker/issues/8#issuecomment-966752900

jebrosen, Nov 15 '21 19:11

@answer19831020 In this situation, there should be a file in the temporary directory for the run (RM_...<date>/) named makeblastdb.log. Does the makeblastdb.log file exist and have further information about the error?

jebrosen, Nov 15 '21 19:11
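
A minimal sketch for locating the makeblastdb.log mentioned above. It assumes the run was launched from the directory being searched and that the temporary directories match `RM_*`, as in the paths quoted in this thread. The helper name `find_makeblastdb_logs` is hypothetical and not part of RepeatMasker.

```python
from pathlib import Path


def find_makeblastdb_logs(run_dir="."):
    """Return makeblastdb.log files found in RM_* temp directories, newest first.

    Assumption: RepeatMasker's per-run temp directories sit directly under
    run_dir and are named RM_<something>, as seen in this thread.
    """
    logs = [
        p for p in Path(run_dir).glob("RM_*/makeblastdb.log")
        if p.is_file()
    ]
    # Sort by modification time so the most recent run is listed first.
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for log in find_makeblastdb_logs():
        print(f"==> {log} <==")
        print(log.read_text(errors="replace"))
```

If nothing is printed, either no `RM_*` directory exists under the search path or the log was never written, which is itself worth reporting.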

Hi there. I am having the same error. Here is further information about the error from makeblastdb.log:

Building a new DB, current time: 08/09/2022 16:42:37
New DB name: /nobackup/fbssabd/RMask/RepeatMasker/RM_39078.TueAug91642352022/sibirica-combine$
New DB title: sibirica-combined.fa
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B

No volumes were created.

Error: mdb_env_open: Cannot allocate memory

I was running this interactively on the HPC at my institution. I will try re-running the command from a script file submitted to the cluster. Do you have any suggestions on what resources should be allocated for this run?

Shairah, Aug 09 '22 15:08
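
`mdb_env_open` in the log above is an LMDB call, and LMDB memory-maps its database file. One possible cause (an assumption, not confirmed in this thread) is a per-process virtual memory limit on the interactive node. The sketch below only reports the current address-space limit so it can be compared between the interactive session and a batch job. It uses Python's standard `resource` module (Unix only); the function names are illustrative.

```python
import resource


def describe_limit(value):
    """Format an rlimit value in GiB, or 'unlimited' if no limit is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.1f} GiB"


def address_space_limits():
    """Return the (soft, hard) virtual address-space limits as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return describe_limit(soft), describe_limit(hard)


if __name__ == "__main__":
    soft, hard = address_space_limits()
    print(f"RLIMIT_AS soft={soft} hard={hard}")
```

Running this in both environments shows whether the interactive session is more tightly capped than the batch job. It does not by itself say how much memory the run needs.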