bwa
bwa index problem when indexing large files
I have downloaded the nt database from NCBI, which is about 90 GB in size, and I want to build an index for it with the following command: bwa index -a bwtsw nt.fasta
It runs without any error, but the iterations do not seem to finish even after several days of running.
I think this may be caused by the large size of nt.fasta. Is there another way to index large files like nt.fasta?
Thanks!
Hello, sorry to bother you. I have the same problem; have you solved it?
Sorry, I haven't solved it yet; I just chose to split the large file into several smaller files before indexing.
I'm so glad to get your reply. I would like to ask: is it reasonable to split the reference like this? Will it affect the mapping results?
Yes, it's definitely reasonable: splitting large files is not only what I did to solve this problem, it's also what NCBI does with their nt database. (For details, see https://ftp.ncbi.nlm.nih.gov/blast/db/)
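In case it helps, here is a minimal sketch of the split-then-index route. The chunk size (five million records per part) and the nt.part_* file names are arbitrary illustrative choices, not the exact ones used here:

    # split the multi-FASTA into parts of at most 5,000,000 records each
    # (record counts are a crude measure; part sizes in bases will be uneven)
    awk 'BEGIN { part = 1; n = 0; out = "nt.part_1.fasta" }
         /^>/ { n++
                if (n > 5000000) { close(out); part++; n = 1
                                   out = "nt.part_" part ".fasta" } }
         { print > out }' nt.fasta

    # index each part separately; reads are then mapped against every index
    # and the alignments compared or merged afterwards
    for f in nt.part_*.fasta; do
        bwa index -a bwtsw "$f"
    done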
OK, thank you, you are so nice.
You may try a larger -b value:
bwa index -b 100000000
It will be a little faster, but indexing nt will take days anyway.
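For completeness, a full invocation along those lines might look like the sketch below; the -b value is just the tenfold-default example from above, and nt.fasta is the file mentioned earlier in the thread. Larger block sizes tend to need more memory, so treat this as a starting point rather than a recommendation:

    # -a bwtsw : BWT construction algorithm suited to large references
    # -b       : block size used during BWT construction (default 10000000);
    #            a larger block can speed things up at the cost of more memory
    bwa index -a bwtsw -b 100000000 nt.fasta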
Okay, thanks! I'll try later.
I also encountered the same problem. As you suggested, I increased the original -b parameter by a factor of 10. Unfortunately, fna.pac still stops growing once it reaches the same size as before, and the log file does not record any error. I think this may be caused by the memory constraints of my compute nodes, and simply increasing the -b parameter may not help much.