bwa index problem when indexing files with large size

Open 544728460 opened this issue 4 years ago • 8 comments

I have downloaded the nt database from NCBI, which is about 90 GB in size. I want to build an index for it with the following command: bwa index -a bwtsw nt.fasta

It runs without reporting any errors, but after several days the iterations still have not finished.

I think the problem may be the sheer size of the nt.fasta file. Is there any other way to index large files like nt.fasta?

Thanks!

544728460 avatar Feb 05 '21 15:02 544728460

Hello, sorry to bother you. I have the same problem. Have you solved it?

MonaLiu421 avatar May 13 '22 07:05 MonaLiu421

Sorry, I haven't solved it yet. I just chose to split the large file into several smaller files before indexing.
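
For anyone who wants to try the same workaround, here is a rough sketch of one way to split a FASTA file at record boundaries. It is not necessarily how the split above was done. The function name split_fasta, the chunk naming scheme, and the size limit are all made up for illustration; each chunk is closed at the first record header after it has reached the given number of bases, so no sequence is cut in half.

```shell
# split_fasta INPUT MAX_BASES PREFIX   (hypothetical helper, not part of bwa)
# Writes PREFIX.0001.fasta, PREFIX.0002.fasta, ... and never splits a record.
split_fasta() {
  input=$1
  max_bases=$2
  prefix=$3
  awk -v max="$max_bases" -v prefix="$prefix" '
    /^>/ {
      # At a record header, start a new chunk if there is none yet
      # or if the current chunk has already reached the base limit.
      if (n == 0 || bases >= max) {
        if (out != "") close(out)
        n++
        out = sprintf("%s.%04d.fasta", prefix, n)
        bases = 0
      }
    }
    {
      if (out == "") next                 # skip anything before the first header
      print > out
      if ($0 !~ /^>/) bases += length($0) # count sequence characters only
    }
  ' "$input"
}
```

It could then be called as split_fasta nt.fasta 4000000000 nt_chunk, where the limit of 4000000000 bases per chunk is an arbitrary example value, and each resulting chunk indexed separately.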

544728460 avatar May 16 '22 06:05 544728460

I'm so glad to get your reply. Is it reasonable to split the reference like this? Will it affect the mapping results?

MonaLiu421 avatar May 16 '22 06:05 MonaLiu421

Yes, it's definitely reasonable. Splitting large files is not only what I did to solve this problem; NCBI does the same thing with its nt database. (For details, see https://ftp.ncbi.nlm.nih.gov/blast/db/)

544728460 avatar May 19 '22 09:05 544728460

OK, thank you, you are so nice.

MonaLiu421 avatar May 23 '22 02:05 MonaLiu421

You may try a larger -b value

bwa index -b 100000000

It will be a little faster, but indexing nt will take days anyway.
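
Spelled out with the algorithm and input file from the original post, the full invocation might look like the sketch below. The wrapper function index_fasta and the BWA variable are made up for illustration (BWA only lets you point at a different binary); the -a and -b options are the ones already mentioned in this thread.

```shell
# index_fasta FASTA [BLOCK_SIZE]   (hypothetical wrapper, not part of bwa)
# Runs bwa index with the bwtsw algorithm and a larger -b block size.
index_fasta() {
  fasta=$1
  block_size=${2:-100000000}   # the value suggested above
  # BWA is an assumed override for the path to the bwa binary.
  "${BWA:-bwa}" index -a bwtsw -b "$block_size" "$fasta"
}
```

For example: index_fasta nt.fasta 100000000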

lh3 avatar Jun 03 '22 18:06 lh3

Okay, thanks! I'll try later.

544728460 avatar Jun 06 '22 02:06 544728460

I ran into the same problem. As you suggested, I increased the -b parameter to 10 times its original value. Unfortunately, fna.pac still stops growing once it reaches the same size as before, and the log file does not record any error. I think this may be caused by the memory limits of my compute nodes, so simply increasing the -b parameter may not help much.

bogv127515 avatar Aug 01 '22 03:08 bogv127515