salmon index was invoked improperly - bad_alloc

Open ucabuk opened this issue 2 years ago • 0 comments

Is the bug primarily related to salmon (bulk mode) or alevin (single-cell mode)? Salmon index

Describe the bug

Hi, I have tried a big reference file before, and the index was created successfully. Now I am getting the following error. My index file is relatively big, ~2G. I could not solve the problem. I also increased the CPU count to 36.

Log:

[2023-03-15 20:10:48.957] [jLog] [warning] The salmon index is being built without any decoy sequences.  It is recommended that decoy sequence (either computed auxiliary decoy sequence or the genome of the organism) be provided during indexing. Further details can be found at https://salmon.readthedocs.io/en/latest/salmon.html#preparing-transcriptome-indices-mapping-based-mode.
[2023-03-15 20:10:48.968] [jLog] [info] building index
out : illerney.index
[2023-03-15 20:10:48.990] [puff::index::jointLog] [info] Running fixFasta

[Step 1 of 4] : counting k-mers
^M^Mcounted k-mers for 10000 transcripts^M^Mcounted k-mers for 20000 transcripts^M^Mcounted k-mers for 30000 transcripts^M^Mcounted k-mers for 40000 transcripts^M^Mcounted k-mers for 50000 transcripts^M^Mcounted k-mers for 60000 transcripts^M^Mcounted k-mers for 70000 transcripts^M^Mcounted k-mers for 80000 transcripts^M^Mcounted k-mers for 90000 transcripts^M^Mcounted k-mers for 100000 transc
[2023-03-15 20:12:01.773] [puff::index::jointLog] [info] Replaced 0 non-ATCG nucleotides
[2023-03-15 20:12:01.773] [puff::index::jointLog] [info] Clipped poly-A tails from 28 transcripts
wrote 4224924 cleaned references
[2023-03-15 20:12:12.984] [puff::index::jointLog] [info] Filter size not provided; estimating from number of distinct k-mers
[2023-03-15 20:12:29.921] [puff::index::jointLog] [info] ntHll estimated 1872745301 distinct k-mers, setting filter size to 2^35
Threads = 2
Vertex length = 31
Hash functions = 5
Filter size = 34359738368
Capacity = 2
Files:
illerney.index/ref_k31_fixed.fa
--------------------------------------------------------------------------------
Round 0, 0:34359738368
Pass    Filling Filtering
1       297     695
2       55      3
True junctions count = 5239944
False junctions count = 16749742
Hash table size = 21989686
Candidate marks count = 29916168
--------------------------------------------------------------------------------
Reallocating bifurcations time: 2
True marks count: 20234145
Edges construction time: 59
--------------------------------------------------------------------------------
Distinct junctions = 5239944

TwoPaCo::buildGraphMain:: allocated with scalable_malloc; freeing.
TwoPaCo::buildGraphMain:: Calling scalable_allocation_command(TBBMALLOC_CLEAN_ALL_BUFFERS, 0);
allowedIn: 139
Max Junction ID: 12729038
seen.size():101832313 kmerInfo.size():12729039
approximateContigTotalLength: 1607258836
counters for complex kmers:
(prec>1 & succ>1)=133010 | (succ>1 & isStart)=7442 | (prec>1 & isEnd)=7516 | (isStart & isEnd)=2442
contig count: 11353512 element count: 2210067304 complex nodes: 150410
# of ones in rank vector: 11353511
[2023-03-15 20:35:10.185] [puff::index::jointLog] [info] Starting the Pufferfish indexing by reading the GFA binary file.
[2023-03-15 20:35:10.185] [puff::index::jointLog] [info] Setting the index/BinaryGfa directory illerney.index
size = 2210067304
-----------------------------------------
| Loading contigs | Time = 451.61 ms
-----------------------------------------
size = 2210067304
-----------------------------------------
| Loading contig boundaries | Time = 180.73 ms
-----------------------------------------
Number of ones: 11353511
Number of ones per inventory item: 512
Inventory entries filled: 22175
11353511
[2023-03-15 20:35:13.921] [puff::index::jointLog] [info] Done wrapping the rank vector with a rank9sel structure.
[2023-03-15 20:35:13.997] [puff::index::jointLog] [info] contig count for validation: 11,353,511
[2023-03-15 20:35:19.728] [puff::index::jointLog] [info] Total # of Contigs : 11,353,511
[2023-03-15 20:35:19.728] [puff::index::jointLog] [info] Total # of numerical Contigs : 11,353,511
[2023-03-15 20:35:20.804] [puff::index::jointLog] [info] Total # of contig vec entries: 16,343,267
[2023-03-15 20:35:20.804] [puff::index::jointLog] [info] bits per offset entry 24
[2023-03-15 20:35:22.331] [puff::index::jointLog] [info] Done constructing the contig vector. 11353512
[2023-03-15 20:35:25.697] [puff::index::jointLog] [info] # segments = 11,353,511
[2023-03-15 20:35:25.697] [puff::index::jointLog] [info] total length = 2,210,067,304
[2023-03-15 20:35:28.518] [puff::index::jointLog] [info] Reading the reference files ...
[2023-03-15 20:35:37.482] [puff::index::jointLog] [info] positional integer width = 32
[2023-03-15 20:35:37.482] [puff::index::jointLog] [info] seqSize = 2,210,067,304
[2023-03-15 20:35:37.482] [puff::index::jointLog] [info] rankSize = 2,210,067,304
[2023-03-15 20:35:37.482] [puff::index::jointLog] [info] edgeVecSize = 0
[2023-03-15 20:35:37.482] [puff::index::jointLog] [info] num keys = 1,869,461,974
^M[Building BooPHF]  0.194%   elapsed:   0 min 0  sec   remaining:   3 min 39 sec^M[Building BooPHF]  0.206%   elapsed:   0 min 0  sec   remaining:   3 min 33 sec^M[Building BooPHF]  0.394%   elapsed:   0 min 1  sec   remaining:   2 min 45 sec^M[Building BooPHF]  0.406%   elapsed:   0 min 1  sec   remaining:   2 min 44 sec^M[Building BooPHF]  0.594%   elapsed:   0 min 1  sec   remaining:   2 m
psed:   0 min 56 sec   remaining:   1 min 16 sec^M[Building BooPHF]  42.4 %   elapsed:   0 min 56 sec   remaining:   1 min 16 sec^M[Building BooPHF]  42.6 %   elapsed:   0 min 56 sec   remaining:   1 min 15 sec^M[Building BooPHF]  42.6 %   elapsed:   0 min 56 sec   remaining:   1 min 15 sec^M[Building BooPHF]  42.8 %   elapsed:   0 min 56 sec   remaining:   1 min 15 sec^M[Building BooPHF]  42.
salmon index was invoked improperly.
For usage information, try salmon index --help
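
For scale, the sizes printed in the log can be turned into rough memory figures. This is only a back-of-the-envelope sketch, not a statement of how salmon actually allocates memory: it assumes that "Filter size" is a Bloom filter size in bits, that the sequence is stored at 2 bits per base, and that one position of the logged width is stored per key. All three are assumptions, and the variable names are made up for illustration.

```python
# Rough arithmetic on numbers copied from the log above.
# ASSUMPTIONS (not confirmed by the log): the filter size is in bits,
# bases are packed at 2 bits each, and one position entry is kept per key.

FILTER_SIZE_BITS = 34_359_738_368  # "Filter size = 34359738368" (2^35)
SEQ_SIZE_BASES = 2_210_067_304     # "seqSize = 2,210,067,304"
NUM_KEYS = 1_869_461_974           # "num keys = 1,869,461,974"
POS_WIDTH_BITS = 32                # "positional integer width = 32"

GIB = 1024 ** 3


def bits_to_gib(bits: int) -> float:
    """Convert a size in bits to GiB."""
    return bits / 8 / GIB


filter_gib = bits_to_gib(FILTER_SIZE_BITS)
seq_gib = bits_to_gib(SEQ_SIZE_BASES * 2)
pos_gib = bits_to_gib(NUM_KEYS * POS_WIDTH_BITS)

print(f"filter (assumed bits):            {filter_gib:.2f} GiB")
print(f"packed sequence (assumed 2 bits): {seq_gib:.2f} GiB")
print(f"position table (assumed per key): {pos_gib:.2f} GiB")
```

These figures are lower bounds for the individual structures under the stated assumptions; they say nothing about temporary buffers or the hash construction that was running when the log stops.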

To Reproduce Steps and data to reproduce the behavior:

salmon index -t input.fa -i input.index

Specifically, please provide at least the following information:

  • Which version of salmon was used? - 1.10.1
  • How was salmon installed (compiled, downloaded executable, through bioconda)? - bioconda
  • Which reference (e.g. transcriptome) was used? - metagenome
  • Which read files were used?
  • Which program options were used?

Expected behavior A clear and concise description of what you expected to happen.

Screenshots If applicable, add screenshots or terminal output to help explain your problem.

Desktop (please complete the following information):

  • OS: [e.g. Ubuntu Linux, OSX] Linux- HPC
  • Version [ If you are on OSX, the output of sw_vers. If you are on linux the output of uname -a and lsb_release -a]

Thanks. Ugur

ucabuk, Mar 15 '23 20:03