Whippet.jl

--biascorrect specific error: LoadError: BoundsError: attempt to access 32nt DNA Sequence:

Open rfara opened this issue 3 years ago • 4 comments

Hi Whippet team,

I've enjoyed using the tool, and I'm grateful for the work you all did on this. I was trying to use the --biascorrect flag and kept hitting this issue:

Whippet v1.0.0 loading and compiling...
31.457593 seconds.
Loading splice graph index... /camp/lab/ulej/home/users/farawar/whippet/Whippet.jl/index/gencode_29.jls
4.482475 seconds (3.97 M allocations: 811.262 MiB, 29.69% gc time)
Processing reads from file...
FASTQ_1: /camp/project/proj-luscombe-ule/working/human/ENCODE_CRISPR_RNASeq/whippet/trim_galore/POLR2G-human_K562_REP2_ENCSR580TGX_F_trimmed.fq.gz
FASTQ_2: /camp/project/proj-luscombe-ule/working/human/ENCODE_CRISPR_RNASeq/whippet/trim_galore/POLR2G-human_K562_REP2_ENCSR580TGX_R_trimmed.fq.gz
ERROR: LoadError: BoundsError: attempt to access 32nt DNA Sequence:
GCTTCACCGGCGCAGTCATTCTCATAATCGCC
at index [28:33]
Stacktrace:
[1] checkbounds at /camp/home/farawar/.julia/packages/BioSequences/7i86L/src/bioseq/indexing.jl:15 [inlined]
[2] Type at /camp/home/farawar/.julia/packages/BioSequences/7i86L/src/bioseq/constructors.jl:38 [inlined]
[3] Type at /camp/home/farawar/.julia/packages/BioSequences/7i86L/src/bioseq/constructors.jl:46 [inlined]
[4] getindex at /camp/home/farawar/.julia/packages/BioSequences/7i86L/src/bioseq/indexing.jl:64 [inlined]
[5] primer_count!(::JointBiasMod, ::BioSequences.BioSequence{BioSequences.DNAAlphabet{2}}) at /camp/home/farawar/home/whippet/Whippet.jl/src/bias.jl:163
[6] count! at /camp/home/farawar/home/whippet/Whippet.jl/src/bias.jl:195 [inlined]
[7] #process_paired_reads!#60(::Int64, ::Bool, ::Int64, ::Function, ::BioSequences.FASTQ.Reader, ::BioSequences.FASTQ.Reader, ::AlignParam, ::GraphLib, ::GraphLibQuant{SGAlignPaired,JointBiasCounter}, ::MultiMapping{SGAlignPaired,JointBiasCounter}, ::JointBiasMod) at /camp/home/farawar/home/whippet/Whippet.jl/src/reads.jl:106
[8] (::getfield(Whippet, Symbol("#kw##process_paired_reads!")))(::NamedTuple{(:sam, :qualoffset),Tuple{Bool,Int64}}, ::typeof(process_paired_reads!), ::BioSequences.FASTQ.Reader, ::BioSequences.FASTQ.Reader, ::AlignParam, ::GraphLib, ::GraphLibQuant{SGAlignPaired,JointBiasCounter}, ::MultiMapping{SGAlignPaired,JointBiasCounter}, ::JointBiasMod) at ./none:0
[9] main() at /camp/home/farawar/home/whippet/Whippet.jl/src/timer.jl:5
[10] top-level scope at /camp/home/farawar/home/whippet/Whippet.jl/src/timer.jl:5
[11] include at ./boot.jl:326 [inlined]
[12] include_relative(::Module, ::String) at ./loading.jl:1038
[13] include(::Module, ::String) at ./sysimg.jl:29
[14] exec_options(::Base.JLOptions) at ./client.jl:267
[15] _start() at ./client.jl:436
in expression starting at /camp/home/farawar/home/whippet/Whippet.jl/bin/whippet-quant.jl:188

The problem doesn't occur when I omit the --biascorrect flag.

Thought I would mention this! Let me know if there is any info you need about my session or anything like that.

Cheers, Rupert

rfara, Mar 06 '21 14:03

I am running a whole bunch of samples through the same pipeline. Some of them make it past this part of whippet-quant --biascorrect, but then fail later for a different reason.

Whippet v1.0.0 loading and compiling...
28.360636 seconds.
Loading splice graph index... /camp/lab/ulej/home/users/farawar/whippet/Whippet.jl/index/gencode_29.jls
4.236249 seconds (3.97 M allocations: 811.294 MiB, 30.73% gc time)
Processing reads from file...
FASTQ_1: /camp/project/proj-luscombe-ule/working/human/ENCODE_CRISPR_RNASeq/whippet/trim_galore/DGCR8-human_K562_REP2_ENCSR721CNY_F_trimmed.fq.gz
FASTQ_2: /camp/project/proj-luscombe-ule/working/human/ENCODE_CRISPR_RNASeq/whippet/trim_galore/DGCR8-human_K562_REP2_ENCSR721CNY_R_trimmed.fq.gz
1113.238099 seconds (2.11 G allocations: 114.429 GiB, 9.47% gc time)
ERROR: LoadError: InexactError: Int64(97.40703038718044)
Stacktrace:
[1] Type at ./float.jl:703 [inlined]
[2] main() at /camp/home/farawar/home/whippet/Whippet.jl/bin/whippet-quant.jl:148
[3] top-level scope at /camp/home/farawar/home/whippet/Whippet.jl/src/timer.jl:5
[4] include at ./boot.jl:326 [inlined]
[5] include_relative(::Module, ::String) at ./loading.jl:1038
[6] include(::Module, ::String) at ./sysimg.jl:29
[7] exec_options(::Base.JLOptions) at ./client.jl:267
[8] _start() at ./client.jl:436
in expression starting at /camp/home/farawar/home/whippet/Whippet.jl/bin/whippet-quant.jl:188

rfara, Mar 06 '21 15:03

Hey @rfara,

You shouldn't be using trimmed reads. Whippet generally expects all the reads to be the same length, and it can handle soft-clipping of its alignments anyway. That said, reads haven't had to be the same length in previous versions of Whippet, but if you're going to use --biascorrect then every read has to be at least the minimum length of the model, which I think is 36nt. That is the first error: the 32nt read in your log is too short to be indexed at [28:33].

The second error occurs because the read lengths vary in the input file, so the average read length comes out as a non-integer floating-point number, while Whippet assumes an integer read length. In Julia v1.0+, converting a non-integer float to an Int throws an InexactError, it seems.

Also, I just released Whippet v1.5.1 for Julia v1.5.3. It's probably best to update to this version, as v1.0 isn't going to be supported any longer.

Cheers, -t

timbitz, Mar 06 '21 17:03
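
For anyone who wants to check a FASTQ file for the two conditions described above before running whippet-quant --biascorrect, here is a minimal sketch. It is not part of Whippet and is written in Python rather than Julia. It assumes a standard four-line-per-record FASTQ, and it takes the 36nt minimum from the tentative "I think is 36nt" estimate above, so treat `MIN_MODEL_LENGTH` as an assumption. The function names and any file path passed to `summarize_file` are hypothetical.

```python
import gzip
import io
from collections import Counter

# Assumption: 36nt comes from the tentative estimate in this thread, not from Whippet's source.
MIN_MODEL_LENGTH = 36


def read_lengths(handle):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    for line_number, line in enumerate(handle):
        if line_number % 4 == 1:  # the second line of each record holds the sequence
            yield len(line.strip())


def summarize(handle, min_length=MIN_MODEL_LENGTH):
    """Summarize read lengths and flag the two conditions discussed in this thread."""
    counts = Counter(read_lengths(handle))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads found")
    mean = sum(length * n for length, n in counts.items()) / total
    return {
        "reads": total,
        "min": min(counts),
        "max": max(counts),
        "mean": mean,
        # Mixed lengths give a non-integer mean, the situation behind the InexactError.
        "uniform_length": len(counts) == 1,
        # Reads shorter than the model minimum, the situation behind the BoundsError.
        "too_short": sum(n for length, n in counts.items() if length < min_length),
    }


def summarize_file(path, min_length=MIN_MODEL_LENGTH):
    """Open a plain or gzip-compressed FASTQ file (hypothetical path) and summarize it."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return summarize(handle, min_length)


if __name__ == "__main__":
    # Tiny in-memory example: the 32nt read from the BoundsError plus one 41nt read.
    demo = io.StringIO(
        "@r1\nGCTTCACCGGCGCAGTCATTCTCATAATCGCC\n+\n" + "I" * 32 + "\n"
        "@r2\n" + "A" * 41 + "\n+\n" + "I" * 41 + "\n"
    )
    print(summarize(demo))
```

If `uniform_length` is false or `too_short` is above zero, the file is likely to trip one of the errors reported here on Whippet v1.0.0, and using the untrimmed reads or updating Whippet is the safer route.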

Thanks for getting back to me! I'll use the untrimmed reads in that case.

rfara, Mar 06 '21 18:03

Hello there,

I was getting the same InexactError, as the reads I was using were already trimmed. After the last commit, I can confirm this issue is fixed. Thanks a lot @timbitz :)

geparada, Mar 09 '21 10:03