Indexing error

Open ju-mu opened this issue 4 years ago • 1 comments

Hi,

Generally, indexing works fine. However, I frequently run into two bugs that might need attention:

  1. when the source gtf file contains only one gene in the gene_id field, whippet-index.jl crashes with a misleading error message:

     ERROR: LoadError: ERROR: No genes in the GTF file matched chromosome names in the FASTA file!
     Stacktrace:
      [1] #trans_index!#37(::Int64, ::Function, ::BioSequences.FASTA.Reader, ::Dict{String,Whippet.RefGene}) at v0.6/Whippet/src/index.jl:100
      [2] (::Whippet.#kw##trans_index!)(::Array{Any,1}, ::Whippet.#trans_index!, ::BioSequences.FASTA.Reader, ::Dict{String,Whippet.RefGene}) at ./<missing>:0
      [3] macro expansion at ./util.jl:237 [inlined]
      [4] #fasta_to_index#38(::Int64, ::Function, ::String, ::Dict{String,Whippet.RefGene}) at v0.6/Whippet/src/index.jl:122
      [5] (::Whippet.#kw##fasta_to_index)(::Array{Any,1}, ::Whippet.#fasta_to_index, ::String, ::Dict{String,Whippet.RefGene}) at ./<missing>:0
      [6] macro expansion at v0.6/Whippet/src/timer.jl:5 [inlined]
      [7] main() at v0.6/Whippet/bin/whippet-index.jl:90
      [8] include_from_node1(::String) at ./loading.jl:576
      [9] include(::String) at ./sysimg.jl:14
      [10] process_options(::Base.JLOptions) at ./client.jl:305
      [11] _start() at ./client.jl:371
     while loading v0.6/Whippet/bin/whippet-index.jl, in expression starting on line 5

Adding a single random gene resolves this bug.
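Because the message blames chromosome names while the trigger described here is a single-gene GTF, it can help to check both conditions before indexing. The following is a minimal sketch in Python, independent of Whippet: the function names (`gtf_summary`, `fasta_names`, `preflight`) are hypothetical, and it assumes a standard tab-separated GTF with `gene_id "..."` in the ninth column and FASTA headers whose first word is the sequence name.

```python
import re

# Matches the gene_id attribute in the ninth GTF column.
GENE_ID = re.compile(r'gene_id "([^"]+)"')


def gtf_summary(gtf_lines):
    """Return (set of gene_ids, set of chromosome names) seen in GTF lines."""
    genes, chroms = set(), set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        chroms.add(fields[0])
        match = GENE_ID.search(fields[8])
        if match:
            genes.add(match.group(1))
    return genes, chroms


def fasta_names(fasta_lines):
    """Return the set of sequence names (first word of each '>' header)."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def preflight(gtf_lines, fasta_lines):
    """Summarise the two conditions relevant to the error message above."""
    genes, chroms = gtf_summary(gtf_lines)
    shared = chroms & fasta_names(fasta_lines)
    return {
        "n_genes": len(genes),
        "single_gene": len(genes) == 1,
        "shared_chroms": sorted(shared),
    }
```

Called as `preflight(open("annotation.gtf"), open("genome.fa"))` (file names are placeholders), a result with `single_gene` set to `True` and a non-empty `shared_chroms` would suggest the single-gene case described above rather than a genuine chromosome-name mismatch.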

  2. Using an unfiltered human GTF from GENCODE resulted in the following error message:

     ...
     Building Splice Graphs for chr8.. 71.185649 seconds (12.70 M allocations: 66.904 GiB, 10.71% gc time)
     Building Splice Graphs for chr9.. 74.675011 seconds (12.69 M allocations: 70.337 GiB, 10.46% gc time)
     ERROR: LoadError: MethodError: no method matching isvalid(::UInt8)
     Closest candidates are:
       isvalid(!Matched::Type{String}, !Matched::Union{Array{UInt8,1}, String}) at strings/string.jl:123
       isvalid(!Matched::Type{Char}, !Matched::Unsigned) at strings/utf8proc.jl:16
       isvalid(!Matched::Type{Char}, !Matched::Integer) at strings/utf8proc.jl:17
       ...
     Stacktrace:
      [1] encode_copy!(::BioSequences.BioSequence{BioSequences.DNAAlphabet{4}}, ::Int64, ::Array{UInt8,1}, ::Int64, ::Int64) at v0.6/BioSequences/src/bioseq/copying.jl:149
      [2] BioSequences.BioSequence{BioSequences.DNAAlphabet{4}}(::Array{UInt8,1}, ::Int64, ::Int64) at v0.6/BioSequences/src/bioseq/constructors.jl:28
      [3] #trans_index!#37(::Int64, ::Function, ::BioSequences.FASTA.Reader, ::Dict{String,Whippet.RefGene}) at v0.6/Whippet/src/index.jl:80
      [4] (::Whippet.#kw##trans_index!)(::Array{Any,1}, ::Whippet.#trans_index!, ::BioSequences.FASTA.Reader, ::Dict{String,Whippet.RefGene}) at ./<missing>:0
      [5] macro expansion at ./util.jl:237 [inlined]
      [6] #fasta_to_index#38(::Int64, ::Function, ::String, ::Dict{String,Whippet.RefGene}) at v0.6/Whippet/src/index.jl:122
      [7] (::Whippet.#kw##fasta_to_index)(::Array{Any,1}, ::Whippet.#fasta_to_index, ::String, ::Dict{String,Whippet.RefGene}) at ./<missing>:0
      [8] macro expansion at v0.6/Whippet/src/timer.jl:5 [inlined]
      [9] main() at v0.6/Whippet/bin/whippet-index.jl:90
      [10] include_from_node1(::String) at ./loading.jl:576
      [11] include(::String) at ./sysimg.jl:14
      [12] process_options(::Base.JLOptions) at ./client.jl:305
      [13] _start() at ./client.jl:371
     while loading v0.6/Whippet/bin/whippet-index.jl, in expression starting on line 5
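The trace shows the failure inside `encode_copy!` while BioSequences was building a 4-bit DNA sequence from raw bytes. One possible reading, which is only an assumption and is not confirmed anywhere in this thread, is that the FASTA contains a byte the encoder cannot handle. A minimal Python sketch for scanning a FASTA for such characters follows; the function name `unexpected_symbols` is hypothetical, and the allowed set (IUPAC nucleotide codes, case-insensitive) is also an assumption that may need adjusting, for example to permit gap symbols.

```python
from collections import Counter

# Assumed set of acceptable sequence characters: IUPAC nucleotide codes.
IUPAC_DNA = frozenset("ACGTMRWSYKVHDBN")


def unexpected_symbols(fasta_lines):
    """Map each record name to a Counter of characters outside IUPAC_DNA."""
    found = {}
    name = None
    for line in fasta_lines:
        # Strip only the newline so stray carriage returns are reported too.
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            continue
        bad = [c for c in line if c.upper() not in IUPAC_DNA]
        if bad:
            found.setdefault(name, Counter()).update(bad)
    return found
```

Running `unexpected_symbols(open("genome.fa"))` (file name is a placeholder) returns an empty dict when every sequence character is in the assumed alphabet; any record that appears in the result is a candidate for closer inspection.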

Many thanks!

ju-mu, Sep 05 '19 09:09

Hi @ju-mu, I just wanted to thank you for making a record of that first error message and its cause. I kept getting the same error and had no idea what was going on.

It's a shame that nobody has responded to the bugs you've mentioned.

Again, thanks very much!

-Andrei

andreismol, Jun 22 '20 05:06