Whippet.jl
Indexing error
Hi,
Indexing generally works fine, but I frequently run into two bugs that might need attention:
- When the source GTF file contains only one gene (a single distinct value in the gene_id field), whippet-index.jl crashes with a misleading error message:
ERROR: LoadError: ERROR: No genes in the GTF file matched chromosome names in the FASTA file!
Stacktrace:
 [1] #trans_index!#37(::Int64, ::Function, ::BioSequences.FASTA.Reader, ::Dict{String,Whippet.RefGene}) at v0.6/Whippet/src/index.jl:100
 [2] (::Whippet.#kw##trans_index!)(::Array{Any,1}, ::Whippet.#trans_index!, ::BioSequences.FASTA.Reader, ::Dict{String,Whippet.RefGene}) at ./<missing>:0
 [3] macro expansion at ./util.jl:237 [inlined]
 [4] #fasta_to_index#38(::Int64, ::Function, ::String, ::Dict{String,Whippet.RefGene}) at v0.6/Whippet/src/index.jl:122
 [5] (::Whippet.#kw##fasta_to_index)(::Array{Any,1}, ::Whippet.#fasta_to_index, ::String, ::Dict{String,Whippet.RefGene}) at ./<missing>:0
 [6] macro expansion at v0.6/Whippet/src/timer.jl:5 [inlined]
 [7] main() at v0.6/Whippet/bin/whippet-index.jl:90
 [8] include_from_node1(::String) at ./loading.jl:576
 [9] include(::String) at ./sysimg.jl:14
 [10] process_options(::Base.JLOptions) at ./client.jl:305
 [11] _start() at ./client.jl:371
while loading v0.6/Whippet/bin/whippet-index.jl, in expression starting on line 5
Adding a second, arbitrary gene to the GTF works around this crash.
- Using an unfiltered human GTF from GENCODE results in the following error:
...
Building Splice Graphs for chr8..
 71.185649 seconds (12.70 M allocations: 66.904 GiB, 10.71% gc time)
Building Splice Graphs for chr9..
 74.675011 seconds (12.69 M allocations: 70.337 GiB, 10.46% gc time)
ERROR: LoadError: MethodError: no method matching isvalid(::UInt8)
Closest candidates are:
  isvalid(!Matched::Type{String}, !Matched::Union{Array{UInt8,1}, String}) at strings/string.jl:123
  isvalid(!Matched::Type{Char}, !Matched::Unsigned) at strings/utf8proc.jl:16
  isvalid(!Matched::Type{Char}, !Matched::Integer) at strings/utf8proc.jl:17
  ...
Stacktrace:
 [1] encode_copy!(::BioSequences.BioSequence{BioSequences.DNAAlphabet{4}}, ::Int64, ::Array{UInt8,1}, ::Int64, ::Int64) at v0.6/BioSequences/src/bioseq/copying.jl:149
 [2] BioSequences.BioSequence{BioSequences.DNAAlphabet{4}}(::Array{UInt8,1}, ::Int64, ::Int64) at v0.6/BioSequences/src/bioseq/constructors.jl:28
 [3] #trans_index!#37(::Int64, ::Function, ::BioSequences.FASTA.Reader, ::Dict{String,Whippet.RefGene}) at v0.6/Whippet/src/index.jl:80
 [4] (::Whippet.#kw##trans_index!)(::Array{Any,1}, ::Whippet.#trans_index!, ::BioSequences.FASTA.Reader, ::Dict{String,Whippet.RefGene}) at ./<missing>:0
 [5] macro expansion at ./util.jl:237 [inlined]
 [6] #fasta_to_index#38(::Int64, ::Function, ::String, ::Dict{String,Whippet.RefGene}) at v0.6/Whippet/src/index.jl:122
 [7] (::Whippet.#kw##fasta_to_index)(::Array{Any,1}, ::Whippet.#fasta_to_index, ::String, ::Dict{String,Whippet.RefGene}) at ./<missing>:0
 [8] macro expansion at v0.6/Whippet/src/timer.jl:5 [inlined]
 [9] main() at v0.6/Whippet/bin/whippet-index.jl:90
 [10] include_from_node1(::String) at ./loading.jl:576
 [11] include(::String) at ./sysimg.jl:14
 [12] process_options(::Base.JLOptions) at ./client.jl:305
 [13] _start() at ./client.jl:371
while loading v0.6/Whippet/bin/whippet-index.jl, in expression starting on line 5
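For anyone hitting the same errors, a quick pre-flight check of the inputs might help narrow things down before running whippet-index.jl. The following is only a sketch in Python and is not part of Whippet; the function names are hypothetical. It counts distinct gene_id values in the GTF (relevant to the first crash), compares GTF chromosome names with FASTA sequence names (what the first error message literally complains about), and tallies FASTA characters outside the IUPAC DNA codes. That last check rests on an unverified guess that the second error is triggered by an unexpected byte in the FASTA, since the trace ends in encode_copy! for a DNA sequence.

```python
import re
from collections import Counter

# Matches the gene_id attribute in column 9 of a GTF line, e.g. gene_id "G1";
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')

# IUPAC DNA codes plus the gap character, upper and lower case.
# Assumption: characters outside this set are the ones worth inspecting;
# the exact set accepted by BioSequences is not confirmed here.
IUPAC_DNA = set("ACGTMRWSYKVHDBN-")
ALLOWED = IUPAC_DNA | {c.lower() for c in IUPAC_DNA}


def gtf_summary(gtf_lines):
    """Return (set of gene_ids, set of chromosome names) from GTF lines."""
    genes, chroms = set(), set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed 9-column GTF record
        chroms.add(fields[0])
        match = GENE_ID_RE.search(fields[8])
        if match:
            genes.add(match.group(1))
    return genes, chroms


def fasta_summary(fasta_lines):
    """Return (set of sequence names, Counter of unexpected characters)."""
    names, unexpected = set(), Counter()
    for line in fasta_lines:
        # Strip only "\n" so that stray "\r" characters are reported too.
        line = line.rstrip("\n")
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])  # name = first token of the header
        else:
            unexpected.update(c for c in line if c not in ALLOWED)
    return names, unexpected
```

Both functions accept any iterable of lines, so an open file handle can be passed directly, for example `gtf_summary(open("annotation.gtf"))` (the path is a placeholder). If `len(genes)` is 1, the first crash above is the likely outcome; if `chroms & names` is empty, the chromosome naming really does differ between the two files; a non-empty `unexpected` counter points at characters worth inspecting in the FASTA.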
Many thanks!
Hi @ju-mu, I just wanted to thank you for making a record of that first error message and its cause. I kept getting the same error and had no idea what was going on.
It's a shame that nobody has responded to the bugs you've mentioned.
Again, thanks very much!
-Andrei