Org.jl

Parsetree - StringIndexError: invalid index ...

Open KyokoTomato opened this issue 3 years ago • 2 comments

The parsetree function seems to have issues with Unicode text in paragraphs when a non-ASCII character falls at specific positions. This can be reproduced by running this code:

using Org

parsetree(org"* Test Heading
This is padding. ’ This line causes issues thanks to that quote."); # U+2019

Producing this error:

ERROR: StringIndexError: invalid index [35], valid nearby indices [33]=>'’', [36]=>' '
Stacktrace:
  [1] SubString
    @ ./strings/substring.jl:38 [inlined]
  [2] SubString
    @ ./strings/substring.jl:44 [inlined]
  [3] SubString
    @ ./strings/substring.jl:40 [inlined]
  [4] getindex
    @ ./strings/substring.jl:255 [inlined]
  [5] structuredesc(t::Org.TextPlain{SubString{String}})
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:93
  [6] parsetree(io::Base.TTY, component::Org.TextPlain{SubString{String}}, maxdepth::Int64, depth::Int64)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:63
  [7] _broadcast_getindex_evalf
    @ ./broadcast.jl:670 [inlined]
  [8] _broadcast_getindex
    @ ./broadcast.jl:643 [inlined]
  [9] getindex
    @ ./broadcast.jl:597 [inlined]
 [10] macro expansion
    @ ./broadcast.jl:961 [inlined]
 [11] macro expansion
    @ ./simdloop.jl:77 [inlined]
 [12] copyto!
    @ ./broadcast.jl:960 [inlined]
 [13] copyto!
    @ ./broadcast.jl:913 [inlined]
 [14] copy
    @ ./broadcast.jl:885 [inlined]
 [15] materialize
    @ ./broadcast.jl:860 [inlined]
 [16] printstructure(io::Base.TTY, component::String, description::Nothing, contents::Vector{Any}, maxdepth::Int64, depth::Int64, color::Symbol, bold::Bool)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:33
 [17] parsetree(io::Base.TTY, component::Org.Paragraph, maxdepth::Int64, depth::Int64)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:63
 [18] _broadcast_getindex_evalf(::typeof(parsetree), ::Base.TTY, ::Org.Paragraph, ::Int64, ::Int64)
    @ Base.Broadcast ./broadcast.jl:670
 [19] _broadcast_getindex
    @ ./broadcast.jl:643 [inlined]
 [20] getindex
    @ ./broadcast.jl:597 [inlined]
 [21] macro expansion
    @ ./broadcast.jl:961 [inlined]
 [22] macro expansion
    @ ./simdloop.jl:77 [inlined]
 [23] copyto!
    @ ./broadcast.jl:960 [inlined]
 [24] copyto!
    @ ./broadcast.jl:913 [inlined]
 [25] copy
    @ ./broadcast.jl:885 [inlined]
 [26] materialize
    @ ./broadcast.jl:860 [inlined]
 [27] printstructure(io::Base.TTY, component::String, description::Nothing, contents::Vector{Union{Nothing, Org.Element}}, maxdepth::Int64, depth::Int64, color::Symbol, bold::Bool)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:33
 [28] parsetree(io::Base.TTY, s::Org.Section, maxdepth::Int64, depth::Int64)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:44
 [29] _broadcast_getindex_evalf
    @ ./broadcast.jl:670 [inlined]
 [30] _broadcast_getindex
    @ ./broadcast.jl:643 [inlined]
 [31] getindex
    @ ./broadcast.jl:597 [inlined]
 [32] macro expansion
    @ ./broadcast.jl:961 [inlined]
 [33] macro expansion
    @ ./simdloop.jl:77 [inlined]
 [34] copyto!
    @ ./broadcast.jl:960 [inlined]
 [35] copyto!
    @ ./broadcast.jl:913 [inlined]
 [36] copy
    @ ./broadcast.jl:885 [inlined]
 [37] materialize
    @ ./broadcast.jl:860 [inlined]
 [38] printstructure(io::Base.TTY, component::String, description::String, contents::Vector{Org.Section}, maxdepth::Int64, depth::Int64, color::Symbol, bold::Bool)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:33
 [39] parsetree(io::Base.TTY, h::Org.Heading, maxdepth::Int64, depth::Int64)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:38
 [40] _broadcast_getindex_evalf
    @ ./broadcast.jl:670 [inlined]
 [41] _broadcast_getindex
    @ ./broadcast.jl:643 [inlined]
 [42] getindex
    @ ./broadcast.jl:597 [inlined]
 [43] macro expansion
    @ ./broadcast.jl:961 [inlined]
 [44] macro expansion
    @ ./simdloop.jl:77 [inlined]
 [45] copyto!
    @ ./broadcast.jl:960 [inlined]
 [46] copyto!
    @ ./broadcast.jl:913 [inlined]
 [47] copy
    @ ./broadcast.jl:885 [inlined]
 [48] materialize
    @ ./broadcast.jl:860 [inlined]
 [49] parsetree(io::Base.TTY, org::OrgDoc, maxdepth::Int64, depth::Int64)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:10
 [50] parsetree(io::Base.TTY, org::OrgDoc, maxdepth::Int64)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:5
 [51] parsetree(org::OrgDoc, maxdepth::Int64) (repeats 2 times)
    @ Org ~/.julia/packages/Org/1t15S/src/analysis/parsetree.jl:1

The position does matter: the error does not occur if the quote is shifted around a bit, and it seems to vary with the surrounding space characters. (The same happens with other Unicode characters; the right quote was just the character that caused issues in my main document.)
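
The position dependence is consistent with a string being sliced at a fixed byte offset, though the trace alone does not confirm that this is what parsetree.jl:93 does. Julia strings are indexed by byte, and only the first byte of a multi-byte character is a valid index. The sketch below uses the document from the report as a plain String (not an Org.jl object) to show why index 35 is rejected. `safe_prefix` is a hypothetical helper for illustration, not part of Org.jl.

```julia
# The document from the report, as a plain String.
doc = "* Test Heading\nThis is padding. ’ This line causes issues thanks to that quote."

# '’' (U+2019) takes three bytes in UTF-8, here bytes 33 to 35.
# Only the first byte of a character is a valid string index.
@show isvalid(doc, 33)   # first byte of '’'
@show isvalid(doc, 35)   # continuation byte of '’'

# Slicing up to a continuation byte throws a StringIndexError.
try
    doc[1:35]
catch err
    @show err isa StringIndexError
end

# Hypothetical helper (not part of Org.jl): return the prefix of `s` that
# ends with the character containing byte `n`. Assumes n >= 0.
function safe_prefix(s::AbstractString, n::Integer)
    stop = thisind(s, min(n, ncodeunits(s)))  # round down to a character start
    return s[1:stop]
end

@show safe_prefix(doc, 35)

# Alternative: count characters instead of bytes.
@show first(doc, 20)
```

Shifting the quote by a byte or two moves the cut onto a valid index, which would explain why the error comes and goes as the surrounding text changes.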

KyokoTomato • Jun 12 '22 02:06

I hate these problems. This was my first time writing a parser of any kind in any language; I should probably do this a second time with a slightly different approach at some point...

tecosaur • Jan 28 '23 16:01