Optimize for strings without multibyte characters
Optimizes finding the character offset for strings that include no multibyte characters.
Note: I'm no expert in string encodings, but my assumption is that if a string has as many bytes as it has characters, every character occupies exactly one byte, so the requested character offset must equal the supplied byte offset. This should hold for the majority of documentation written in English and encoded as UTF-8.
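The assumption can be sketched in a few lines of Ruby. This is a minimal illustration, not the actual patch: `char_offset` is a hypothetical standalone helper, and the slow path (counting the characters in the leading byte slice) is an assumed stand-in for the existing conversion.

```ruby
# Hypothetical helper for illustration only; not RDoc's method.
# Converts a byte offset into a character offset within +input+.
def char_offset(input, byte_offset)
  # Fast path: as many bytes as characters means every character is
  # one byte wide, so byte and character offsets are the same.
  return byte_offset if input.bytesize == input.length

  # Slow path (assumed): count the characters in the first
  # +byte_offset+ bytes of the string.
  input.byteslice(0, byte_offset).length
end

puts char_offset("hello world", 5)  # prints 5 (fast path)
puts char_offset("h\u00e9llo", 3)   # prints 2 ("é" is two bytes in UTF-8)
```

The fast path returns without slicing the input, which is where the saving would come from if char_pos is called many times on the same large string.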
Motivation: generating ri documentation for the gem crack-0.4.3 took 156.3 seconds on my 6th-gen i7, according to the rdoc output. Profiling the process with rubyspy showed that most of the time was being spent in RDoc::Markup::Parser#char_pos.
The output of rdoc with the original char_pos method:
~/.rvm/gems/ruby-2.6.3/gems/crack-0.4.3 $ time rdoc --ri
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
Parsing sources...
100% [22/22] test/xml_test.rb
Generating RI format into /home/sr/.rdoc...
Files: 22
Classes: 7 ( 6 undocumented)
Modules: 2 ( 2 undocumented)
Constants: 3 ( 2 undocumented)
Attributes: 1 ( 1 undocumented)
Methods: 11 (11 undocumented)
Total: 24 (22 undocumented)
8.33% documented
Elapsed: 156.3s
real 2m36.989s
user 2m35.967s
sys 0m0.217s
With this change, building ri documentation for the same gem takes ~2.3 seconds:
~/.rvm/gems/ruby-2.6.3/gems/crack-0.4.3 $ time rdoc --ri
Parsing sources...
100% [22/22] test/xml_test.rb
Generating RI format into /home/sr/.rdoc...
Files: 22
Classes: 7 ( 6 undocumented)
Modules: 2 ( 2 undocumented)
Constants: 3 ( 2 undocumented)
Attributes: 1 ( 1 undocumented)
Methods: 11 (11 undocumented)
Total: 24 (22 undocumented)
8.33% documented
Elapsed: 2.3s
real 0m2.798s
user 0m2.661s
sys 0m0.130s