Optimize for strings without multibyte characters

Open sriedel opened this issue 6 years ago • 0 comments

Optimizes finding the character offset for strings that include no multibyte characters.

Note: I'm no expert in string encodings, but my naive assumption is that if a string has as many bytes as characters, every character is exactly one byte wide, so the requested character offset must equal the supplied byte offset. This assumption should hold for the majority of documentation written in English with UTF-8 encoding.
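The idea in the note above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual RDoc source: `char_offset` and its `input` parameter are hypothetical names, and the slow path shown here (slicing the prefix by bytes and counting its characters) is just one plausible way to do the conversion.

```ruby
# Hypothetical helper illustrating the fast path; this is NOT the
# actual RDoc::Markup::Parser#char_pos implementation.
def char_offset(input, byte_offset)
  # Fast path: if the string has as many bytes as characters, every
  # character is one byte wide, so byte and character offsets coincide.
  return byte_offset if input.bytesize == input.length

  # Slow path (assumed for illustration): take the first byte_offset
  # bytes and count how many characters they contain.
  input.byteslice(0, byte_offset).length
end

char_offset("hello", 3)       # => 3 (single-byte characters only)
char_offset("h\u00e9llo", 3)  # => 2 ("é" occupies two bytes in UTF-8)
```

The fast path never allocates a substring, which is presumably where the savings come from when the method is called many times on a large, single-byte-only input.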

Motivation: generating ri documentation for the gem crack-0.4.3 took 156.3 seconds on my 6th-generation i7, according to the rdoc output. Profiling the process with rubyspy showed that most of the time was being spent in RDoc::Markup::Parser#char_pos.

The output of rdoc with the original char_pos method:

~/.rvm/gems/ruby-2.6.3/gems/crack-0.4.3 $ time rdoc --ri
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
Parsing sources...
100% [22/22]  test/xml_test.rb

Generating RI format into /home/sr/.rdoc...

  Files:      22

  Classes:     7 ( 6 undocumented)
  Modules:     2 ( 2 undocumented)
  Constants:   3 ( 2 undocumented)
  Attributes:  1 ( 1 undocumented)
  Methods:    11 (11 undocumented)

  Total:      24 (22 undocumented)
    8.33% documented

  Elapsed: 156.3s

 
real	2m36.989s
user	2m35.967s
sys	0m0.217s

With this change, building ri documentation for the above-mentioned gem takes ~2.3 seconds according to the rdoc output:

~/.rvm/gems/ruby-2.6.3/gems/crack-0.4.3 $ time rdoc --ri 
Parsing sources...
100% [22/22]  test/xml_test.rb

Generating RI format into /home/sr/.rdoc...

  Files:      22

  Classes:     7 ( 6 undocumented)
  Modules:     2 ( 2 undocumented)
  Constants:   3 ( 2 undocumented)
  Attributes:  1 ( 1 undocumented)
  Methods:    11 (11 undocumented)

  Total:      24 (22 undocumented)
    8.33% documented

  Elapsed: 2.3s


real	0m2.798s
user	0m2.661s
sys	0m0.130s

sriedel, Jun 16 '19 07:06