docx
Exception thrown when calling to_html on file with internal hyperlinks
Describe the bug
An "undefined method `value' for nil:NilClass" error is thrown when calling to_html on a file with internal hyperlinks (hyperlinks to a bookmark or a heading within the file).
Backtrace:
docx (0.8.0) lib/docx/containers/text_run.rb:106:in `hyperlink_id'
docx (0.8.0) lib/docx/containers/text_run.rb:102:in `href'
docx (0.8.0) lib/docx/containers/text_run.rb:81:in `to_html'
docx (0.8.0) lib/docx/containers/paragraph.rb:48:in `block in to_html'
docx (0.8.0) lib/docx/containers/paragraph.rb:47:in `each'
docx (0.8.0) lib/docx/containers/paragraph.rb:47:in `to_html'
docx (0.8.0) lib/docx/document.rb:119:in `map'
docx (0.8.0) lib/docx/document.rb:119:in `to_html'
According to the linked reference, internal hyperlinks use the anchor attribute instead of the id attribute, which breaks line 106 in text_run.rb.
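To make the distinction concrete, here is a minimal sketch of a nil-safe lookup. It is not the gem's code: resolve_href is a hypothetical helper, the attribute hash stands in for the hyperlink node's attributes, and the relationships hash stands in for whatever table maps relationship ids to URLs. Mapping an anchor to a "#name" fragment is also an assumption about what the HTML output should be.

```ruby
# Hypothetical helper, NOT part of the docx gem's API.
# attrs:         attribute name => value for one hyperlink element
# relationships: relationship id => target URL (may be nil)
def resolve_href(attrs, relationships)
  if (rel_id = attrs['id'])
    # External link: the id points into the relationships table.
    (relationships || {})[rel_id]
  elsif (anchor = attrs['anchor'])
    # Internal link: no id is present, only a bookmark/heading name.
    # Assumption: render it as an in-page fragment.
    '#' + anchor
  end
  # Falls through to nil when neither attribute is present.
end

resolve_href({ 'id' => 'rId4' }, { 'rId4' => 'https://example.com' })
# => "https://example.com"
resolve_href({ 'anchor' => '_Toc123' }, {})
# => "#_Toc123"
```

The point of the sketch is only that the id attribute should be checked for presence before it is dereferenced, so that a hyperlink carrying just an anchor does not raise. It also tolerates a missing relationships table by treating it as empty.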
To Reproduce
Open a docx file with a hyperlink to either a heading or a bookmark in the same file and call to_html.
example
require 'docx'
doc = Docx::Document.new('/path/to/your/docx/file_with_internal_hyperlink.docx')
doc.to_html
Sample docx file
https://docs.google.com/document/d/1H01zgmdC2LHAAwXAhmm6RyEz-lwbZm6R/edit?usp=sharing&ouid=103282161859668866778&rtpof=true&sd=true
Expected behavior
No exception thrown; html gets returned as normal.
Environment
- Ruby version: 3.2.2
- docx gem version: 0.8.0
- OS: Alpine 3.17 docker container
Hi @satoryu. Any idea what could be happening here? I seem to be having a similar problem on any docx version bigger than 0.5.0. 0.5.0 and older versions just sanitize the hyperlinks and print the plain text.
I'm on Ruby 3.1.4, Ubuntu 20.04.
Backtrace:
undefined method `[]' for nil:NilClass
@document_properties[:hyperlinks][hyperlink_id]
                                 ^^^^^^^^^^^^^^
docx-0.6.0/lib/docx/containers/text_run.rb:100:in `href'
docx-0.6.0/lib/docx/containers/text_run.rb:79:in `to_html'
docx-0.6.0/lib/docx/containers/paragraph.rb:48:in `block in to_html'
docx-0.6.0/lib/docx/containers/paragraph.rb:47:in `each'
docx-0.6.0/lib/docx/containers/paragraph.rb:47:in `to_html'
@ycp3 @mateusg Thank you for your reports.
I've just found out the root cause: this gem does not support internal links. I would like to fix this issue but need time.
> I seem to be having a similar problem on any docx version bigger than 0.5.0. 0.5.0 and older versions just sanitize the hyperlinks and print the plain text.
Yes, that's right. Do you think that printing external links as sanitized text makes sense?