sablon
Inline placeholder image causes other normal images to be replaced
When you have an inline placeholder image to be replaced (i.e. within image replacement fields), other non-placeholder images in the document after the placeholder image can end up being replaced along with the placeholder image.
For example:
Template:
Expected:
Actual:
Attached is a modified images_template.docx that demonstrates the issue as shown above: images_template.docx
What seems to be happening is that, when the placeholder image is inline with the tags, the `start_field` and `end_field` end up within the same `w:p` tag, so `start_node` and `end_node` are the same. When the `ImageBlock` tries to collect the `body` nodes, it pulls in the entire rest of the document, so `replace` ends up replacing the first image in every subsequent node.
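To make the failure mode concrete, here is a minimal sketch. It assumes the body is gathered by walking forward from the start node until the end node is reached, which is my reading of the behaviour and not a copy of Sablon's code. `collect_body` and the flat list of paragraph ids are hypothetical stand-ins.

```ruby
# Hypothetical model: the document body is a flat list of paragraph ids.
# This is an illustration only, not Sablon's implementation.
def collect_body(paragraphs, start_node, end_node)
  body = []
  index = paragraphs.index(start_node) + 1
  # Walk forward until end_node is found or the document runs out.
  while index < paragraphs.size && paragraphs[index] != end_node
    body << paragraphs[index]
    index += 1
  end
  body
end

paragraphs = %w[p1 p2 p3 p4]

# Start and end fields in different paragraphs: only p2 is collected.
collect_body(paragraphs, "p1", "p3") # => ["p2"]

# Start and end fields in the same paragraph: the walk never meets
# end_node, so everything after p2 is collected.
collect_body(paragraphs, "p2", "p2") # => ["p3", "p4"]
```

If the real walk behaves like this, the second call explains why images in later paragraphs get replaced.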
The solution seems to be to update the `body` method in blocks.rb to check for this condition:

    def body
      return [] if start_node == end_node
      # ... rest of the existing method unchanged
    end
However, this code is used by multiple other `Block` subclasses, and I'm not an expert in WordML, so I'm not certain whether it would cause problems for other blocks or situations.
Interesting, I thought I fixed the inline replacement problem in #131, but hopefully I'll have time later this week to look into it further.
Thanks. I just realized there's another case that my proposed fix doesn't address: when another inline image precedes the placeholder image in the same paragraph. `replace` searches the entire `w:p` tag containing the placeholder and matches the first image, even though it comes before the starting tag.
I think what's really needed is a more robust algorithm to walk the XML tree and pick only the nodes that are actually between the start and end nodes, in document order. I'm not sure exactly what that would look like yet, but I think some sort of modified depth-first traversal might do the trick.
This seems like it would be a common problem when parsing Office Open XML documents, so perhaps there is a well-known algorithm that can be reused.
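As a rough sketch of that idea: a pre-order depth-first walk visits nodes in document order, so a small state flag is enough to keep only what lies between the two field nodes. Everything here is hypothetical (`Node`, `each_in_document_order`, `images_between`). It uses a plain Ruby tree in place of real WordML and is not based on Sablon's or Nokogiri's API.

```ruby
# Hypothetical stand-in for an XML node; not Sablon's or Nokogiri's API.
class Node
  attr_reader :name, :label, :children

  def initialize(name, label = nil, children = [])
    @name = name
    @label = label
    @children = children
  end
end

# Pre-order depth-first walk, which visits nodes in document order.
def each_in_document_order(node, &block)
  block.call(node)
  node.children.each { |child| each_in_document_order(child, &block) }
end

# Collect only the images that appear after start_node and before end_node.
# The walk does not stop early; nodes after end_node are simply ignored.
def images_between(root, start_node, end_node)
  state = :before
  found = []
  each_in_document_order(root) do |node|
    if state == :before && node.equal?(start_node)
      state = :inside
    elsif state == :inside && node.equal?(end_node)
      state = :after
    elsif state == :inside && node.name == "image"
      found << node
    end
  end
  found
end

start_field = Node.new("field", "start")
end_field   = Node.new("field", "end")
document = Node.new("body", nil, [
  Node.new("p", nil, [
    Node.new("image", "earlier inline image"),
    start_field,
    Node.new("image", "placeholder"),
    end_field
  ]),
  Node.new("p", nil, [Node.new("image", "later image")])
])

images_between(document, start_field, end_field).map(&:label)
# => ["placeholder"]
```

In this toy tree both problem cases are avoided: the image before the start field and the image in the following paragraph are skipped. Whether the same approach fits the other `Block` subclasses is an open question.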
Template:
Expected:
Actual (with proposed fix):