gofpdf
Combined words after WriteAligned merge
The example file Fpdf_SVGBasicWrite.pdf, generated by ExampleFpdf_SVGBasicWrite() in fpdf_test.go, shows some words that should be separated by a space but are not. With compression turned off, it is evident that the spaces have been deleted from the written text, so this is not a rendering issue. The example generated from the checkout "Merge pull request #44" is correct; the example generated from the next checkout, "Fpdf: add WriteAligned", is not.
@jelmersnoeck, I have not looked into this. Can you see what might be going on?
EDIT: Scratch this, I overlooked the "without any".
Could you take a screenshot of both examples? I've run the examples on 02db05c2c, 04e0bd700 and master, and they all look the same.
I haven't dug deeper into it yet (running file comparisons), but am I correct in assuming you're seeing different output?
I'll look into comparing them later on (with the provided compare functionality in your compare-reference branch).
Compression on:
Compression off:
Oh, I actually just noticed it: the "without any". Will have a look.
This is output from what SplitLines gives back as an array of bytes (converted to a string):
It looks like the line lengths vary here, and the combined words are the end of one line joined to the start of the next. Will see why this happens.
Edit: SplitLines has no way of knowing how to deal with HTML tags. I will have a look at how this can be solved.
Yes, you're right that the culprit is clearly SplitLines() -- it is breaking on (and consuming) HTML tags, and the two missing spaces in the text correspond to the ends of the lines that the method returns.
Turning compression off just allowed me to see the internal text in the generated PDF files. Sorry if that complicated the problem statement.
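To make the symptom concrete, here is a minimal sketch of how a splitter that consumes the space at each break point can glue words together when its lines are written back to back in an inline flow. This is only an illustration: `splitLines`, `joinNaive` and `joinWithSpace` are hypothetical stand-ins that measure width in characters, not gofpdf's SplitLines or WriteAligned, and the sample sentence is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines is a simplified, hypothetical stand-in for a width-based line
// splitter. It is NOT gofpdf's SplitLines: it measures width in characters
// and drops the space at which it breaks a line.
func splitLines(txt string, maxChars int) []string {
	var lines []string
	cur := ""
	for _, word := range strings.Fields(txt) {
		switch {
		case cur == "":
			cur = word
		case len(cur)+1+len(word) <= maxChars:
			cur += " " + word
		default:
			// The separating space is consumed here.
			lines = append(lines, cur)
			cur = word
		}
	}
	if cur != "" {
		lines = append(lines, cur)
	}
	return lines
}

// joinNaive writes the lines back to back, as an inline writer would if it
// did not re-insert the consumed space.
func joinNaive(lines []string) string { return strings.Join(lines, "") }

// joinWithSpace restores the separator between consecutive lines.
func joinWithSpace(lines []string) string { return strings.Join(lines, " ") }

func main() {
	lines := splitLines("text can be written without any markup", 20)
	fmt.Printf("%q\n", lines)
	fmt.Println(joinNaive(lines))     // the last word of line 1 runs into line 2
	fmt.Println(joinWithSpace(lines)) // spacing preserved
}
```

The point of the sketch is only that the lost character sits exactly at each line boundary, which matches the observation that the missing spaces correspond to the ends of the returned lines.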
No worries. I will have a look if there's a way around determining the length with HTML.
I've thought of two things for this. One is a simple (partial) solution, the other a bit more complex.
1. In HTMLBasic, we should only use WriteAligned for <center> tags. There are no alignment tags in HTML otherwise, so everything else would always be left-aligned, which is done with the Write method. The downside of this shows up when we start mixing tags within tags. For example, <center>A <a href="http://github.com">link</a> to GitHub</center> would not be properly aligned.
2. Same principle as above, but we add some more functionality to the center "manager". First we'll need to add a method GetHTMLWidth, which basically does the same as the HTMLBasic function. It parses the string from within the <center> tag with all its HTML in it. We keep an internal width counter until all tags are parsed, sum the widths from all tags, and that's the total HTML width. We can then break up a sentence if it's too long (based on the available width). This needs to be smart, though, as we need to think about <b> and <i> tags etc. There is also the case of a <br> tag within the <center> tag.
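As a rough sketch of the second option, the width of a <center> block could be computed by walking its contents, skipping the tags, and summing the width of the visible text, with bold text measured differently. Everything here is an assumption for illustration: `htmlWidth`, `charWidth` and `centerOffset` are hypothetical names, the per-character widths are invented, and a real GetHTMLWidth would ask the PDF library for string widths in the current font. Line breaking and <br> handling are left out.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// charWidth is a hypothetical per-character width. The values are invented;
// a real implementation would measure text in the active font and style.
func charWidth(bold bool) float64 {
	if bold {
		return 1.2
	}
	return 1.0
}

// htmlWidth sums the width of the visible text in s. Tags contribute no
// width; <b> and </b> toggle the bold state used for measuring.
func htmlWidth(s string) float64 {
	var width float64
	bold := false
	for len(s) > 0 {
		if s[0] == '<' {
			end := strings.IndexByte(s, '>')
			if end < 0 {
				// Unterminated tag: treat the remainder as plain text.
				width += float64(utf8.RuneCountInString(s)) * charWidth(bold)
				break
			}
			switch strings.ToLower(s[1:end]) {
			case "b":
				bold = true
			case "/b":
				bold = false
			}
			s = s[end+1:]
			continue
		}
		next := strings.IndexByte(s, '<')
		if next < 0 {
			next = len(s)
		}
		width += float64(utf8.RuneCountInString(s[:next])) * charWidth(bold)
		s = s[next:]
	}
	return width
}

// centerOffset returns the left offset that centers content of the given
// width inside the available width.
func centerOffset(available, width float64) float64 {
	return (available - width) / 2
}

func main() {
	inner := `A <a href="http://github.com">link</a> to GitHub`
	w := htmlWidth(inner)
	fmt.Println("width:", w, "offset:", centerOffset(40, w))
}
```

Measuring the whole block first is what lets mixed content such as a link inside <center> be centered as one unit rather than piece by piece.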
I have already done 1) as it is fairly simple. I will have a look at how to implement step 2). https://github.com/jung-kurt/gofpdf/compare/master...jelmersnoeck:html-center-spaces?expand=1
> I have already done 1) as it is fairly simple.
Excellent. I'll merge this to correct the existing issues and let you determine whether it is worth it to pursue the more involved solution.
Ok cool. I will have a look at this throughout the week.