Where is isSameSentence()?
I have the same question too!
I'm not sure what the intended behaviour is, but this worked for me to some extent (although I ended up manually parsing the text output of GetTextByRow instead):
func isSameSentence(text pdf.Text, lastTextStyle pdf.Text) bool {
	return (text.Font == lastTextStyle.Font) && (text.FontSize == lastTextStyle.FontSize) && (text.X == lastTextStyle.X)
}
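For future visitors: the isSameSentence above isn't quite on the mark. Because it also compares X, and consecutive characters rarely share the same x coordinate, that definition ends up printing the font, font size, and x and y coordinates of each character of text in the PDF separately.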
It might be more useful to say that something belongs to the same sentence if it has the same font and font size, in which case the function definition you'd want would be:
func isSameSentence(text pdf.Text, lastTextStyle pdf.Text) bool {
	return (text.Font == lastTextStyle.Font) && (text.FontSize == lastTextStyle.FontSize)
}
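That still isn't really true to the name "sameSentence", so you may also want to check whether a period is already present in the accumulated text before returning true and appending the character to the text that gets printed alongside its text style. lastTextStyle is a pdf.Text, so the check goes against its S field (the string content) and needs the strings package; a period in the text so far means the sentence has ended, so the result is negated:
// Requires: import "strings"
func isSameSentence(text pdf.Text, lastTextStyle pdf.Text) bool {
	// Same font and size, and the text collected so far has not reached a period yet.
	return (text.Font == lastTextStyle.Font) && (text.FontSize == lastTextStyle.FontSize) && !strings.Contains(lastTextStyle.S, ".")
}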
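For anyone who wants to try the idea without a PDF at hand, here is a rough, self-contained sketch of how a period-aware check could group characters. The Text struct is a stand-in I made up with the same field names I'm assuming from pdf.Text (Font, FontSize, X, Y, S), and groupSentences is a hypothetical helper, not something the library provides. It treats a period anywhere in the accumulated text as the end of a sentence, so it will also split on things like "3.14".

```go
package main

import (
	"fmt"
	"strings"
)

// Text is a stand-in for pdf.Text with only the fields used here
// (an assumption, so the example runs without the library).
type Text struct {
	Font     string
	FontSize float64
	X, Y     float64
	S        string
}

// isSameSentence reports whether text continues the run accumulated in last:
// same font and size, and the run has not reached a period yet.
func isSameSentence(text, last Text) bool {
	return text.Font == last.Font &&
		text.FontSize == last.FontSize &&
		!strings.Contains(last.S, ".")
}

// groupSentences (hypothetical helper) merges consecutive fragments
// into runs, starting a new run whenever isSameSentence returns false.
func groupSentences(texts []Text) []Text {
	var runs []Text
	for _, t := range texts {
		if n := len(runs); n > 0 && isSameSentence(t, runs[n-1]) {
			runs[n-1].S += t.S
			continue
		}
		runs = append(runs, t)
	}
	return runs
}

func main() {
	// One entry per character, the way the per-character output above looks.
	texts := []Text{
		{Font: "Helvetica", FontSize: 12, X: 10, S: "H"},
		{Font: "Helvetica", FontSize: 12, X: 17, S: "i"},
		{Font: "Helvetica", FontSize: 12, X: 20, S: "."},
		{Font: "Helvetica", FontSize: 12, X: 26, S: "O"},
		{Font: "Helvetica", FontSize: 12, X: 34, S: "k"},
		{Font: "Times", FontSize: 9, X: 10, S: "x"},
	}
	for _, r := range groupSentences(texts) {
		fmt.Printf("Font: %s, Font-size: %.0f, content: %q\n", r.Font, r.FontSize, r.S)
	}
}
```

With the sample input this yields three runs: "Hi." (closed by the period), "Ok", and "x" (split off by the font change). Swapping strings.Contains for strings.HasSuffix would only look at the last character, which may suit some documents better.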