go-weasyprint
Fix font handling
I never used to have to deal with fonts in python, so not sure why I'm being forced to define all this stuff I don't want to deal with here.
fs, err := fc.LoadFontsetFile(fontmapCache)
fontconfig := text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
All I want is a simple html to pdf here, from string to file.
err := pdf.HtmlToPdf(os.Stdout, utils.InputString(html), fs)
Your snippet is almost correct: just pass fontconfig instead of fs to HtmlToPdf.
The Python implementation uses C dependencies to handle fonts. This module uses a pure Go implementation, which stores font information in an on-disk cache. We have chosen to expose the path to that font cache.
I'm working towards enabling go-text as a replacement for the text engine, so that the FontConfiguration creation will slightly change in the future. The reference to fcfonts.NewFontMap and fc.Standard will not be needed anymore.
I don't have the fontmapCache file present so I can't get this example to work currently. Copying the test file exactly gives:
var fontconfig text.FontConfiguration
const fontmapCache = "pdf/test/cache.fc"
fs, _ := fc.LoadFontsetFile(fontmapCache)
fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
err := goweasyprint.HtmlToPdf(os.Stdout, utils.InputString(html), fontconfig)
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xff864a]
goroutine 1 [running]:
github.com/benoitkugler/textprocessing/pango.(*GlyphString).fallbackShape(0xc0008157c0, {0xc00013a0b0, 0x2c, 0x2c}, 0xc000694ff0)
/home/User/go/pkg/mod/github.com/benoitkugler/[email protected]/pango/glyphs.go:213 +0x14a
See the file pdf/draw_test.go and this snippet:
// this command has to run once
fmt.Println("Scanning fonts...")
_, err := fc.ScanAndCache(fontmapCache)
if err != nil {
log.Fatal(err)
}
fs, err := fc.LoadFontsetFile(fontmapCache)
if err != nil {
log.Fatal(err)
}
fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
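Since the scan only has to run once, one way to arrange that is to skip it whenever the cache file already exists. This is only a sketch with the standard library: ensureCache and demo are made-up names, and the scan callback is a hypothetical stand-in for a call like fc.ScanAndCache (which, going by the snippet above, also returns a fontset, so a small wrapper would be needed).

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// ensureCache runs scan only when nothing exists at path yet.
// scan is a hypothetical stand-in for a font scan such as fc.ScanAndCache.
// It reports whether the scan was actually run.
func ensureCache(path string, scan func(path string) error) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return false, nil // cache already present, skip the scan
	}
	if !errors.Is(err, os.ErrNotExist) {
		return false, err // some other problem (permissions, ...)
	}
	if err := scan(path); err != nil {
		return false, fmt.Errorf("scanning fonts: %w", err)
	}
	return true, nil
}

// demo calls ensureCache twice in a temporary directory, using a fake
// scan that only writes a placeholder file.
func demo() (first, second bool, err error) {
	dir, err := os.MkdirTemp("", "fontcache")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	fakeScan := func(path string) error {
		return os.WriteFile(path, []byte("placeholder"), 0o644)
	}
	path := filepath.Join(dir, "cache.fc")
	if first, err = ensureCache(path, fakeScan); err != nil {
		return false, false, err
	}
	second, err = ensureCache(path, fakeScan)
	return first, second, err
}

func main() {
	fmt.Println(demo())
}
```

The trade-off is that a cache reused this way will not notice fonts installed after the first scan; deleting the file forces a rescan.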
2024/07/09 09:44:17 invalid font dir /usr/share/texmf/fonts/opentype/public stat /usr/share/texmf/fonts/opentype/public: no such file or directory
Ye this font stuff is just really not working for me. Perhaps I just wait for the replacement of these parts ;p I got go-wkhtmltopdf working for now, but I'll come back to this one if I can ever get it working.
The error message is just a warning; it shouldn't be fatal. What is the error returned by fc.ScanAndCache?
Full example
func main() {
html := ""
var fontconfig text.FontConfiguration
const fontmapCache = "pdf/test/cache.fc"
fmt.Println("Scanning fonts...")
_, err := fc.ScanAndCache(fontmapCache)
if err != nil {
log.Fatal(err)
}
fs, err := fc.LoadFontsetFile(fontmapCache)
if err != nil {
log.Fatal(err)
}
fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
err = goweasyprint.HtmlToPdf(os.Stdout, utils.InputString(html), fontconfig)
}
Scanning fonts...
2024/07/09 21:59:38 invalid font dir /usr/share/texmf/fonts/opentype/public stat /usr/share/texmf/fonts/opentype/public: no such file or directory
2024/07/09 21:59:39 open pdf/test/cache.fc: no such file or directory
exit status 1
Thank you for the full example. There is something strange though: only one fatal log should appear, since log.Fatal exits the program.
Could you be even more specific and print all the errors? (That is, add fmt.Println(err).)
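To make the difference between the two styles concrete, here is a small self-contained sketch. It does not call the library: step, runAll and demoSteps are made-up names, and the error text is only illustrative. Stopping at the first failure models what log.Fatal does (it logs, then exits the program), while the other mode models printing each error and carrying on.

```go
package main

import (
	"errors"
	"fmt"
)

// step is a named unit of work, standing in for calls such as
// scanning fonts, loading the cache, or rendering.
type step struct {
	name string
	run  func() error
}

// runAll executes the steps in order and collects failure messages.
// With stopOnError it returns after the first failure, which models
// log.Fatal ending the program. Without it, every step runs.
func runAll(steps []step, stopOnError bool) []string {
	var failures []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			failures = append(failures, s.name+": "+err.Error())
			if stopOnError {
				break
			}
		}
	}
	return failures
}

// demoSteps returns three fake steps: two that fail and one that succeeds.
func demoSteps() []step {
	missing := errors.New("open cache.fc: no such file or directory")
	return []step{
		{"scan", func() error { return missing }},
		{"load", func() error { return missing }},
		{"render", func() error { return nil }},
	}
}

func main() {
	fmt.Println(runAll(demoSteps(), true))  // stop at the first failure
	fmt.Println(runAll(demoSteps(), false)) // report and continue
}
```

In the continue-on-error style the later steps still run with whatever the failed steps left behind, so the first reported error is usually the one to look at.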
That was with empty html string, this link has a sample page in it. https://pastecode.dev/s/twxqkbfe
Scanning fonts...
2024/07/10 00:42:37 invalid font dir /usr/share/texmf/fonts/opentype/public stat /usr/share/texmf/fonts/opentype/public: no such file or directory
open pdf/test/cache.fc: no such file or directory
loading font set: open pdf/test/cache.fc: no such file or directory
webrender.progress: 2024/07/10 00:42:40 Step 1 - Fetching and parsing HTML
webrender.progress: 2024/07/10 00:42:40 Step 3 - Applying CSS - 1 sheet(s)
webrender.progress: 2024/07/10 00:42:40 Step 4 - Creating formatting structure
webrender.progress: 2024/07/10 00:42:40 Step 5 - Creating layout - Page 1
webrender.progress: 2024/07/10 00:42:40 Step 6 - Drawing pages
webrender.progress: 2024/07/10 00:42:40 Step 7 - Adding PDF metadata
%PDF-1.7
%����
4 0 obj
<</DecodeParms [ null ] /Filter [/FlateDecode] /Length 69 >>
stream
���� C��W��sfX��5���oʲ~SV=sT����dY{ɲ�%4!��4M�|����
endstream
endobj
3 0 obj
<<
/Type/Page
/Parent 2 0 R
/MediaBox [0 0 595.27563 841.88983]
/BleedBox [0 0 595.27563 841.88983]
/TrimBox [0 0 595.27563 841.88983]
/Contents [4 0 R]
>>
endobj
2 0 obj
<</Type/Pages/Count 1/Kids [3 0 R]>>
endobj
1 0 obj
<<
/Type/Catalog
/Pages 2 0 R
>>
endobj
5 0 obj
<<
/Producer (Go-WebRender 0.59)
>>
endobj
xref
0 6
0000000000 65535 f
0000000401 00000 n
0000000349 00000 n
0000000178 00000 n
0000000015 00000 n
0000000449 00000 n
trailer
<<
/Size 6
/Root 1 0 R
/Info 5 0 R
>>
startxref
500
Could you add the exact Go sample you use? I still don't get why the program does not exit at the first log.Fatal.
package main
import (
"fmt"
"os"
goweasyprint "github.com/benoitkugler/go-weasyprint"
fc "github.com/benoitkugler/textprocessing/fontconfig"
"github.com/benoitkugler/textprocessing/pango/fcfonts"
"github.com/benoitkugler/webrender/text"
"github.com/benoitkugler/webrender/utils"
)
func main() {
html := `<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>My Website</title>
</head>
<body>
<main>
<h1>Welcome to My Website</h1>
</main>
</body>
</html>
`
var fontconfig text.FontConfiguration
const fontmapCache = "pdf/test/cache.fc"
fmt.Println("Scanning fonts...")
_, err := fc.ScanAndCache(fontmapCache)
if err != nil {
fmt.Println(err.Error())
}
fs, err := fc.LoadFontsetFile(fontmapCache)
if err != nil {
fmt.Println(err.Error())
}
fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
err = goweasyprint.HtmlToPdf(os.Stdout, utils.InputString(html), fontconfig)
if err != nil {
fmt.Println(err.Error())
}
}
The issue is here:
_, err := fc.ScanAndCache(fontmapCache)
if err != nil {
fmt.Println(err.Error())
}
I think the directories in the font cache path don't exist on your machine. The path is defined as
const fontmapCache = "pdf/test/cache.fc"
Could you change this constant to something like <a directory I own>/cache.fc, or maybe simply cache.fc? Thank you.
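If you do want to keep a nested path, another option is to create the parent directory before the cache is written. The sketch below assumes the "no such file or directory" error comes from the missing pdf/test directory, which the log suggests but which I haven't verified against the library. ensureParentDir and demo are made-up helper names, and the file write stands in for the real font scan.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureParentDir creates the directory that will hold path, if missing.
// For a bare file name such as "cache.fc" the parent is "." and nothing changes.
func ensureParentDir(path string) error {
	return os.MkdirAll(filepath.Dir(path), 0o755)
}

// demo tries to write a cache file under a nested directory, once before
// and once after calling ensureParentDir, and returns both results.
func demo() (before, after error) {
	root, err := os.MkdirTemp("", "fontcache")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(root)

	path := filepath.Join(root, "pdf", "test", "cache.fc")
	before = os.WriteFile(path, nil, 0o644) // fails: parent directory is missing
	if err := ensureParentDir(path); err != nil {
		return before, err
	}
	after = os.WriteFile(path, nil, 0o644) // succeeds: parent directory now exists
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println("before:", before)
	fmt.Println("after:", after)
}
```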
~~but i dont have that file, and there would be no reason to given it was never explained in any doc anywhere?~~ nvm it might be working, ima test it at work in the morning.
Alrighty it wrote my file, but didn't process the inline css inside the string like wkhtml does.
// weasyprint
var fontconfig text.FontConfiguration
const fontmapCache = "cache.fc"
fmt.Println("Scanning fonts...")
_, err = fc.ScanAndCache(fontmapCache)
if err != nil {
return err
}
fs, err := fc.LoadFontsetFile(fontmapCache)
if err != nil {
return err
}
fontconfig = text.NewFontConfigurationPango(fcfonts.NewFontMap(fc.Standard.Copy(), fs))
file, err := os.Create(filename)
if err != nil {
return err
}
err = goweasyprint.HtmlToPdf(file, utils.InputString(buf.String()), fontconfig)
if err != nil {
return err
}
// wkhtml
pdfg, err := wkhtmltopdf.NewPDFGenerator()
if err != nil {
log.Fatal(err)
}
pdfg.AddPage(wkhtmltopdf.NewPageReader(strings.NewReader(buf.String())))
err = pdfg.Create()
if err != nil {
log.Fatal(err)
}
err = pdfg.WriteFile(filename)
if err != nil {
log.Fatal(err)
}
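A side note on the weasyprint half: as posted, it creates the file but doesn't show it being closed. That is probably unrelated to the CSS question, but a helper along these lines keeps it tidy. It is only a sketch: renderToFile and demo are made-up names, and the render callback is a hypothetical stand-in for a call such as goweasyprint.HtmlToPdf(w, utils.InputString(html), fontconfig).

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// renderToFile creates filename, passes it to render, and always closes it.
// A Close error is reported only if render itself succeeded.
// render is a hypothetical stand-in for the real HTML-to-PDF call.
func renderToFile(filename string, render func(w io.Writer) error) (err error) {
	f, err := os.Create(filename)
	if err != nil {
		return err
	}
	defer func() {
		if cerr := f.Close(); err == nil {
			err = cerr
		}
	}()
	return render(f)
}

// demo writes a placeholder payload to a temporary file and returns its size.
func demo() (int64, error) {
	dir, err := os.MkdirTemp("", "pdfdemo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	name := filepath.Join(dir, "out.pdf")
	err = renderToFile(name, func(w io.Writer) error {
		_, werr := io.WriteString(w, "%PDF-placeholder")
		return werr
	})
	if err != nil {
		return 0, err
	}
	info, err := os.Stat(name)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	size, err := demo()
	fmt.Println(size, err)
}
```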
Can you post the exact html string you use? I didn't grasp which CSS you are referring to.
https://paste.ofcode.org/iCum4BQTjKeWcQkhexMVJp
Thank you. Which CSS is not processed by GoWeasyprint?
PDF result: AU_PRD_RITM17270697.pdf
Wkhtml, from the same string, generates the correct coloring on each cell, and the content. It just doesn't load properly for some reason.