pdfcpu icon indicating copy to clipboard operation
pdfcpu copied to clipboard

api: Addwatermark concurrency

Open Chennnnnnnnnnnnnnnnnn opened this issue 2 years ago • 5 comments

An error was reported in 'Addwatermark' with concurrent scenarios:"fatal error: concurrent map writes"

position:/pkg/pdfcpu/createText.go:419 ttf.UsedGIDs[gid] = true

maybe use sync.map or how can i resolve it?

Chennnnnnnnnnnnnnnnnn avatar Jul 22 '22 06:07 Chennnnnnnnnnnnnnnnnn

Hello!

Please explain your usecase in order to reproduce this.

Thank you for using pdfcpu 💚

hhrutter avatar Jul 22 '22 07:07 hhrutter

Thanks for reply

It's a REST API for add PDF watermark, error can reproduce when concurrency more than one

if _, ok := font.UserFontMetrics["NSimSun"]; !ok {
        font.LoadUserFonts()
}

wm, err := pdfcpu.ParseTextWatermarkDetails(doc.WaterMarkStr, "sc:2 abs, op:.4, pos:c, points:30, rotation:45, fontname:NSimSun", onTop, 1)
	if err != nil {
		return outfile, err
	}

	inFile := bytes.NewReader(doc.File)
	outFile := bytes.NewBuffer(outfile)

	err = api.AddWatermarks(inFile, outFile, nil, wm, nil)
	if err != nil {
		return outfile, err
	}

Error info:

fatal error: concurrent map writes

Chennnnnnnnnnnnnnnnnn avatar Jul 22 '22 08:07 Chennnnnnnnnnnnnnnnnn

Please post the stacktrace you are getting for further analysis.

hhrutter avatar Jul 22 '22 20:07 hhrutter

Hi, got into the same problem (which is more general than just addwatermark). Maybe I can add a few more details:

metrics.go contains a global map // UserFontMetrics represents font metrics for TTF or OTF font files installed into UserFontDir. var UserFontMetrics = map[string]TTFLight{} this map is initialized on LoadUserFonts() with struct instances TTFLight. The TTFLight struct also contains a map called UsedGIDs which is read (ranged over) and written (ttf.UsedGIDs[gid] = true) in several places. This works fine in a single gorutine.

In a concurrent environment (like in the context of a handler as original poster has), the UserFontMetrics map is unique. That means that all TTFLight instances are shared between gorutines. As a consequence, it may happen that one of the gorutines wants to range over the UsedGIDs from TTFLight while another wants to set a value to it, hence the concurrent panic.

I can't see a way from the user space of this lib to go around this other than serialize completely the generation of pdf, which is not exactly a desirable thing to have in terms of performance.

Suggestion (with a big grain of salt, as I can't say I know the code): since the UsedGIDs map seems to be the only one written from the TTFLight structure (I might be wrong). It may be worth separating it from there and pass it in a context-like way so that it would be different for each gorutine. The requirement would then be that you cannot call different API methods in different gorutines for the same pdf you want to generate, but I think this would be acceptable.

Thanks for the lib, very useful :)

ami- avatar Oct 06 '22 10:10 ami-

Thanks for the input! 👍 Will be taken into consideration.

hhrutter avatar Oct 06 '22 19:10 hhrutter

This should be fixed with latest release.

hhrutter avatar May 06 '23 21:05 hhrutter

Isn't fully fixed yet. The UserFontMetrics is still being used without a lock.

fatal error: concurrent map writes

goroutine 29 [running]:
github.com/pdfcpu/pdfcpu/pkg/font.LoadUserFonts()
        /Users/user/go/pkg/mod/github.com/pdfcpu/[email protected]/pkg/font/metrics.go:247 +0x24a
github.com/pdfcpu/pdfcpu/pkg/pdfcpu/model.EnsureDefaultConfigAt({0xc005e1c060?, 0xc004f7d900?})
        /Users/user/go/pkg/mod/github.com/pdfcpu/[email protected]/pkg/pdfcpu/model/configuration.go:251 +0x150
github.com/pdfcpu/pdfcpu/pkg/pdfcpu/model.NewDefaultConfiguration()

semvis123 avatar Jul 12 '23 13:07 semvis123