Memory Leak?

Open FishZe opened this issue 2 years ago • 15 comments

I was generating some word clouds and my memory usage rose very quickly. I think it is a memory leak, and I found #10. Does that problem still exist?

This is my pprof output:

      flat  flat%   sum%        cum   cum%
         0     0% 0.0054% 34775.93MB 88.79%  github.com/psykhi/wordclouds.(*Wordcloud).Draw
         0     0% 0.0054% 34775.93MB 88.79%  github.com/psykhi/wordclouds.(*Wordcloud).Place
         0     0% 0.0054% 34702.48MB 88.60%  github.com/fogleman/gg.LoadFontFace
         0     0% 0.0054% 34702.48MB 88.60%  github.com/psykhi/wordclouds.(*Wordcloud).setFont
    0.51MB 0.0013% 0.0067% 32529.44MB 83.05%  github.com/golang/freetype/truetype.NewFace
32528.93MB 83.05% 83.06% 32528.93MB 83.05%  image.NewAlpha
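
For anyone who wants to collect a comparable profile, here is a minimal sketch using only the standard library. The helper name `heapProfileSize` is made up for illustration; a long-running server would more likely write the profile to a file or expose `net/http/pprof`.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// heapProfileSize (hypothetical helper) captures a heap profile into memory
// and returns its size in bytes.
func heapProfileSize() (int, error) {
	// Run a collection first so the profile reflects up-to-date statistics.
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	n, err := heapProfileSize()
	if err != nil {
		fmt.Println("heap profile failed:", err)
		return
	}
	fmt.Printf("captured %d bytes of heap profile\n", n)
}
```

When reading such a profile with `go tool pprof`, note that `-sample_index=inuse_space` shows memory that is still held, while `-sample_index=alloc_space` shows cumulative allocations since the program started. Only the first one says something about a leak.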

Thanks a lot.

FishZe avatar Jan 16 '23 10:01 FishZe

I ran into this problem too, but I found that after a while the GC would reclaim some of the memory

jizizr avatar Jan 17 '23 05:01 jizizr

Yeah, the GC does reclaim some of the memory, but much of it remains unreleased

Perhaps my font is too large: memory usage can rise to 32 GB or higher when generating a word cloud.
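
A rough way to see why a large font size could be expensive: `image.NewAlpha` allocates one byte per pixel, so any per-face cache of glyph-sized masks grows with the square of the font size. The sketch below only does that arithmetic. The glyph box (1024x1024) and the number of cached masks (2048) are assumptions chosen for illustration, not values taken from this thread or confirmed against the font libraries.

```go
package main

import (
	"fmt"
	"image"
)

// alphaBytes returns the size of the pixel buffer image.NewAlpha allocates
// for a w x h image (one byte per pixel).
func alphaBytes(w, h int) int {
	return len(image.NewAlpha(image.Rect(0, 0, w, h)).Pix)
}

// estimateCacheBytes is a back-of-the-envelope estimate for a cache that
// holds `entries` alpha masks of glyphW x glyphH pixels each.
// All three inputs are hypothetical.
func estimateCacheBytes(glyphW, glyphH, entries int) int64 {
	return int64(glyphW) * int64(glyphH) * int64(entries)
}

func main() {
	fmt.Println("64x64 alpha mask bytes:", alphaBytes(64, 64))

	// Assumed numbers: 1024x1024 glyph box, 2048 cached masks per face.
	gib := float64(estimateCacheBytes(1024, 1024, 2048)) / (1 << 30)
	fmt.Printf("estimated cache per face: %.1f GiB\n", gib)
}
```

Under those assumptions a single face at size 1024 would already need about 2 GiB, so loading a face for several large sizes could plausibly add up to the numbers reported here.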

But after that, memory usage may still remain at 16 GB or higher and is not reclaimed.
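
To tell a real leak apart from memory that the GC has freed but the runtime has not yet returned to the OS, it can help to look at `runtime.MemStats` instead of the process RSS. A minimal sketch follows; the `heapSnapshot` type and the 64 MiB test allocation are made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapSnapshot holds a few runtime.MemStats fields, converted to MiB.
type heapSnapshot struct {
	AllocMB    float64 // HeapAlloc: allocated heap objects
	IdleMB     float64 // HeapIdle: spans with no objects, may be returned to the OS
	ReleasedMB float64 // HeapReleased: memory already returned to the OS
}

func snapshot() heapSnapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	const mb = 1 << 20
	return heapSnapshot{
		AllocMB:    float64(m.HeapAlloc) / mb,
		IdleMB:     float64(m.HeapIdle) / mb,
		ReleasedMB: float64(m.HeapReleased) / mb,
	}
}

// sink keeps the test allocation reachable until we drop it on purpose.
var sink []byte

// allocateThenCollect allocates 64 MiB, drops it, forces a GC, and returns
// the heap statistics from before and after.
func allocateThenCollect() (before, after heapSnapshot) {
	sink = make([]byte, 64<<20)
	before = snapshot()
	sink = nil
	runtime.GC()
	after = snapshot()
	return before, after
}

func main() {
	before, after := allocateThenCollect()
	fmt.Printf("before: %+v\n", before)
	fmt.Printf("after:  %+v\n", after)
}
```

If `HeapAlloc` drops after a forced GC while the process RSS stays high, the memory is idle heap that has not been released yet, not leaked objects.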

FishZe avatar Jan 17 '23 05:01 FishZe

I also ran into it with some specific fonts. All I could do was switch to another font to reduce memory usage. I do hope the author will handle this problem.

jizizr avatar Jan 17 '23 05:01 jizizr

I notice that #10 also mentions the memory leak problem.

Is there still a goroutine leak?

FishZe avatar Jan 17 '23 06:01 FishZe

Hello! It would be great to have instructions to reproduce this.

psykhi avatar Jan 21 '23 11:01 psykhi

Sorry for the late reply.

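// Imports needed by the snippet below.
import (
	"fmt"
	"image/color"

	"github.com/psykhi/wordclouds"
)

type WordCloudConf struct {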
	FontMaxSize     int    `yaml:"font_max_size"`
	FontMinSize     int    `yaml:"font_min_size"`
	RandomPlacement bool   `yaml:"random_placement"`
	FontFile        string `yaml:"font_file"`
	Colors          []color.RGBA
	BackgroundColor color.RGBA `yaml:"background_color"`
	Width           int
	Height          int
	Mask            WordCloudMaskConf
	SizeFunction    *string `yaml:"size_function"`
	Debug           bool
}

var DefaultColors = []color.RGBA{
	{0x1b, 0x1b, 0x1b, 0xff},
	{0x48, 0x48, 0x4B, 0xff},
	{0x59, 0x3a, 0xee, 0xff},
	{0x65, 0xCD, 0xFA, 0xff},
	{0x70, 0xD6, 0xBF, 0xff},
}

type WordCloudMaskConf struct {
	File  string
	Color color.RGBA
}

var DefaultWordCloudConf = WordCloudConf{
	FontMaxSize:     1024,
	FontMinSize:     64,
	RandomPlacement: false,
	FontFile:        "./font.ttf",
	Colors:          DefaultColors,
	BackgroundColor: color.RGBA{255, 255, 255, 255},
	Width:           2048,
	Height:          1024,
	Mask:            WordCloudMaskConf{"", color.RGBA{R: 0, G: 0, B: 0, A: 0}},
	Debug:           false,
}

func MkWordCloud(words map[string]int) error {
	fmt.Println(words)
	conf := DefaultWordCloudConf
	var boxes []*wordclouds.Box
	if conf.Mask.File != "" {
		boxes = wordclouds.Mask(
			conf.Mask.File,
			conf.Width,
			conf.Height,
			conf.Mask.Color)
	}
	colors := make([]color.Color, 0)
	for _, c := range conf.Colors {
		colors = append(colors, c)
	}
	c := []wordclouds.Option{
		wordclouds.FontFile("./font.ttf"),
		wordclouds.FontMaxSize(conf.FontMaxSize),
		wordclouds.FontMinSize(conf.FontMinSize),
		wordclouds.Colors(colors),
		wordclouds.MaskBoxes(boxes),
		wordclouds.Height(conf.Height),
		wordclouds.Width(conf.Width),
		wordclouds.RandomPlacement(conf.RandomPlacement),
		wordclouds.BackgroundColor(conf.BackgroundColor)}
	w := wordclouds.NewWordcloud(
		words,
		c...,
	)
	img := w.Draw()
	// ...
}

The argument is word.json (attached as word.json.zip).

The font is https://raw.githubusercontent.com/adobe-fonts/source-han-sans/release/Variable/TTF/SourceHanSansSC-VF.ttf

FishZe avatar Jan 29 '23 02:01 FishZe

It has been quite a long time. Is this still a problem? I want to move logic from JS to the backend and this library seems to be the right one, but I do not want to waste time if it has such a significant bug.

ivanjaros avatar Sep 11 '24 17:09 ivanjaros

The bug may still exist.

FishZe avatar Sep 12 '24 11:09 FishZe

I can confirm there is a memory leak. My REST API server consumes 11.6 MB of memory by itself. When I call word cloud generation for a 2048x1280 image with 256 words, memory goes to 376 MB, even 490 MB, and does not go down for many minutes. Calling image generation a couple of times will definitely cause an OOM on the server. So this is definitely not suitable for production purposes or anything but single image generation via CLI.

The wordcloud.nextPos uses multiple goroutines, so I would look there as the first thing. I am almost certain there are goroutines leaking.
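
One way to check the goroutine suspicion without reading the library code is to compare `runtime.NumGoroutine()` before and after a render. The sketch below shows the idea with two stand-in functions (`cleanWork` and `leakyWork` are hypothetical, not part of wordclouds); in practice you would pass a function that builds and draws a word cloud.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// goroutineDelta runs work and reports how many more goroutines exist
// afterwards than before. A value that stays positive suggests a leak.
func goroutineDelta(work func()) int {
	before := runtime.NumGoroutine()
	work()
	// Give goroutines that have finished their work a moment to exit.
	time.Sleep(100 * time.Millisecond)
	return runtime.NumGoroutine() - before
}

// cleanWork starts goroutines and waits for all of them.
func cleanWork() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() { defer wg.Done() }()
	}
	wg.Wait()
}

// leakyWork starts a goroutine that blocks forever on an unbuffered
// channel nobody reads from.
func leakyWork() {
	ch := make(chan int)
	go func() { ch <- 1 }()
}

func main() {
	fmt.Println("clean delta:", goroutineDelta(cleanWork))
	fmt.Println("leaky delta:", goroutineDelta(leakyWork))
}
```

If the delta is zero across repeated renders, leaked goroutines are probably not what is holding the memory.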

@psykhi please fix this. The bug has been there at least since January of last year, when it was reported, and probably always.

ivanjaros avatar Sep 13 '24 17:09 ivanjaros

ivanjaros wrote: "It has been quite a long time. Is this still a problem? I want to move logic from JS to the backend and this library seems to be the right one, but I do not want to waste time if it has such a significant bug."

Maybe you can use this library:

https://github.com/isaackd/wcloud

jizizr avatar Sep 14 '24 00:09 jizizr

That's not Go but Rust, so it's completely out of context. Funnily enough, I wrote my own word cloud overnight. It almost works perfectly, except that words overlap and I am using a spiral matrix, which usually leaves the lower-right part of the image empty. So that's a no-go, but it's 90% there :D

ivanjaros avatar Sep 14 '24 12:09 ivanjaros

I completed my own word cloud and it now works as needed. But I too am observing memory leaking. It turns out it is GG: [image]

For each render I set up a new "service" and do the rendering, so the fonts are loaded for each request. That means the previous run should be freed by the GC and this should not be an issue. Hence, it is indeed GG's fault.

It seems the memory does eventually get freed, but it takes a really long time, which is the same behavior as with this library.

ivanjaros avatar Sep 15 '24 07:09 ivanjaros

I rewrote the code to be a static service, so fonts are loaded just once per font size (this is quite a problem when each size requires a new font face). It helped tremendously: it now takes 10-20 MB of RAM per render. But the slow GC is still a problem. image.NewAlpha is from the standard Go library, so I doubt @fogleman or @psykhi can do anything about it.
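
The "load once per font size" idea can be sketched as a small cache keyed by size. This is not the code from the rewrite described above; `faceCache` and its methods are hypothetical names, and the loader is passed in as a function so the example stays self-contained (in a real service it would wrap whatever call loads the font face).

```go
package main

import (
	"fmt"
	"sync"
)

// faceCache memoizes one value per font size. T stands in for whatever the
// rendering library returns for a loaded face.
type faceCache[T any] struct {
	mu    sync.Mutex
	faces map[float64]T
	load  func(size float64) (T, error)
}

func newFaceCache[T any](load func(size float64) (T, error)) *faceCache[T] {
	return &faceCache[T]{faces: make(map[float64]T), load: load}
}

// get returns the cached face for size, loading it on first use.
func (c *faceCache[T]) get(size float64) (T, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if f, ok := c.faces[size]; ok {
		return f, nil
	}
	f, err := c.load(size)
	if err != nil {
		var zero T
		return zero, err
	}
	c.faces[size] = f
	return f, nil
}

func main() {
	loads := 0
	cache := newFaceCache(func(size float64) (string, error) {
		loads++ // stands in for an expensive font load
		return fmt.Sprintf("face@%.0f", size), nil
	})
	for _, size := range []float64{64, 128, 64, 64, 128} {
		if _, err := cache.get(size); err != nil {
			fmt.Println("load failed:", err)
			return
		}
	}
	fmt.Println("distinct loads:", loads)
}
```

The trade-off is that cached faces stay in memory for the life of the process, so this works best when the set of font sizes is small and bounded.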

If I run runtime.GC() after each render, the memory is now manageable. Not a solution I like, but it works.
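
A minimal sketch of that workaround, assuming the render is wrapped in a function that returns the encoded image (`renderThenCollect` and `gcCount` are hypothetical helper names):

```go
package main

import (
	"fmt"
	"runtime"
)

// renderThenCollect runs render and then forces a garbage collection so the
// temporary buffers from the render are reclaimed right away.
// debug.FreeOSMemory() from runtime/debug could be used instead of
// runtime.GC(); it also forces a collection and additionally tries to
// return as much memory as possible to the OS.
func renderThenCollect(render func() []byte) []byte {
	out := render()
	runtime.GC()
	return out
}

// gcCount reports how many GC cycles have completed so far.
func gcCount() uint32 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC
}

func main() {
	before := gcCount()
	img := renderThenCollect(func() []byte {
		scratch := make([]byte, 32<<20) // stands in for render temporaries
		return scratch[:16]
	})
	fmt.Println("result bytes:", len(img))
	fmt.Println("GC cycles during the call:", gcCount()-before)
}
```

A forced collection blocks the caller until it completes, so on a busy server it may be better to run it only after unusually large renders.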

ivanjaros avatar Sep 15 '24 09:09 ivanjaros

Any idea how to bring this to the attention of the Go team? It is a serious problem.

ivanjaros avatar Sep 16 '24 13:09 ivanjaros

It seems the problem is an idle system. If there is not much going on, the GC won't run any time soon (supposedly only after 2 minutes). So calling the GC manually after a render is the correct thing to do. If the server is heavily utilized, the GC should work as expected.
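
Besides calling the GC manually, the runtime has two knobs that affect how much garbage can pile up during a burst of renders: the GC percent (GOGC) and the soft memory limit (GOMEMLIMIT, Go 1.19+). Neither wakes up the collector on an idle process, so this is a complement to the manual call, not a replacement. A minimal sketch; the helper names and the 512 MiB / 50% values are arbitrary examples, not recommendations.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// configureGC sets a soft memory limit and a GC percent and returns the
// previous values so they can be restored.
func configureGC(limitBytes int64, gcPercent int) (prevLimit int64, prevPercent int) {
	prevLimit = debug.SetMemoryLimit(limitBytes)
	prevPercent = debug.SetGCPercent(gcPercent)
	return prevLimit, prevPercent
}

// currentLimit reads the current soft memory limit without changing it
// (a negative argument to SetMemoryLimit only queries the value).
func currentLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

func main() {
	prevLimit, prevPercent := configureGC(512<<20, 50)
	fmt.Println("memory limit now:", currentLimit())
	fmt.Println("previous limit:", prevLimit, "previous GC percent:", prevPercent)
}
```

The same settings can be applied without code changes through the `GOMEMLIMIT` and `GOGC` environment variables.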

ivanjaros avatar Sep 16 '24 14:09 ivanjaros