wordclouds
Memory Leak?
I was generating some word clouds, and my memory usage rose very fast. I think it is a memory leak, and I found #10. Does the problem still exist?
This is my pprof output:
flat flat% sum% cum cum%
0 0% 0.0054% 34775.93MB 88.79% github.com/psykhi/wordclouds.(*Wordcloud).Draw
0 0% 0.0054% 34775.93MB 88.79% github.com/psykhi/wordclouds.(*Wordcloud).Place
0 0% 0.0054% 34702.48MB 88.60% github.com/fogleman/gg.LoadFontFace
0 0% 0.0054% 34702.48MB 88.60% github.com/psykhi/wordclouds.(*Wordcloud).setFont
0.51MB 0.0013% 0.0067% 32529.44MB 83.05% github.com/golang/freetype/truetype.NewFace
32528.93MB 83.05% 83.06% 32528.93MB 83.05% image.NewAlpha
Thanks a lot.
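For anyone who wants to collect a comparable profile, here is a minimal sketch using only the standard library. The helper name `heapProfile` and the stand-in allocation are assumptions for illustration; the original post does not say how its profile was captured or which sample type it shows.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"runtime"
	"runtime/pprof"
)

// heapProfile (hypothetical helper) forces a GC so the profile reflects
// up-to-date statistics, then returns the heap profile in pprof format.
func heapProfile() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Stand-in for a render: image.NewAlpha is the allocation site that
	// dominates the profile quoted above.
	mask := image.NewAlpha(image.Rect(0, 0, 2048, 1024))

	prof, err := heapProfile()
	if err != nil {
		fmt.Println("profile failed:", err)
		return
	}
	fmt.Println("profile written:", len(prof) > 0, "mask bytes:", len(mask.Pix))
}
```

Written to a file, the profile can be inspected with `go tool pprof -sample_index=inuse_space` (live memory) or `-sample_index=alloc_space` (everything ever allocated). The two views answer different questions, so it helps to say which one a posted table comes from.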
I ran into this problem too, but I found that after a while the GC would reclaim some of the memory.
Yeah, the GC does reclaim some of the memory, but much of it remains unreleased.
Perhaps my font is too large: my memory usage can rise to 32 GB or higher when generating a word cloud.
Even after that, it can stay at 16 GB or higher and is never reclaimed.
I also hit this with some specific fonts. All I could do was switch to another font to reduce memory usage. I hope the author will handle this problem.
I notice that #10 also mentions the memory leak problem.
Is there still a goroutine leak?
Hello! It would be great to have instructions to reproduce this.
Sorry for the late reply.
type WordCloudConf struct {
	FontMaxSize int `yaml:"font_max_size"`
	FontMinSize int `yaml:"font_min_size"`
	RandomPlacement bool `yaml:"random_placement"`
	FontFile string `yaml:"font_file"`
	Colors []color.RGBA
	BackgroundColor color.RGBA `yaml:"background_color"`
	Width int
	Height int
	Mask WordCloudMaskConf
	SizeFunction *string `yaml:"size_function"`
	Debug bool
}

var DefaultColors = []color.RGBA{
	{0x1b, 0x1b, 0x1b, 0xff},
	{0x48, 0x48, 0x4B, 0xff},
	{0x59, 0x3a, 0xee, 0xff},
	{0x65, 0xCD, 0xFA, 0xff},
	{0x70, 0xD6, 0xBF, 0xff},
}

type WordCloudMaskConf struct {
	File string
	Color color.RGBA
}

var DefaultWordCloudConf = WordCloudConf{
	FontMaxSize: 1024,
	FontMinSize: 64,
	RandomPlacement: false,
	FontFile: "./font.ttf",
	Colors: DefaultColors,
	BackgroundColor: color.RGBA{255, 255, 255, 255},
	Width: 2048,
	Height: 1024,
	Mask: WordCloudMaskConf{"", color.RGBA{R: 0, G: 0, B: 0, A: 0}},
	Debug: false,
}

func MkWordCloud(words map[string]int) error {
	fmt.Println(words)
	conf := DefaultWordCloudConf
	var boxes []*wordclouds.Box
	if conf.Mask.File != "" {
		boxes = wordclouds.Mask(
			conf.Mask.File,
			conf.Width,
			conf.Height,
			conf.Mask.Color)
	}
	colors := make([]color.Color, 0)
	for _, c := range conf.Colors {
		colors = append(colors, c)
	}
	c := []wordclouds.Option{
		wordclouds.FontFile("./font.ttf"),
		wordclouds.FontMaxSize(conf.FontMaxSize),
		wordclouds.FontMinSize(conf.FontMinSize),
		wordclouds.Colors(colors),
		wordclouds.MaskBoxes(boxes),
		wordclouds.Height(conf.Height),
		wordclouds.Width(conf.Width),
		wordclouds.RandomPlacement(conf.RandomPlacement),
		wordclouds.BackgroundColor(conf.BackgroundColor)}
	w := wordclouds.NewWordcloud(
		words,
		c...,
	)
	img := w.Draw()
	// ...
}
The argument is the word list in word.json (attached as word.json.zip).
The font is https://raw.githubusercontent.com/adobe-fonts/source-han-sans/release/Variable/TTF/SourceHanSansSC-VF.ttf
It has been quite a long time. Is this still a problem? I want to move this logic from JS to the backend, and this library seems to be the right one, but I do not want to waste time if it has such a significant bug.
Maybe the bug still exists.
I can confirm there is a memory leak. My REST API server consumes 11.6 MB of memory by itself. When I generate a word cloud for a 2048x1280 image with 256 words, memory goes to 376 MB, even 490 MB, and does not go down for many minutes. Generating a couple of images in a row will definitely cause an OOM on the server. So this is not suitable for production, or for anything beyond single-image generation via the CLI.
The wordcloud.nextPos function uses multiple goroutines, so I would look there first. I am almost certain there are goroutines leaking.
@psykhi please fix this. The bug has been there at least since January of last year, when it was reported, and probably always.
Maybe you can use this library:
https://github.com/isaackd/wcloud
That's not Go but Rust, so completely out of context. Funnily enough, I wrote my own word cloud overnight. It almost works perfectly, except that words overlap and I am using a spiral matrix, which usually leaves the lower-right part of the image empty. So that's a no-go, but it's 90% there :D
I completed my own word cloud, and it now works as needed. But I am also observing the memory leak.
Turns out it is gg:
For each render I set up a new "service" and do the rendering, so the fonts are loaded on every request. Everything from the previous run is unreferenced at that point, so the GC should free it and this should not be an issue. Since my code shows the same growth, it is indeed gg's fault.
The memory does seem to get freed eventually, but it takes a really long time. That is the same behavior as with this library.
I rewrote the code as a static service, so fonts are loaded just once per font size (this is quite a problem, since each size requires a new font face). It helped tremendously: a render now takes 10-20 MB of RAM. But the slow GC is still a problem. image.NewAlpha is from the Go standard library, so I doubt @fogleman or @psykhi can do anything about it.
If I run runtime.GC() after each render, the memory becomes manageable. Not a solution I like, but it works.
Any idea how to bring this to the attention of the Go team? It is a serious problem.
It seems the problem is an idle system. If there is not much going on, the GC won't run any time soon (supposedly every 2 minutes). So calling the GC manually after a render is the correct thing to do. If the server is heavily utilized, the GC should work as expected.