golang-set
golang-set copied to clipboard
Set map capacity for some functions can improve performance
Recently I use golang-set to help optimize a DFA generator.
And I noticed DFA‘s compile time increased a lot.
New code:

Before:

After:

pprof:

golang-set based on map. I guess frequently insert caused rehashes and memory copies.
I tried to reuse nextKeys and failNext, but nothing happened. I found this from golang-set's code:

Every time I clear the set, a new one created to replace the old one.
As we know, golang never reclaim memory from alive maps, so I use a for loop to remove all items from set instead of another create. But, new clear is more slower than before because remove loop takes more time.
I wrote a new clear function to reduce realloc of map:



This Time, results are better than before:


Saved about 1s, add function of set still takes biggest part of times. I suppose 2~3s will be saved if we set a proper capacity for Difference, Intersect, Union, etc.
P.S. The conclusion of first version is wrong, I'm sorry for my fault.
So what modifications to the API are you proposing ultimately?
Your usecase seems to be somewhat specific and maintaining backwards compatibility is important.
Your analysis is detailed but it’s still not entirely clear to me what you want changed?
Would you like to submit a PR? We can review the code together.
@deckarep If a function create a new mapset based on old mapsets, specify a size can improve performance.
Okay, I will look at adding these optimizations...in the meantime if you want to submit a PR to get this done faster that would be super helpful!
Thanks!
@deckarep Has this improvement been mergeed into the release ?
@OLPMO - not yet it’s sitting in PR #63. Would you like to review the changes and let me know how it looks?
@fy0 - closing this as the proposed work has finally been merged.