
FeatureRequest: provide a way to reuse byte slice allocated in readEntry

Open choleraehyq opened this issue 4 years ago • 7 comments

Currently we allocate a byte slice directly in encoding.go:readEntry. Can we provide a way to reuse the slice here?

Here are two possible solutions. The first is to provide a new function like:

func (c *BigCache) GetWithReuseFunc(key string) (value []byte, reuseFunc func(*[]byte), err error)

Once users no longer need the value, they can call reuseFunc(&value) to give the slice back to BigCache; internally we can maintain a set of sync.Pools to reuse slices of different sizes.
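A minimal sketch of what the sync.Pool bookkeeping behind such a reuseFunc could look like. The bucket sizes and the getBuf/putBuf helpers are illustrative only, not part of the bigcache API:

```go
package main

import (
	"fmt"
	"sync"
)

// Size-bucketed pools: each pool hands out slices of one fixed capacity.
var buckets = []int{64, 256, 1024, 4096}

var pools = func() []*sync.Pool {
	ps := make([]*sync.Pool, len(buckets))
	for i, size := range buckets {
		size := size // capture per-bucket size
		ps[i] = &sync.Pool{New: func() any {
			b := make([]byte, 0, size)
			return &b
		}}
	}
	return ps
}()

// getBuf returns a slice of length n from the smallest fitting bucket,
// or a fresh allocation when n exceeds every bucket.
func getBuf(n int) []byte {
	for i, size := range buckets {
		if n <= size {
			return (*pools[i].Get().(*[]byte))[:n]
		}
	}
	return make([]byte, n)
}

// putBuf is the reuseFunc a caller would invoke when done with the value.
// Only slices whose capacity matches a bucket go back into a pool.
func putBuf(b *[]byte) {
	for i, size := range buckets {
		if cap(*b) == size {
			*b = (*b)[:0]
			pools[i].Put(b)
			return
		}
	}
}

func main() {
	v := getBuf(100) // falls into the 256-byte bucket
	fmt.Println(len(v), cap(v))
	putBuf(&v)
}
```

The real implementation would call getBuf inside readEntry and return putBuf (closed over the right pool) as reuseFunc.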

The second is:

func (c *BigCache) GetWithPreallocatedSlice(key string, preallocated []byte) (value []byte, err error)

Users can pass a preallocated slice to the function, just like with io.Reader or https://github.com/golang/snappy/blob/master/decode.go#L57. If that slice is large enough, we use and return it directly; otherwise, we return a newly allocated slice.
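A sketch of that dst-reuse convention, the same shape as snappy.Decode. The readEntryInto helper is hypothetical and only demonstrates the "use the caller's buffer if it fits" contract:

```go
package main

import "fmt"

// readEntryInto copies stored into preallocated when it has enough
// capacity, and otherwise falls back to a fresh allocation.
func readEntryInto(preallocated []byte, stored []byte) []byte {
	var value []byte
	if cap(preallocated) >= len(stored) {
		value = preallocated[:len(stored)] // reuse the caller's memory
	} else {
		value = make([]byte, len(stored)) // caller's buffer too small
	}
	copy(value, stored)
	return value
}

func main() {
	stored := []byte("hello")

	buf := make([]byte, 16)
	v := readEntryInto(buf, stored)
	fmt.Println(string(v), &v[0] == &buf[0]) // reused the caller's buffer

	v2 := readEntryInto(nil, stored) // nil dst forces an allocation
	fmt.Println(string(v2))
}
```

Callers who want zero steady-state allocations keep handing the returned slice back in on the next Get.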

choleraehyq avatar Jun 19 '20 09:06 choleraehyq

Hi @choleraehyq, thank you for the idea. I had this in mind but had completely set it aside. I'm not 100% sure this will work with the underlying bytes queue (used internally), but if everything is fine, this feature would be pretty useful (with a sync.Pool or any byte pool outside of BigCache).

I think the second API is clearer, and the pattern is pretty popular; we can easily try to implement it, provided there is no conflict with the bytes queue behaviour.

cristaloleg avatar Jun 19 '20 09:06 cristaloleg

I don't think this will work as intended with the internal BytesQueue because (and I could be wrong) we shard data across multiple queues to spread things out more evenly, so passing a slice into a queue would break things by default. If you appended the slice to the end of the queue that could work, but then the cache configuration wouldn't be correct.

Since you have to pass byte slices anyway, what is the specific use case that requires the difference between what you're suggesting and what already exists? Are you asking whether bigcache could reference a slice it isn't inherently managing?

siennathesane avatar Jun 22 '20 19:06 siennathesane

If I am not mistaken, something similar could be achieved with a slightly different approach, without changing the underlying bytes queue too much, by providing a "processing" closure to the API. This function runs under the shard lock, and it is up to the user whether an actual copy needs to be made and where to keep the copied data; bigcache itself never needs to copy data while working under the lock.

type CacheEntry struct {
	TS   uint64
	Hash uint64
	Key  []byte
	Data []byte
}

type Processor func(*CacheEntry) error

func (c *BigCache) GetWithProcessing(key string, processor Processor) error {
	hashedKey := c.hash.Sum64(key)
	shard := c.getShard(hashedKey)
	_, err := shard.get(key, hashedKey, processor)
	return err
}

For testing purposes I implemented this in my fork; it seems to work very well.
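To illustrate how a caller might use such an API: the toyCache below is a stand-in for a real shard (no hashing, no locking), and the point is that the closure copies Data before returning, since in the real implementation the entry's memory is only valid while the shard lock is held:

```go
package main

import "fmt"

// CacheEntry and Processor mirror the types proposed above.
type CacheEntry struct {
	TS   uint64
	Hash uint64
	Key  []byte
	Data []byte
}

type Processor func(*CacheEntry) error

// toyCache is an illustrative stand-in for a sharded BigCache.
type toyCache struct{ entries map[string]*CacheEntry }

func (c *toyCache) GetWithProcessing(key string, processor Processor) error {
	// The real implementation would hash the key, pick a shard, and
	// run the processor while holding that shard's lock.
	e, ok := c.entries[key]
	if !ok {
		return fmt.Errorf("entry %q not found", key)
	}
	return processor(e)
}

func main() {
	c := &toyCache{entries: map[string]*CacheEntry{
		"k": {Key: []byte("k"), Data: []byte("value")},
	}}

	// Safe usage: copy Data inside the closure, retain no references.
	var kept []byte
	err := c.GetWithProcessing("k", func(e *CacheEntry) error {
		kept = append(kept[:0], e.Data...) // explicit copy
		return nil
	})
	fmt.Println(err == nil, string(kept))
}
```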

rupor-github avatar Aug 12 '20 13:08 rupor-github

@rupor-github I like this approach: it's backward compatible and easy to implement.

On the other hand, it's really powerful and might cause hard-to-debug problems if not used correctly (e.g. when the Processor changes data).
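A toy demonstration of the hazard. The backing slice is an illustrative stand-in for a shard's byte queue: if the Processor retains e.Data instead of copying, later writes by the cache silently change what the caller sees:

```go
package main

import "fmt"

func main() {
	backing := []byte("old-value") // memory owned by the cache
	entryData := backing           // what the Processor is handed

	var leaked []byte
	process := func(data []byte) { leaked = data } // BUG: keeps a reference, no copy

	process(entryData)
	fmt.Println(string(leaked))

	// Later, the cache reuses or overwrites that region of its queue...
	copy(backing, "NEW-VALUE")
	fmt.Println(string(leaked)) // ...and the "cached" value has changed underneath the caller
}
```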

janisz avatar Aug 14 '20 14:08 janisz

I do like the "processor" idea, but I am also concerned that it opens up too much surface area for issues when misused. How will we surface an error from the user's Processor implementation in a way that is clearly separate from a BigCache issue?

siennathesane avatar Aug 31 '20 16:08 siennathesane

@mxplusb - I share your concerns; however, I do not see any efficient way to give the user control over low-level memory while keeping safety guarantees (that would require explicit memory duplication). In my fork I implemented everything (including internal access) in terms of CacheEntry and defined a set of Copy routines (replacing readKey, readEntry, etc.) on it, hoping to provide better documentation. But this is only documentation. Pinning entries in the queue IMHO requires a much more complex queue implementation and strategy.

rupor-github avatar Sep 02 '20 12:09 rupor-github

Let's get this into the discussion for v3. I think it's a viable design decision, but I want to save it for a major version since it's nuanced and I want to make sure it's very well documented with examples.

// ref: #238

siennathesane avatar Sep 04 '20 16:09 siennathesane