Dataset for tests and benchmarks

Open al8n opened this issue 2 years ago • 5 comments

We need some datasets that can be used to give more insight into the performance and the hit ratio when we add new features.

al8n avatar Apr 19 '23 09:04 al8n

You could try some widely adopted traces, for example DS1 and S3 from the ARC paper. In my benchmarks Ristretto does not do well. If Stretto's implementation is exactly the same as Ristretto's, I'd be interested to see the results. My cache package, with benchmark results: https://github.com/Yiling-J/theine-go

Yiling-J avatar Apr 22 '23 02:04 Yiling-J

The low hit ratio for Ristretto in your benchmark may be caused by the write buffer. In Ristretto, if you insert an item and then read it while it is still in the write buffer, you get a miss.
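To make the effect concrete, here is a minimal sketch of a buffer-first write path. `BufferFirstCache` and its methods are hypothetical names for illustration, not Ristretto's or Stretto's real API, and `drain` stands in for the background worker that would normally apply buffered writes.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical toy cache (an assumption for illustration only).
struct BufferFirstCache {
    map: HashMap<u64, String>,
    tx: Sender<(u64, String)>,
    rx: Receiver<(u64, String)>,
}

impl BufferFirstCache {
    fn new() -> Self {
        let (tx, rx) = channel();
        Self { map: HashMap::new(), tx, rx }
    }

    /// Insert only enqueues the write; the map is untouched until `drain` runs.
    fn insert(&self, key: u64, value: String) {
        self.tx.send((key, value)).expect("receiver is owned by the cache");
    }

    /// Reads look at the map only, so a write still in the buffer is invisible.
    fn get(&self, key: u64) -> Option<&String> {
        self.map.get(&key)
    }

    /// Stands in for the background worker that applies buffered writes.
    fn drain(&mut self) {
        while let Ok((key, value)) = self.rx.try_recv() {
            self.map.insert(key, value);
        }
    }
}
```

With this layout, a `get` issued right after `insert` returns `None` until `drain` has run, which is the kind of miss that can lower the measured hit ratio in a tight benchmark loop.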

al8n avatar Apr 22 '23 08:04 al8n

I agree, but judging from the image in their blog post and README, the hit ratio should be higher. If you use the same technique (write to the buffer first) and your benchmark shows a similar result, I think we can confirm that. BTW, both Ben and I believe that writing to the map first is better; you can take a look at Ben's reply: https://www.reddit.com/r/golang/comments/12uql3y/theine_020_released_a_generic_cache_which_has/

Yiling-J avatar Apr 22 '23 08:04 Yiling-J

Yeah, writing to the map first is better. I was thinking of adding a way for the cache to read an item from the write buffer, e.g. wrap the item in an Arc, keep a hashmap of the items currently in the buffer, and remove each entry once the item has been handled. But I do not think the idea is good enough. I would appreciate any ideas about this feature.
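Roughly, the idea could look like the sketch below. All names (`PendingAwareCache`, `pending`, `drain`) are hypothetical, a `VecDeque` stands in for the real write buffer, and the whole thing is single-threaded to keep it short; a real version would need synchronization around the pending map.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Hypothetical sketch: a side map makes buffered writes readable.
struct PendingAwareCache {
    map: HashMap<u64, Arc<String>>,
    pending: HashMap<u64, Arc<String>>,
    buffer: VecDeque<(u64, Arc<String>)>,
}

impl PendingAwareCache {
    fn new() -> Self {
        Self {
            map: HashMap::new(),
            pending: HashMap::new(),
            buffer: VecDeque::new(),
        }
    }

    /// Share one allocation between the buffer and the pending map via `Arc`.
    fn insert(&mut self, key: u64, value: String) {
        let value = Arc::new(value);
        self.pending.insert(key, Arc::clone(&value));
        self.buffer.push_back((key, value));
    }

    /// Check pending writes first so a read-after-write is a hit.
    fn get(&self, key: u64) -> Option<Arc<String>> {
        self.pending
            .get(&key)
            .or_else(|| self.map.get(&key))
            .cloned()
    }

    /// Apply buffered writes and drop the matching pending entries.
    fn drain(&mut self) {
        while let Some((key, value)) = self.buffer.pop_front() {
            // Only clear the pending entry if it still refers to this write;
            // a newer write to the same key may have replaced it.
            let same = self
                .pending
                .get(&key)
                .map_or(false, |p| Arc::ptr_eq(p, &value));
            if same {
                self.pending.remove(&key);
            }
            self.map.insert(key, value);
        }
    }
}
```

The cost is a second map lookup on every read plus extra bookkeeping on every write, which is part of why the idea does not feel good enough.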

al8n avatar Apr 22 '23 08:04 al8n

I think it's just a matter of switching the order: write to the map first, then add to the write buffer. I think Ristretto writes to the buffer first because the write buffer is a channel, and they drop some Sets under high concurrency when the channel is full. This improves write performance, but may not be the expected behavior for Ristretto users.
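A minimal sketch of the map-first order, assuming a bounded channel as the write buffer. `MapFirstCache` is a hypothetical name, and rolling the map entry back when the buffer is full is just one possible way to handle a dropped Set (it would also discard an older value for the same key); it is not necessarily what Ristretto, Stretto, or Theine do.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Hypothetical sketch: write to the map first, then notify the policy.
struct MapFirstCache {
    map: HashMap<u64, String>,
    tx: SyncSender<u64>,
    rx: Receiver<u64>,
}

impl MapFirstCache {
    /// `buffer_cap` should be at least 1; a capacity of 0 is a rendezvous
    /// channel, so `try_send` would never succeed in this single-threaded sketch.
    fn new(buffer_cap: usize) -> Self {
        let (tx, rx) = sync_channel(buffer_cap);
        Self { map: HashMap::new(), tx, rx }
    }

    /// Returns false if the write was dropped because the buffer was full.
    fn insert(&mut self, key: u64, value: String) -> bool {
        // Step 1: the value becomes visible to readers immediately.
        self.map.insert(key, value);
        // Step 2: queue the key for the policy without blocking.
        match self.tx.try_send(key) {
            Ok(()) => true,
            Err(_) => {
                // Assumption: undo the write so the map and policy stay in sync.
                self.map.remove(&key);
                false
            }
        }
    }

    fn get(&self, key: u64) -> Option<&String> {
        self.map.get(&key)
    }

    /// Stands in for the policy worker; returns how many events it consumed.
    fn drain(&self) -> usize {
        let mut handled = 0;
        while self.rx.try_recv().is_ok() {
            handled += 1;
        }
        handled
    }
}
```

Here a read right after a successful `insert` is a hit, and the caller can see from the return value when a Set was dropped instead of having it disappear silently.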

Yiling-J avatar Apr 22 '23 09:04 Yiling-J