libCacheSim
libCacheSim copied to clipboard
Add BeladyOnline
I have implemented Belady's MIN algorithm as follows:
for obj in trace:
if(max(occ_array[obj.last_access_vtime, current_vtime]) + obj.size <= cache_size):
# The object should be inserted into the cache at its last_access_vtime
occ_array[obj.last_access_vtime, current_vtime] += obj.size
# cache hit
else:
# cache miss
This implementation allows us to evaluate the performance of the Belady algorithm on raw traces. For example, we can run the following command:
./bin/cachesim ../data/cloudPhysicsIO.vscsi vscsi beladyOnline 0.001,0.002,0.005,0.01,0.02,0.05,0.1,0.2
Notes:
- The complexity of this implementation is $O(n*log(n))$, and the space overhead is $O(n)$, where $n$ is the number of requests in the trace.
- Compared to the existing offline Belady, MIN does not guarantee that requests will always be inserted into the cache, so the miss rate will have slight differences from the current offline implementation.
- Since each BeladyOnline instance requires $O(n)$ memory, running multiple instances will consume a significant amount of memory. For the
twitter_cluster45.oracleGeneral.sample10case, testing the following 8 sizes will consume approximately 20GB of memory in total.
Test Results
I tested the efficiency of BeladyOnline in constructing Request-MRC and Byte-MRC for cache size 0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, and 0.2, using multiple workloads, and compared it with Belady.
| Workload | #req | Belady Time(s) | BeladyOnline Time(s) | Average MAE |
|---|---|---|---|---|
| cloudPhysicsIO.oracleGeneral.bin | 113872 | 2.3 | 2.3 | 0.03% |
| twitter_cluster45.oracleGeneral.sample10 | 22288116 | 14.3 | 29.9 | 0.004% |