
Mismatch Between the LLC Misses and the MBM Counter

Open yangziy opened this issue 1 year ago • 0 comments

When I run a program that is designed to miss the LLC on every access, the LLC miss count does not match the MBM counters, which stay at zero. A minimal version of the program (about 40 LoC) is included below.

# The program is pinned at core #3
 CORE         IPC      MISSES     LLC[KB]   MBL[MB/s]   MBR[MB/s]
   3        0.34      55601k     22464.0         0.0         0.0

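The reported miss count is 55601k. Assuming each miss fetches one 64B cache line from memory and the count covers a one-second interval, I'd expect the MBL counter to be approximately 55601 * 1000 * 64 / (1024 ^ 2) = 3394MB/s. Instead, it always stays at 0.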

/* This program sequentially iterates over a 1GB buffer 
in a large stride so that it always accesses the same cache set */

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <linux/mman.h>

#define SIZE_1GB (1024*1024*1024)

/* The MBM counter works as expected when the stride is 64B */
// #define STRIDE 64

/**
    Always access the same cache set
    This is specific for the Xeon 8275CL on EC2 c5.metal
        NUM_CL = LLC_SIZE / CL_SIZE = 35.75MB / 64B = 585728
        NUM_SETS = NUM_CL / NUM_WAYS = 585728 / 11 = 53248
        NUM_SETS * CL_STRIDE = 53248 * 64 = 3407872 = 0x340000
 */
#define STRIDE 0x340000


int main(void) {
    // Use a physically contiguous 1GB huge page so that we can access the same cache set
    register char *buf = (char *)mmap(/*addr*/ NULL, /*len*/ SIZE_1GB,
                /*prot*/ PROT_READ | PROT_WRITE,
                /*flags*/ MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
                /*fd*/ -1, /*offset*/ 0);
    if (buf == MAP_FAILED) {
        // mmap fails if no 1GB huge page is available
        perror("mmap");
        return EXIT_FAILURE;
    }
    register unsigned offset = 0;
    register uint64_t val = 10;

    while (1) {
        // Load 8 bytes from buf + offset; val must be an output operand
        asm volatile("movq (%1), %0\n\t" : "=r"(val) : "r"(buf + offset) : "memory");
        offset += STRIDE;
        if (offset >= SIZE_1GB) {
            offset = 0;
        }
    }

    // Not reached: the loop above never exits
    munmap(buf, SIZE_1GB);
}

=== Configuration ===

Platform: AWS c5.metal
CPU: Xeon 8275CL
OS and kernel version: Ubuntu 24.04 (6.8.0-1012-aws)
PQoS library version: 6.0.0

yangziy, Sep 23 '24 21:09