Name | Description | Counters usable | Unit mask options |
CPU_CLK_UNHALTED | Clock cycles when not halted | all | |
UNHALTED_REFERENCE_CYCLES | Unhalted reference cycles | 0, 1, 2 | 0x01: No unit mask |
LLC_MISSES | Last level cache demand requests from this core that missed the LLC | all | 0x41: No unit mask |
LLC_REFS | Last level cache demand requests from this core | all | 0x4f: No unit mask |
(Update) Intel does not document LLC_MISSES well. It seems to include L2 cache misses. To count retired loads that miss the last level cache, use MEM_LOAD_RETIRED:0x10 instead. In my measurements, LLC_MISSES can be ten times larger than MEM_LOAD_RETIRED:0x10.
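To compare the two events side by side, both can be requested in one profiling run. The sketch below only builds the event strings, assuming the `name:count:unitmask:kernel:user` form that OProfile's `opcontrol --event=` option accepts. The helper name `event_spec` and the sample count of 100000 are made up for illustration, not values from my measurements.

```python
def event_spec(name, unit_mask, count=100000, kernel=1, user=1):
    """Format one event as name:count:unitmask:kernel:user.

    count is the sampling interval; 100000 is an arbitrary placeholder.
    kernel/user select whether kernel and user space are profiled.
    """
    return f"{name}:{count}:{unit_mask:#04x}:{kernel}:{user}"


# The two events compared in the update above.
specs = [
    event_spec("LLC_MISSES", 0x41),
    event_spec("MEM_LOAD_RETIRED", 0x10),
]

# Print the options so they can be pasted onto a profiler command line.
print(" ".join(f"--event={s}" for s in specs))
```

Using the same sample count for both events keeps the resulting sample totals directly comparable.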
Other useful metrics:
MEM_INST_RETIRED:0x01, the number of instructions with an architecturally-visible load retired on the architected path;
MEM_LOAD_RETIRED:0x04, llc_unshared_hit, the number of retired loads that hit their own, unshared lines in the LLC;
MEM_LOAD_RETIRED:0x08, other_core_l2_hit_hitm, the number of retired loads that hit in a sibling core's L2 (a core on the same die);
MEM_LOAD_RETIRED:0x80, dtlb_miss, the number of retired loads that missed the DTLB;
MEM_UNCORE_RETIRED:0x08, remote_cache_local_home_hit, the number of memory load instructions retired where the memory reference missed the L1, L2 and LLC caches and hit in a remote socket's cache;
MEM_UNCORE_RETIRED:0x10, remote_dram, the number of memory load instructions retired where the memory reference missed the L1, L2 and LLC caches and was remotely homed (DRAM);
MEM_UNCORE_RETIRED:0x20, local_dram, the number of memory load instructions retired where the memory reference missed the L1, L2 and LLC caches and required a local socket memory reference (DRAM).
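One way to use the three MEM_UNCORE_RETIRED sub-events above is to see where loads that missed the LLC were served from. This is a minimal sketch: the function name `llc_miss_breakdown` and the counts in the example are hypothetical, and it assumes all three events were collected with the same sample count. It covers only the three sub-events listed here, so the fractions are relative to their sum, not to all LLC misses.

```python
def llc_miss_breakdown(counts):
    """Return each source's share of the loads served beyond the LLC.

    counts maps the sub-event names listed above to their sample totals.
    """
    keys = ("remote_cache_local_home_hit", "remote_dram", "local_dram")
    total = sum(counts[k] for k in keys)
    if total == 0:
        # No samples collected: report zero shares instead of dividing by zero.
        return {k: 0.0 for k in keys}
    return {k: counts[k] / total for k in keys}


# Hypothetical sample totals, for illustration only.
example = {
    "remote_cache_local_home_hit": 100,
    "remote_dram": 300,
    "local_dram": 600,
}

for name, share in llc_miss_breakdown(example).items():
    print(f"{name}: {share:.1%}")
```

A large remote_dram share relative to local_dram would suggest that memory is often allocated on a different socket from the one running the thread.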