This shows you the differences between two versions of the page.
|
ep:labs:03:contents:tasks:ex5 [2026/01/14 13:50] radu.mantu |
ep:labs:03:contents:tasks:ex5 [2026/01/14 16:59] (current) radu.mantu |
||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ==== 05. [10p] Bonus - Hardware Counters ==== | ==== 05. [10p] Bonus - Hardware Counters ==== | ||
| - | |||
| - | <note> | ||
| - | Solve the rest of the lab within the allotted time to unlock this bonus exercise ;) | ||
| - | </note> | ||
| A significant portion of the system statistics that can be generated involves hardware counters. As the name implies, these are special registers that count the number of occurrences of specific events in the CPU. These counters are implemented through **Model Specific Registers** (MSR), control registers used by developers for debugging, tracing, monitoring, etc. Since these registers may change from one iteration of a microarchitecture to the next, we will need to consult chapters 18 and 19 from [[https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.html|Intel 64 and IA-32 Architectures Software Developer's Manual: Vol. 3B]]. | A significant portion of the system statistics that can be generated involves hardware counters. As the name implies, these are special registers that count the number of occurrences of specific events in the CPU. These counters are implemented through **Model Specific Registers** (MSR), control registers used by developers for debugging, tracing, monitoring, etc. Since these registers may change from one iteration of a microarchitecture to the next, we will need to consult chapters 18 and 19 from [[https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.html|Intel 64 and IA-32 Architectures Software Developer's Manual: Vol. 3B]]. | ||
| Line 41: | Line 37: | ||
| Under normal circumstances, modifying Control Registers from userspace is not possible and you would have to write a kernel module for this. However, the [[https://man.archlinux.org/man/perf_event_open.2#perf_event_related_configuration_files|perf_event_open() man page]] documents a //sysfs// interface (i.e., ''/sys/bus/event_source/devices/cpu/rdpmc'') that does this for us. | Under normal circumstances, modifying Control Registers from userspace is not possible and you would have to write a kernel module for this. However, the [[https://man.archlinux.org/man/perf_event_open.2#perf_event_related_configuration_files|perf_event_open() man page]] documents a //sysfs// interface (i.e., ''/sys/bus/event_source/devices/cpu/rdpmc'') that does this for us. | ||
| - | Use the //sysfs// interface to revert the **rdpmc** access behavior to the pre-4.0 version. | + | Use the //sysfs// interface to revert the **RDPMC** access behavior to the pre-4.0 version. |
| + | |||
| + | <solution -hidden> | ||
| + | <code bash> | ||
| + | $ echo 2 | sudo tee /sys/bus/event_source/devices/cpu/rdpmc | ||
| + | </code> | ||
| + | </solution> | ||
| === Task C - Configure IA32_PERF_GLOBAL_CTRL === | === Task C - Configure IA32_PERF_GLOBAL_CTRL === | ||
| Line 96: | Line 98: | ||
| For the next (and //final//) task we are going to monitor the number of L2 cache misses. Look for the **L2_RQSTS.MISS** event in table 19-3 or 19-11 (depending on CPU version id) in the Intel manual and set the last two bytes (the unit mask and event select) accordingly. If the operation is successful and the counters have started, you should see non-zero values in the PMC0 register, increasing with each subsequent read. | For the next (and //final//) task we are going to monitor the number of L2 cache misses. Look for the **L2_RQSTS.MISS** event in table 19-3 or 19-11 (depending on CPU version id) in the Intel manual and set the last two bytes (the unit mask and event select) accordingly. If the operation is successful and the counters have started, you should see non-zero values in the PMC0 register, increasing with each subsequent read. | ||
| + | |||
| + | <note tip> | ||
| + | An easier alternative to scouring the Intel manuals is to use [[https://perfmon-events.intel.com/platforms/tigerlake/core-events/core/|perfmon-events.intel.com]]. Get your CPU //"model name"// from ''/proc/cpuinfo'' and identify your microarchitecture based on the table below. Then search for the desired event in the appropriate section of the site. | ||
| + | |||
| + | ^ Generation ^ Microarchitecture (Core Codename) ^ Release Year ^ Typical CPU Numbers ^ | ||
| + | | 1st | Nehalem / Westmere | 2008–2010 | i3,5,7 3xx–9xx | | ||
| + | | 2nd | Sandy Bridge | 2011 | i3,5,7 2xxx | | ||
| + | | 3rd | Ivy Bridge | 2012 | i3,5,7 3xxx | | ||
| + | | 4th | Haswell | 2013 | i3,5,7 4xxx | | ||
| + | | 5th | Broadwell | 2014–2015 | i3,5,7 5xxx | | ||
| + | | 6th | Skylake | 2015 | i3,5,7 6xxx | | ||
| + | | 7th | Kaby Lake | 2016–2017 | i3,5,7 7xxx | | ||
| + | | 8th | Coffee Lake / Amber Lake / Whiskey Lake | 2017–2018 | i3,5,7 8xxx | | ||
| + | | 9th | Coffee Lake Refresh | 2018–2019 | i3,5,7,9 9xxx | | ||
| + | | 10th | Comet Lake / Ice Lake | 2019–2020 | i3,5,7,9 10xxx | |
| + | | 11th | Rocket Lake / Tiger Lake | 2020–2021 | i3,5,7,9 11xxx | |
| + | | 12th | Alder Lake | 2021–2022 | i3,5,7,9 12xxx | | ||
| + | | 13th | Raptor Lake | 2022–2023 | i3,5,7,9 13xxx | | ||
| + | | 14th | Raptor Lake Refresh | 2023–2024 | i3,5,7,9 14xxx | | ||
| + | | — | Meteor Lake | 2023–2024 | Core Ultra 5,7,9 1xx | | ||
| + | | — | Arrow Lake / Lunar Lake | 2024–2025 | Core Ultra 5,7,9 2xx | | ||
| + | </note> | ||
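As a rough illustration of how the two low bytes fit into an IA32_PERFEVTSELx value, the sketch below packs an event select and unit mask together with the USR, OS and EN flag bits. The helper name ''perfevtsel_encode'' and the flag macros are hypothetical, introduced only for this example. The sample values in the comment (event 0x3C, umask 0x00) are those of the architectural "UnHalted Core Cycles" event, not those of **L2_RQSTS.MISS**, which you still need to look up for your own CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits of IA32_PERFEVTSELx (names are illustrative, not from a header). */
#define PERFEVTSEL_USR (1ULL << 16) /* count while running in user mode   */
#define PERFEVTSEL_OS  (1ULL << 17) /* count while running in kernel mode */
#define PERFEVTSEL_EN  (1ULL << 22) /* enable the counter                 */

/*
 * Hypothetical helper: build an IA32_PERFEVTSELx value.
 *   bits 0-7  : event select
 *   bits 8-15 : unit mask
 *   flags     : any combination of the bits above bit 15
 *
 * Example: perfevtsel_encode(0x3c, 0x00,
 *              PERFEVTSEL_USR | PERFEVTSEL_OS | PERFEVTSEL_EN)
 * selects UnHalted Core Cycles in both user and kernel mode.
 */
static inline uint64_t perfevtsel_encode(uint8_t event, uint8_t umask,
                                         uint64_t flags)
{
    /* the flags must not overlap the event select / unit mask bytes */
    assert((flags & 0xffffULL) == 0);
    return (uint64_t)event | ((uint64_t)umask << 8) | flags;
}
```

The resulting value would then be written to the event select MSR in the same way as in the previous tasks; only the two low bytes change from one event to another.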
| <solution -hidden> | <solution -hidden> | ||
| Line 136: | Line 160: | ||
| <code C> | <code C> | ||
| /* hardware counter init */ | /* hardware counter init */ | ||
| - | rdpmc(ecx, eax, edx); | + | rdpmc(0, eax, edx); |
| counter = ((uint64_t)eax) | (((uint64_t)edx) << 32); | counter = ((uint64_t)eax) | (((uint64_t)edx) << 32); | ||
| Line 146: | Line 170: | ||
| /* hardware counter delta */ | /* hardware counter delta */ | ||
| - | rdpmc(ecx, eax, edx); | + | rdpmc(0, eax, edx); |
| counter = (((uint64_t)eax) | (((uint64_t)edx) << 32)) - counter; | counter = (((uint64_t)eax) | (((uint64_t)edx) << 32)) - counter; | ||
| </code> | </code> | ||
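The snippet above relies on an ''rdpmc'' macro that is defined elsewhere in the lab. For reference, here is a minimal sketch of what such a read could look like as plain functions, together with a delta that tolerates counter wrap-around. The function names (''pmc_combine'', ''pmc_delta'', ''pmc_read'') are hypothetical. The 48-bit width mentioned in the comments is an assumption that holds for many Intel CPUs; the actual width is reported by CPUID leaf 0AH.

```c
#include <assert.h>
#include <stdint.h>

/* Join the two 32-bit halves returned by RDPMC (EDX:EAX) into one value. */
static inline uint64_t pmc_combine(uint32_t eax, uint32_t edx)
{
    return (uint64_t)eax | ((uint64_t)edx << 32);
}

/*
 * Difference between two counter samples, modulo the counter width.
 * `width` is the counter size in bits (assumed 48 on many CPUs), so a
 * counter that wrapped once between the samples still gives the right delta.
 */
static inline uint64_t pmc_delta(uint64_t start, uint64_t end, unsigned width)
{
    assert(width >= 1 && width <= 64);
    uint64_t mask = (width == 64) ? UINT64_MAX : ((1ULL << width) - 1);
    return (end - start) & mask;
}

#if defined(__x86_64__) || defined(__i386__)
/*
 * Read performance counter `index` (0 for PMC0).
 * This faults unless userspace RDPMC access has been enabled,
 * e.g. through the sysfs interface from Task B.
 */
static inline uint64_t pmc_read(uint32_t index)
{
    uint32_t eax, edx;
    __asm__ volatile("rdpmc" : "=a"(eax), "=d"(edx) : "c"(index));
    return pmc_combine(eax, edx);
}
#endif
```

With these helpers, the measurement would reduce to sampling ''pmc_read(0)'' before and after the code of interest and passing both samples to ''pmc_delta''.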