02. [30p] Mpstat

Open fact_rcrs.zip and look at the code.

[10p] Task A - Python recursion depth

Try to run the script while passing 1000 as a command line argument. Why does it crash?

Luckily, Python allows you to both retrieve the current recursion limit and set a new value for it. Increase the recursion limit so that the process never crashes, regardless of input (assume that the input still has a reasonable upper bound).
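As a rough illustration of the two calls involved, here is a minimal sketch. The contents of fact_rcrs.zip are not reproduced here, so factorial() is only an assumed stand-in for the lab script, and factorial_with_headroom() and its margin parameter are hypothetical names chosen for this example.

```python
import sys


def factorial(n):
    """Assumed stand-in for the lab script: plain recursive factorial."""
    return 1 if n <= 1 else n * factorial(n - 1)


def factorial_with_headroom(n, margin=200):
    """Raise the recursion limit just enough for depth n, then recurse.

    margin is a hypothetical allowance for frames already on the stack.
    """
    needed = n + margin
    if sys.getrecursionlimit() < needed:
        sys.setrecursionlimit(needed)
    return factorial(n)


def demo():
    # The limit is 1000 by default, unless something already changed it.
    depth = sys.getrecursionlimit()
    try:
        # Needs `depth` frames on top of the ones already in use.
        factorial(depth)
        crashed = False
    except RecursionError:
        crashed = True
    result = factorial_with_headroom(depth)
    return crashed, result
```

Sizing the limit from the input keeps it only as large as needed. An extremely high limit can still exhaust the interpreter's native stack, which is why the task assumes a reasonable upper bound on the input.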

[10p] Task B - CPU affinity

Run the script again, this time passing 10000. Use mpstat to monitor the load on each individual CPU at 1-second intervals. The CPU showing close to 100% load is the one running our script. Note that the process might be moved from one core to another.
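To make the per-CPU numbers less of a black box, the sketch below derives a similar figure from two snapshots of /proc/stat on Linux. It is only an approximation of what mpstat reports, not its actual implementation: all function names are hypothetical, and lumping iowait together with idle is an assumption made for simplicity.

```python
import time


def parse_cpu_times(stat_text):
    """Map 'cpuN' -> (busy, total) jiffies from /proc/stat content."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep per-CPU lines only; skip the aggregate "cpu" line and the rest.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:]]
        # Assumption: count both idle and iowait as "not busy".
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        # guest and guest_nice are already included in user and nice.
        total = sum(values[:8])
        times[fields[0]] = (total - idle, total)
    return times


def utilization(before, after):
    """Percentage of busy time per CPU between two parsed snapshots."""
    usage = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        elapsed = total1 - total0
        usage[cpu] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return usage


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_times(f.read())
    return utilization(before, after)
```

Calling sample() in a loop while the script runs should show one entry near 100, and that entry may change between iterations when the scheduler migrates the process.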

Stop the process. Use stress to create N-1 CPU workers, where N is the number of cores on your system. Use taskset to set the CPU affinity of the N-1 workers to CPUs 1 through N-1, then run the script again. You should notice that the process is scheduled on CPU 0.
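For reference, a process can also set its own affinity from Python instead of being launched through taskset. The sketch below assumes Linux, where os.sched_getaffinity() and os.sched_setaffinity() are available. pin_to_cpu() is a hypothetical helper name, and this is not how the lab expects you to solve the task.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 means the calling process) to a single CPU.

    Returns the previous affinity set so the caller can restore it.
    Linux only: these os functions do not exist on every platform.
    """
    previous = os.sched_getaffinity(pid)
    os.sched_setaffinity(pid, {cpu})
    return previous


def restore_affinity(previous, pid=0):
    """Put back an affinity set returned by pin_to_cpu()."""
    os.sched_setaffinity(pid, previous)
```

A typical use would be calling pin_to_cpu(0) at the start of the script. Child processes inherit the affinity mask, which is also why workers started under taskset stay on the CPUs given to it.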

Note: to get the best performance when running a process, make sure that it stays on the same core for as long as possible. Don't let the scheduler decide this for you if you can help it. Letting it bounce your process between cores can severely reduce how efficiently the cache and the TLB are used. This holds especially true when you are working with servers rather than your personal PC: a problem that does not manifest on a system with only 4 cores may well manifest on one with 40 cores. When running several experiments in parallel, aim for something like this:

Figure 1: htop output. Processes are bound to specific cores, which improves performance by avoiding needless invalidation of the L1 and L2 caches. This works well because we have fewer active processes than available cores. Otherwise, setting the affinity to a single core may backfire: rescheduling these processes could be delayed until other processes have also been allocated a time slice. CPU usage on these cores is maxed out (green: user space, red: kernel space). The ratio shows that a considerable amount of time is spent in kernel space, which suggests that the processes are I/O bound.

[10p] Task C - USO flashbacks (2)

Write a bash command that binds CPU stress workers to your odd-numbered cores (i.e.: 1,3,5,…). The list of cores and the number of stress workers must NOT be hardcoded, but constructed based on nproc (or whatever else you fancy).
In your submission, include both the bash command and an mpstat capture to prove that the command is working.
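The logic behind such a command can be sketched in Python first: derive the odd-numbered core list and the worker count from the CPU count, then assemble the arguments. This only illustrates the construction and is not the bash command the task asks for. odd_cores() and stress_command() are hypothetical names, and the sketch assumes that the stress workers inherit the affinity given to taskset.

```python
def odd_cores(ncpus):
    """Odd-numbered core indices on a system with `ncpus` cores."""
    return list(range(1, ncpus, 2))


def stress_command(ncpus):
    """Build an argv that runs one stress CPU worker per odd-numbered core."""
    cores = odd_cores(ncpus)
    if not cores:
        raise ValueError("need at least 2 CPUs to have an odd-numbered core")
    cpu_list = ",".join(str(core) for core in cores)
    # taskset -c takes a CPU list; stress --cpu takes the worker count.
    return ["taskset", "-c", cpu_list, "stress", "--cpu", str(len(cores))]
```

Passing os.cpu_count() as ncpus gives the system-wide count, while nproc reports the CPUs available to the current process, so the two can differ when an affinity mask is already in place.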

ep/labs/01/contents/tasks/ex2.txt · Last modified: 2022/09/13 11:58 by radu.mantu
CC Attribution-Share Alike 3.0 Unported