Differences

This shows you the differences between two versions of the page.

ep:labs:03:contents:tasks:ex2 [2019/10/13 17:57]
emilian.radoi
ep:labs:03:contents:tasks:ex2 [2025/03/17 20:59] (current)
radu.mantu [02. [30p] Mpstat]
Line 1: Line 1:
-==== 01. [10p] Monitor IO with vmstat and iostat ====
+==== 02. [30p] Mpstat ====
  
-=== [5p] Python script ===
+=== [10p] Task A - Python recursion depth ===
 +Try to run the script while passing 1000 as a command line argument. Why does it crash?
  
-Write a Python script that reads the data into memory and generates a text file 500 times larger, by concatenating the contents of {{:ep:labs:olivertwist.txt|olivertwist.txt}} to itself.
+Luckily, Python allows you to both retrieve the current recursion limit //and// set a new value for it. Increase the recursion limit so that the process will never crash, regardless of input (assume that it still has a reasonable upper bound).
  
-=== [5p] Monitoring the behaviour ===
+<solution -hidden>
 +<code python>​ 
 +import sys
  
-Monitor the behaviour of the system while running your code using **vmstat** and **iostat**.
+N = int(sys.argv[1])
 + 
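 +# Query the current limit, then raise it above the requested depth N.
 +# The +100 headroom is an assumption: frames outside the recursion also count.
 +limit = sys.getrecursionlimit()
 +sys.setrecursionlimit(max(limit, N + 100))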
 +</​code>​ 
 +</​solution>​ 
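The recursion-limit mechanism from Task A can be sketched in isolation. This is only an illustration, not the lab script: `countdown`, `run_with_raised_limit` and the `headroom` value are hypothetical names and numbers chosen for the example.

```python
import sys


def countdown(n):
    """Recurse n levels deep and return the depth reached."""
    if n == 0:
        return 0
    return 1 + countdown(n - 1)


def run_with_raised_limit(n, headroom=200):
    """Call countdown(n) with a limit large enough for n nested frames."""
    old_limit = sys.getrecursionlimit()
    # headroom is an assumed allowance for frames already on the stack
    sys.setrecursionlimit(max(old_limit, n + headroom))
    try:
        return countdown(n)
    finally:
        sys.setrecursionlimit(old_limit)  # restore the previous limit
```

Calling `countdown` with a depth above the current limit (1000 by default in CPython) raises `RecursionError`, whereas `run_with_raised_limit` raises the limit first and restores it afterwards.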
 + 
 +=== [10p] Task B - CPU affinity === 
 +Run the script again, this time passing 10000. Use **mpstat** to monitor ​the load on each //​individual//​ CPU at 1s intervals. The one with close to 100% load will be the one running ​our script. Note that the process might be passed around from one core to another. 
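On the command line, `mpstat -P ALL 1` prints per-CPU statistics once per second. As a rough illustration of where such numbers come from, the sketch below derives per-CPU load from two snapshots of `/proc/stat` on Linux. It is not a description of how mpstat itself is implemented; the function names are hypothetical, and counting iowait as idle time is a simplifying assumption.

```python
import time


def parse_cpu_times(stat_text):
    """Map each 'cpuN' line of /proc/stat to (idle_ticks, total_ticks)."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep per-CPU lines ("cpu0 ..."), skip the aggregate "cpu" line
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user nice system idle iowait irq softirq steal (guest is part of user)
        ticks = [int(value) for value in fields[1:9]]
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        times[fields[0]] = (idle, sum(ticks))
    return times


def busy_percent(before, after):
    """Percentage of non-idle time per CPU between two snapshots."""
    usage = {}
    for cpu, (idle_after, total_after) in after.items():
        if cpu not in before:
            continue
        idle_before, total_before = before[cpu]
        total = total_after - total_before
        idle = idle_after - idle_before
        usage[cpu] = 100.0 * (total - idle) / total if total > 0 else 0.0
    return usage


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return per-CPU load."""
    with open(path) as stat_file:
        first = parse_cpu_times(stat_file.read())
    time.sleep(interval)
    with open(path) as stat_file:
        second = parse_cpu_times(stat_file.read())
    return busy_percent(first, second)
```

Calling `sample()` in a loop while the script runs should show one CPU close to 100%, and that CPU may change from one interval to the next if the scheduler migrates the process.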
 + 
 +Stop the process. Use **stress** to create N-1 CPU workers, where N is the number of cores on your system. Use **taskset** to set the CPU affinity of the N-1 workers to CPUs 1-(N-1) ​and then run the script again. You should notice that the process is scheduled on cpu0. 
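What **taskset** does for another program, a Python process can also do for itself. The following is a minimal, Linux-only sketch using `os.sched_getaffinity` and `os.sched_setaffinity`; the helper name `pin_to_cpu` is hypothetical and not part of the lab.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 means the calling process) to a single CPU.

    Returns the previous affinity set so the caller can restore it.
    """
    previous = os.sched_getaffinity(pid)
    os.sched_setaffinity(pid, {cpu})
    return previous
```

For example, `old = pin_to_cpu(0)` at the top of the script keeps it on cpu0 regardless of what the scheduler would otherwise choose, and `os.sched_setaffinity(0, old)` undoes the restriction.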
 + 
 +**Note**: to get the best performance when running a process, make sure that it stays on the same core for as long as possible. Don't let the scheduler decide this for you, if you can help it. Allowing it to bounce your process between cores can drastically impact the efficient use of the cache and the TLB. This holds especially true when you are working with servers rather than your personal PCs. While the problem may not manifest on a system with only 4 cores, you can't guarantee that it also won't manifest on one with 40 cores. When running several experiments in parallel, aim for something like this: 
 + 
 +{{:​ep:​labs:​01:​contents:​tasks:​affinity_good.png?​720|}} 
 +<​html><​center>​ 
 +<​b>​Figure 1:</​b>​ <​b>​htop</​b>​ output. Processes are bound to specific cores, increasing performance by not potentially invalidating L1 and L2 caches. This works out well since we have fewer active processes than available cores. Otherwise, setting the affinity to a single core may backfire; the rescheduling of these processes could be delayed until other processes are also allocated a time slice. We notice that CPU usage on these cores is maxed (green:user space, red:kernel space). The ratio tells us that a considerable amount of time is spent in kernel space, leading us to believe that the processes are I/O bound. 
 +</​center></​html>​
  
 <solution -hidden>
-<code>
-if __name__ == '__main__':
-    text_file1 = open("OliverTwist.txt", "r")
-    text_file2 = open("OliverTwistLarge.txt", "w+")
-    lines_file1 = text_file1.readlines()
-    for x in range(0, 500):
-        text_file2.writelines(lines_file1)
+
+Start N-1 workers on cpu[1] - cpu[N-1]. Leave cpu[0] unused for when we run the script. The mask 0xfe covers cpu[1] - cpu[7], i.e. it assumes an 8-core system.
+
+<code bash>
+$ taskset 0xfe stress -c $(( $(nproc) - 1 ))
 </code>
 </solution>
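The affinity mask passed to **taskset** is a bitmask in which bit i stands for CPU i, so 0xfe (binary 11111110) selects CPUs 1 to 7. A small sketch of how the mask and the worker count could be derived from the core count instead of being written by hand; `worker_mask` and `stress_command` are hypothetical helpers, and the command is only built here, not executed.

```python
import os


def worker_mask(n_cpus):
    """Bitmask selecting CPUs 1..n_cpus-1, leaving CPU 0 free."""
    if n_cpus < 2:
        raise ValueError("need at least 2 CPUs to leave cpu0 unused")
    # (1 << n_cpus) - 1 sets bits 0..n_cpus-1; subtracting 1 more clears bit 0
    return (1 << n_cpus) - 2


def stress_command(n_cpus=None):
    """Build the taskset/stress argument list for n_cpus - 1 workers."""
    if n_cpus is None:
        n_cpus = os.cpu_count() or 1
    return ["taskset", hex(worker_mask(n_cpus)), "stress", "-c", str(n_cpus - 1)]
```

The resulting list can be handed to `subprocess.run` as is, assuming both tools are installed.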
 +
 +=== [10p] Task C - USO flashbacks (2) ===
 +
 +Write a bash command that binds CPU **stress** workers on your odd-numbered cores (i.e.: 1,3,5,...). The list of cores and the number of stress workers must NOT be hardcoded, but constructed based on **nproc** (or whatever else you fancy). \\
 +In your submission, include both the bash command and an **mpstat** capture to prove that the command works.
 +
 +<​solution -hidden>
 +<code bash>
 +$ cpu_list="​$(seq 1 2 $(nproc) | tr '​\n'​ ','​)";​ taskset -c ${cpu_list::​-1} stress -c $(($(nproc) / 2))
 +$ mpstat -P ALL
 +</​code>​
 +</​solution>​
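The same construction can be sketched in Python to check the arithmetic. CPU ids run from 0 to nproc-1, so the odd-numbered ids are 1, 3, ... up to at most nproc-1, and there are nproc // 2 of them. The helper names below are hypothetical, and the command is only assembled, not run.

```python
import os


def odd_cpu_list(n_cpus):
    """Comma-separated list of the odd-numbered CPU ids below n_cpus."""
    odd_ids = range(1, n_cpus, 2)  # valid ids stop at n_cpus - 1
    if not odd_ids:
        raise ValueError("no odd-numbered CPU on a single-core system")
    return ",".join(str(cpu) for cpu in odd_ids)


def odd_stress_command(n_cpus=None):
    """Build a taskset/stress argument list with one worker per odd CPU."""
    if n_cpus is None:
        n_cpus = os.cpu_count() or 1
    return ["taskset", "-c", odd_cpu_list(n_cpus), "stress", "-c", str(n_cpus // 2)]
```

On an 8-core machine this yields the CPU list `1,3,5,7` with 4 workers; on a 7-core machine it yields `1,3,5` with 3 workers.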
 +
ep/labs/03/contents/tasks/ex2.1570978678.txt.gz · Last modified: 2019/10/13 17:57 by emilian.radoi
CC Attribution-Share Alike 3.0 Unported