Differences

This shows you the differences between two versions of the page.


ep:labs:01 [2020/08/18 18:27]
radu.mantu [Objectives]
ep:labs:01 [2025/02/12 00:00] (current)
cezar.craciunoiu
-~~NOTOC~~
-
-====== Lab 01 - CPU Monitoring (Linux) ======
+====== Lab 01 - Plotting ======
  
===== Objectives =====
  
-  * Offer an introduction to Performance Monitoring
-  * Present the main CPU metrics and how to interpret them
-  * Get you to use various tools for monitoring the performance of the CPU
-  * Familiarize you with the x86 Hardware Performance Counters
-===== Contents =====
+  * Offer an introduction to Numpy & matplotlib
+  * Get you familiarised with the numpy API
+  * Understand basic plotting with matplotlib
  
-{{page>:ep:labs:01:meta:nav&nofooter&noeditbutton}}
  
-===== Introduction ===== 
  
-==== 01. Performance Monitoring ==== 
-Performance monitoring is the process of regularly checking a set of metrics and tracking the overall health of a specific system. Monitoring is tightly coupled with performance tuning, and a Linux system administrator should be proficient in these two subjects, as one of their main responsibilities is to identify bottlenecks and find solutions to help the operating system overcome them. Pinpointing a Linux system bottleneck requires a deep understanding of how the various components of this operating system work (e.g. how processes are scheduled on the CPU, how memory is managed, the way that I/O interrupts are handled, the details of the network layer implementation, etc.). From a high level, the main subsystems that you should think of when tuning are CPU, Memory, I/O and Network.
  
-These four subsystems depend heavily on each other, and tuning the whole system means keeping them in harmony. To quote a famous idiom, “a chain is no stronger than its weakest link”. Thus, when investigating a system performance issue, all the subsystems must be checked and analysed.
  
-Being able to discover the bottleneck in a system also requires an understanding of what types of processes are running on it. The application stack of a system can be broken down into two categories:
-
-  * CPU Bound - performance is limited by the CPU
-     * Requires heavy use of the CPU (e.g. for batch processing, mathematical operations, etc.)
-     * e.g. high-volume web servers
-  * I/O Bound - performance is limited by the I/O subsystem
-     * Requires heavy use of memory and the storage system
-     * An I/O bound application usually processes large amounts of data
-     * A common behaviour is to use CPU resources to make I/O requests and then enter a sleeping state
-     * e.g. database applications
+===== Python Scientific Computing Resources =====
+
+In this lab, we will study a new library in python that offers fast, memory efficient manipulation of vectors, matrices and tensors: **numpy**. We will also study basic plotting of data using the most popular data visualization library in the python ecosystem: **matplotlib**.
  
-Before going further with the CPU-specific metrics and tools, here is a methodical approach which can guide you when tuning the performance of a system:
-  * Understand the factors which affect performance
-  * Create a baseline measurement of the normal performance of the system
-  * Reproduce the issue and compare the measurements with the baseline to narrow down the bottleneck to a specific subsystem
-  * Try a single change at a time and test the results
+For scientific computing we need an environment that is easy to use and provides a couple of tools for manipulating data and visualizing results.
+Python is very easy to use, but the downside is that it is not fast at numerical computing. Luckily, we have very efficient libraries for all our use-cases.
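
As a quick illustration of the speed-up these libraries give (a minimal sketch, not part of the official lab material; it only assumes numpy is installed), compare a pure-Python sum of squares with its vectorized numpy equivalent:

<code python>
import time

import numpy as np

# one million samples, as a plain Python list and as a numpy array
data_list = list(range(1_000_000))
data_arr = np.arange(1_000_000)

# pure Python: an explicit loop over every element
start = time.perf_counter()
total_py = sum(x * x for x in data_list)
print("python loop:", time.perf_counter() - start, "seconds")

# numpy: one vectorized expression, executed by optimized C code
start = time.perf_counter()
total_np = np.sum(data_arr * data_arr)
print("numpy      :", time.perf_counter() - start, "seconds")

# both approaches compute the same value
assert total_py == total_np
</code>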
  
-==== 02. Introducing the CPU and CPU Metrics ====
-
-Before looking at the numerous performance measurement tools present in the Linux operating system, it is important to understand some key concepts and metrics, along with their interpretation regarding the performance of the system.
-
-The kernel contains a scheduler which is in charge of scheduling two types of resources: interrupts and threads. The scheduler assigns these resources different priorities. The following list presents the priorities:
-  * User Processes - all the processes running in user space; these have the lowest priority in the scheduling mechanism
-  * System Processes - all kernel processing
-  * Interrupts - devices announcing to the kernel that they have finished processing
+
+**Core computing libraries**
+
+  * numpy and scipy: scientific computing
+  * matplotlib: plotting library
+
+**Machine Learning**
+
+  * sklearn: machine learning toolkit
+  * tensorflow: deep learning framework developed by google
+  * keras: deep learning framework on top of `tensorflow` for easier implementation
+  * pytorch: deep learning framework developed by facebook
  
-=== Context Switches ===
-
-While executing a process, the necessary set of data is stored in registers on the processor and in the cache. This group of information is called a context. Each thread owns an allotted time quantum to spend on the CPU, and when that time expires or it is preempted by a higher priority task, a new ready-to-run process will be scheduled. When the next process is scheduled to run, the context of the current one is stored and the context of the new one is restored to the registers; this operation is called a context switch. A large volume of context switching is not desirable, because each time the CPU has to flush its registers and cache to make room for the new process, which leads to performance issues.
-
-=== The Run Queue ===
-
-Each CPU maintains its own run queue of threads. In an ideal scenario, the scheduler would be constantly executing threads. Threads can be in different states: runnable - ready to be executed - or sleeping - blocked while waiting for I/O. If the system has performance issues or is overloaded, the queue starts to fill up and a process thread will take longer to execute.
-
-The same concept is also known as "load". This term is measured by the load average, which is a rolling average of the sum of the processes waiting to be run and the processes waiting for an uninterruptible task to be completed. Unix systems traditionally present the CPU load as 1-minute, 5-minute and 15-minute averages.
+
+**Statistics and data analysis**
+
+  * pandas: very popular data analysis library
+  * statsmodels: statistics
+
+We also have advanced interactive environments:
+
+  * IPython: advanced python console
+  * Jupyter: notebooks in the browser
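
Since **matplotlib** is the plotting library this lab focuses on, here is a minimal sketch of a first plot (assuming numpy and matplotlib are installed; the output file name ''sine.png'' is only an illustration, the Colab notebook displays figures inline instead):

<code python>
import numpy as np
import matplotlib

matplotlib.use("Agg")             # render without a display (e.g. on a server)
import matplotlib.pyplot as plt

# sample the sine function on [0, 2*pi]
x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, y, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A first matplotlib plot")
ax.legend()
ax.grid(True)

fig.savefig("sine.png", dpi=150)  # written to the current directory
</code>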
  
+There are many more scientific libraries available.
+
-=== CPU Utilisation ===
  
-The CPU utilisation is a meaningful metric to observe how the running processes make use of the given processing resources. You can find the following categories in the vast majority of performance monitoring tools:
-  * User time - the time percentage a CPU spends on user processes
-     * High user time values are desirable, because this usually means that the system carries out actual work
-  * System time - the time percentage a CPU spends on kernel threads and interrupts
-     * High system time values could indicate bottlenecks in the network and driver stack
-  * Waiting I/O - the time percentage a CPU waits for an I/O event to occur
-     * A system should not spend too much time waiting for I/O operations
-  * Idle time - the time percentage a CPU spends waiting for tasks
-  * Nice time - the time percentage a CPU spends on user processes whose priority has been adjusted (niced). It is often included in the user time
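
To make these categories concrete, here is a small, hedged sketch that samples the CPU time split and the context-switch counter (from the Context Switches subsection above) directly from ''/proc/stat'' on Linux, with the field layout as documented in proc(5); tools such as vmstat and mpstat report the same counters in a friendlier format:

<code python>
import time

def read_cpu_times():
    """Return the aggregate CPU counters: user, nice, system, idle, iowait, ..."""
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("cpu "):
                return [int(v) for v in line.split()[1:]]

def read_context_switches():
    """Return the total number of context switches since boot."""
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("ctxt"):
                return int(line.split()[1])

# sample the counters twice, one second apart, and look at the deltas
t0, c0 = read_cpu_times(), read_context_switches()
time.sleep(1)
t1, c1 = read_cpu_times(), read_context_switches()

delta = [after - before for before, after in zip(t0, t1)]
total = sum(delta) or 1
user, nice, system, idle, iowait = delta[:5]

print(f"user {100 * user / total:.1f}%  nice {100 * nice / total:.1f}%  "
      f"system {100 * system / total:.1f}%  idle {100 * idle / total:.1f}%  "
      f"iowait {100 * iowait / total:.1f}%")
print(f"context switches in the last second: {c1 - c0}")
</code>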
  
+Check out these cheat sheets for fast reference to the common libraries:
  
+**Cheat sheets:**
+
+  - [[https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf|python]]
+  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf|numpy]]
+  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Matplotlib_Cheat_Sheet.pdf|matplotlib]]
+  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Scikit_Learn_Cheat_Sheet_Python.pdf|sklearn]]
+  - [[https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf|pandas]]
-
-==== 03. CPU Performance Monitoring ====
-
-Linux distributions have various monitoring tools available. Some of these utilities deal with several metrics in a single tool, providing well-formatted output which eases the understanding of the system's performance. Other tools specialize in more specific metrics and give us detailed information.
-
-Some of the most important Linux CPU performance monitoring tools:
  
-^ Tool       ^ Most useful function                  ^
-| vmstat     | System activity                       |
-| top        | Process activity                      |
-| uptime, w  | Average system load                   |
-| ps, pstree | Displays the processes                |
-| iostat     | Average CPU load                      |
-| sar        | Collect and report system activity    |
-| mpstat     | Multiprocessor usage                  |
+
+**Other:**
  
+  - [[https://stanford.edu/~shervine/teaching/cs-229/refresher-probabilities-statistics|Probabilities & Stats Refresher]]
+  - [[https://stanford.edu/~shervine/teaching/cs-229/refresher-algebra-calculus|Algebra]]
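
Coming back to the monitoring tools in the table above: they are normally run from a shell, but as a small sketch (assuming the procps ''vmstat'' utility is installed) their output can also be collected and parsed from Python:

<code python>
import subprocess

# take 3 samples, one second apart; the first sample reports averages since boot
result = subprocess.run(["vmstat", "1", "3"], capture_output=True, text=True, check=True)
lines = result.stdout.strip().splitlines()

# line 0 is a group header, line 1 holds the column names, the rest are samples
columns = lines[1].split()
last_sample = dict(zip(columns, lines[-1].split()))

# r = run queue, in = interrupts, cs = context switches, us/sy/id/wa = CPU split
for key in ("r", "in", "cs", "us", "sy", "id", "wa"):
    print(f"{key:>2} = {last_sample.get(key)}")
</code>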
  
  
-==== 04. Examples ====
+<note>This lab is organized in a Jupyter Notebook hosted on Google Colab. You will find there some intuitions and applications for numpy and matplotlib. Check out the Tasks section below.</note>
  
+===== Tasks =====
+
-Understanding how well a CPU is performing is a matter of interpreting the run queue, its utilisation, and the amount of context switching performed. Although performance is relative to baseline statistics, in the absence of these statistics the following general performance expectations of a system can be used as a guideline:
-  * Run Queues – A run queue should not have more than 3 threads queued per processor. For example, a dual processor system should not have more than 6 threads in the run queue.
-  * CPU Utilisation – A fully utilised CPU should have the following utilisation distribution:
-     * 65% – 70% User Time
-     * 30% – 35% System Time
-     * 0% – 5% Idle Time
-  * Context Switches – The amount of context switching is directly relevant to CPU utilisation. As long as the CPU sustains the utilisation distribution presented above, it is acceptable to have a high amount of context switches.
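
As a hedged sketch of how the run-queue guideline above could be checked on a live Linux system (using the 1-minute load average from ''/proc/loadavg'' as a rough proxy for runnable threads):

<code python>
import os

THREADS_PER_CPU_GUIDELINE = 3      # limit suggested in this section

cpus = os.cpu_count() or 1

# /proc/loadavg: 1-, 5- and 15-minute load averages, then runnable/total tasks
with open("/proc/loadavg") as f:
    fields = f.read().split()
load1 = float(fields[0])
runnable, total = fields[3].split("/")

per_cpu = load1 / cpus
print(f"1-minute load {load1} over {cpus} CPUs -> {per_cpu:.2f} per CPU "
      f"({runnable} of {total} tasks currently runnable)")

if per_cpu > THREADS_PER_CPU_GUIDELINE:
    print("run queue is above the ~3 threads per processor guideline")
else:
    print("run queue is within the guideline")
</code>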
  
-The following two examples give interpretations of the outputs generated by **vmstat**.
+
+{{namespace>:ep:labs:01:contents:tasks&nofooter&noeditbutton}}
  
-=== Example A - Sustained CPU Utilisation === 
  
-{{ :ep:laboratoare:ep1_poz1.png?550 |}}
-  
-The following observations can be made based on this output: 
-  * There is a high number of interrupts (**in**) and a low number of context switches (**cs**). It appears that a single process is making requests to hardware devices.
-  * Further supporting the presence of a single application, the user (**us**) time is constantly at 85% and above. Together with the low number of context switches, this suggests that the process gets on the processor and stays there.
-  * The run queue is just about at the limit of acceptable performance. On a couple of occasions, it goes beyond that limit.
  
-=== Example B - Overloaded Scheduler === 
- 
-{{ :ep:laboratoare:ep1_poz2.png?550 |}}
-  
-The following observations can be made based on this output: 
-  * The number of context switches is higher than the number of interrupts, suggesting that the kernel has to spend a considerable amount of time switching between threads.
-  * The high volume of context switches is causing an unhealthy balance of CPU utilisation. This is evident from the fact that the I/O wait percentage is extremely high and the user percentage is extremely low.
-  * Because the CPU is blocked waiting for I/O, the run queue starts to fill up and the number of threads blocked waiting on I/O grows as well.
- 
-\\ 
-//These examples are from Darren Hoch’s [[http://ufsdump.org/papers/oscon2009-linux-monitoring.pdf|Linux System and Performance Monitoring]].//
- 
-===== Tasks ===== 
- 
-{{namespace>:ep:labs:01:contents:tasks&nofooter&noeditbutton}}
  
  