ep:labs:01 [2021/10/11 12:56]
radu.mantu [Proof of Work]
ep:labs:01 [2025/02/12 00:00] (current)
cezar.craciunoiu
====== Lab 01 - Plotting ======
  
===== Objectives =====
  
  * Offer an introduction to Numpy & matplotlib
  * Get you familiarised with the numpy API
  * Understand basic plotting with matplotlib
  
  
===== Python Scientific Computing Resources =====
  
In this lab, we will study a new library in python that offers fast, memory-efficient manipulation of vectors, matrices and tensors: **numpy**. We will also study basic plotting of data using the most popular data visualization library in the python ecosystem: **matplotlib**.
  
For scientific computing we need an environment that is easy to use and provides tools for manipulating data and visualizing results.
Python is very easy to use, but the downside is that it is not fast at numerical computing. Luckily, we have very efficient libraries for all our use cases.
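The speed gap mentioned above is easy to demonstrate. Below is a small illustrative sketch (not part of the lab material) that times a pure-python loop against numpy's vectorized equivalent; the exact timings will vary by machine:

```python
import time

import numpy as np

n = 1_000_000

# Pure-python loop: the interpreter touches every element individually
t0 = time.perf_counter()
squares_py = [x * x for x in range(n)]
t_py = time.perf_counter() - t0

# Vectorized numpy: the same elementwise loop runs in optimized C code
arr = np.arange(n)
t0 = time.perf_counter()
squares_np = arr * arr
t_np = time.perf_counter() - t0

print(f"python loop: {t_py:.4f}s  numpy: {t_np:.4f}s")
```

On typical hardware the numpy version is one to two orders of magnitude faster, which is why numerical code in python is written in terms of whole-array operations rather than explicit loops.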
  
**Core computing libraries**
  
  * numpy and scipy: scientific computing
  * matplotlib: plotting library
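numpy's core object is the n-dimensional array. As a quick taste of the API (a minimal sketch; the values are arbitrary examples), creation, elementwise math, broadcasting, and slicing look like this:

```python
import numpy as np

# Create a 3x4 matrix of sequential values
m = np.arange(12).reshape(3, 4)

# Elementwise operations apply to every entry at once
doubled = m * 2

# Broadcasting: subtract each column's mean from that column
centered = m - m.mean(axis=0)

# Slicing works per dimension: second row, last two columns
sub = m[1, 2:]

print(m.shape, doubled[0, 1], sub)  # (3, 4) 2 [6 7]
```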
  
**Machine Learning**
  
  * sklearn: machine learning toolkit
  * tensorflow: deep learning framework developed by Google
  * keras: deep learning framework on top of `tensorflow` for easier implementation
  * pytorch: deep learning framework developed by Facebook
  
**Statistics and data analysis**
  
  * pandas: very popular data analysis library
  * statsmodels: statistics
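For a flavour of pandas, here is a minimal sketch (assuming pandas is installed; the table contents are invented for illustration) showing its tabular `DataFrame` object with vectorized column math and aggregation:

```python
import pandas as pd

# A tiny table of (made-up) student grades
df = pd.DataFrame({
    "student": ["ana", "bob", "cat"],
    "grade": [9, 7, 10],
})

# Vectorized comparison creates a whole new boolean column at once
df["passed"] = df["grade"] >= 5

# Simple aggregation over a column
mean_grade = df["grade"].mean()

print(mean_grade)
```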
  
We also have advanced interactive environments:
  
  * IPython: advanced python console
  * Jupyter: notebooks in the browser
  
There are many more scientific libraries available.
  
  
Check out these cheat sheets for fast reference to the common libraries:
  
**Cheat sheets:**
  
  - [[https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf|python]]
  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf|numpy]]
  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Matplotlib_Cheat_Sheet.pdf|matplotlib]]
  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Scikit_Learn_Cheat_Sheet_Python.pdf|sklearn]]
  - [[https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf|pandas]]
  
**Other:**
  
  - [[https://stanford.edu/~shervine/teaching/cs-229/refresher-probabilities-statistics|Probabilities & Stats Refresher]]
  - [[https://stanford.edu/~shervine/teaching/cs-229/refresher-algebra-calculus|Algebra]]
  
  
<note>This lab is organized in a Jupyter Notebook hosted on Google Colab. You will find there some intuitions and applications for numpy and matplotlib. Check out the Tasks section below.</note>
  
===== Tasks =====
  
{{namespace>:ep:labs:01:contents:tasks&nofooter&noeditbutton}}
  
  
  
CC Attribution-Share Alike 3.0 Unported