====== Lab 01 - Plotting ======
  
===== Objectives =====
  
  * Offer an introduction to numpy & matplotlib
  * Get you familiarised with the numpy API
  * Understand basic plotting with matplotlib
  
  
===== Python Scientific Computing Resources =====
  
-NOTE: this being the first lab, you have time until 11:55pm to upload your workStarting next week, the cut-off time will be 15m after the lab ends. +In this lab, we will study a new library in python that offers fast, memory efficient manipulation of vectors, matrices and tensors**numpy**We will also study basic plotting of data using the most popular data visualization libraries in the python ecosystem: **matplotlib**
-===== Introduction =====+
  
For scientific computing we need an environment that is easy to use and provides tools for manipulating data and visualizing results. Python is very easy to use, but the downside is that it is not fast at numerical computing. Luckily, we have very efficient libraries for all our use cases.
  
**Core computing libraries**
  
  * numpy and scipy: scientific computing
  * matplotlib: plotting library
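To show how numpy and matplotlib fit together, here is a minimal plotting sketch (the output file name ''sine.png'' is illustrative; the ''Agg'' backend is selected so the script also runs without a display):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")            # headless backend: render to a file, no window needed
import matplotlib.pyplot as plt

# Sample sin(x) on [0, 2*pi] and draw it with axis labels and a legend.
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
fig.savefig("sine.png")          # illustrative file name
```

In a Jupyter/Colab notebook you would normally skip the backend selection and the ''savefig'' call, since figures are displayed inline.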
  
**Machine Learning**
  
  * sklearn: machine learning toolkit
  * tensorflow: deep learning framework developed by Google
  * keras: deep learning framework on top of ''tensorflow'', for easier implementation
  * pytorch: deep learning framework developed by Facebook
  
  
**Statistics and data analysis**
  
  * pandas: very popular data analysis library
  * statsmodels: statistics
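As a glimpse of what pandas offers, here is a minimal sketch (the column names and values are made up for illustration):

```python
import pandas as pd

# A tiny, made-up table: pandas stores it as a DataFrame, a 2-D labelled
# structure with one named column per field.
df = pd.DataFrame({
    "student": ["ana", "bogdan", "carmen"],   # hypothetical data
    "grade": [9, 7, 10],
})

# Column-wise statistics come for free:
mean_grade = df["grade"].mean()
print(df.shape, mean_grade)
```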
  
We also have advanced interactive environments:
  
  * IPython: advanced python console
  * Jupyter: notebooks in the browser
  
There are many more scientific libraries available.
  
  
Check out these cheat sheets for quick reference to the common libraries:
  
**Cheat sheets:**
  
  - [[https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf|python]]
  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf|numpy]]
  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Matplotlib_Cheat_Sheet.pdf|matplotlib]]
  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Scikit_Learn_Cheat_Sheet_Python.pdf|sklearn]]
  - [[https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf|pandas]]
  
**Other:**
  
  - [[https://stanford.edu/~shervine/teaching/cs-229/refresher-probabilities-statistics|Probabilities & Stats Refresher]]
  - [[https://stanford.edu/~shervine/teaching/cs-229/refresher-algebra-calculus|Algebra]]
  
  
<note>This lab is organized in a Jupyter Notebook hosted on Google Colab. There you will find some intuitions and applications for numpy and matplotlib. Check out the Tasks section below.</note>
  
===== Tasks =====
  
{{namespace>:ep:labs:01:contents:tasks&nofooter&noeditbutton}}
  
  
  
  
ep/labs/01.1633946754.txt.gz · Last modified: 2021/10/11 13:05 by radu.mantu
CC Attribution-Share Alike 3.0 Unported