Differences

This shows you the differences between two versions of the page.

ep:labs:02 [2022/10/05 11:49]
radu.mantu [Proof of Work]
ep:labs:02 [2025/02/12 00:00] (current)
cezar.craciunoiu
Line 1: Line 1:
-====== Lab 02 - Memory Monitoring (Linux) ======
+====== Lab 02 - Advanced Plotting ======
 
 ===== Objectives =====
  
-  * Offer an introduction to Virtual Memory.
-  * Get you acquainted with relevant commands and their outputs for monitoring memory related aspects.
-  * Introduce the concept of page de-duplication.
-  * Present a step-by-step guide to Intel PIN for dynamic instrumentation.
-===== Contents =====
+  * Introduction to pandas
+  * Easy data manipulations with pandas
+  * Introduction to seaborn
+  * More types of cool looking plots with seaborn
+  * Apply what you learned on exploring COVID data for Romania
  
-{{page>:ep:labs:02:meta:nav&nofooter&noeditbutton}}
 
-===== Proof of Work =====
+===== Resources =====
 
-Before you start, create a [[http://docs.google.com/|Google Doc]]. Here, you will add screenshots / code snippets / comments for each exercise. Whatever you decide to include, it must prove that you managed to solve the given task (so don't show just the output, but how you obtained it and what conclusion can be drawn from it). If you decide to complete the feedback for bonus points, include a screenshot with the form submission confirmation, but not with its contents.
+In this lab, we will study the basic API of pandas for easier data manipulations, and seaborn for some more advanced and visually appealing plots that are also easy to produce.
 
-When done, export the document as a //pdf// and upload in the appropriate assignment on [[https://curs.upb.ro/2021/course/view.php?id=5665#section-3|moodle]]. The deadline is 23:55 on Friday.
+For the exercises, you will explore the evolution of the COVID pandemic in Romania, using the information learned in this lab.
  
+For scientific computing, we need an environment that is easy to use and provides tools for manipulating data and visualizing results. We will use Google Colab, which comes with a variety of useful tools already installed.
  
-===== Introduction =====
+Check out these cheat sheets for quick reference to the common libraries:
 
-==== 01. Virtual Memory ====
+**Cheat sheets:**
  
-Virtual memory uses a disk as an extension of RAM so that the effective size of usable memory grows correspondingly. The kernel will write the contents of a currently unused block of memory to the hard disk so that the memory can be used for another purpose. When the original contents are needed again, they are read back into memory. This is all made completely transparent to the user; programs running under Linux only see the larger amount of memory available and don't notice that parts of them reside on the disk from time to time. Of course, reading and writing the hard disk is slower (on the order of a thousand times slower) than using real memory, so the programs don't run as fast. The part of the hard disk that is used as virtual memory is called the swap space.
+  - [[https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf|python]]
+  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf|numpy]]
+  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Matplotlib_Cheat_Sheet.pdf|matplotlib]]
+  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Scikit_Learn_Cheat_Sheet_Python.pdf|sklearn]]
+  - [[https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/Pandas_Cheat_Sheet.pdf|pandas]]
+  - [[https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Python_Seaborn_Cheat_Sheet.pdf|seaborn]]
  
-==== 02. Virtual Memory Pages ====
+<note>This lab is organized in a Jupyter Notebook hosted on Google Colab. You will find there some intuitions and applications for pandas and seaborn. Check out the Tasks section below.</note>
-
-Virtual memory is divided into pages. Each virtual memory page on the X86 architecture is 4KB. When the kernel writes memory to and from disk, it writes memory in pages. The kernel writes memory pages to both the swap device and the file system.
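
The page granularity described in the removed text can be illustrated with a short Python sketch. It is only an illustration, not part of either revision of the lab: the helper name ''pages_needed'' is hypothetical, and the 4096-byte default simply mirrors the 4KB figure quoted above (the actual page size of the running system is available as ''mmap.PAGESIZE'').

```python
import mmap


def pages_needed(num_bytes: int, page_size: int = 4096) -> int:
    """Return how many whole pages are needed to hold num_bytes.

    Hypothetical helper: the 4096-byte default mirrors the 4KB x86 page
    size mentioned in the text; pass mmap.PAGESIZE for the real value.
    """
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    # Ceiling division: a partially filled page still occupies a full page.
    return (num_bytes + page_size - 1) // page_size


if __name__ == "__main__":
    print(f"Page size reported by this system: {mmap.PAGESIZE} bytes")
    print(f"Pages needed for 10000 bytes at 4KB: {pages_needed(10_000)}")
```

Because the kernel moves memory to and from disk in whole pages, even a one-byte buffer accounts for a full page in this model.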
-
-==== 03. Kernel Memory Paging ====
-
-Memory paging is normal activity not to be confused with memory swapping. Memory paging is the process of syncing memory back to disk at normal intervals. Over time, applications will grow to consume all of memory. At some point, the kernel must scan memory and reclaim unused pages to be allocated to other applications.
-
-==== 04. The Page Frame Reclaim Algorithm (PFRA) ====
-
-The PFRA is responsible for freeing memory. The PFRA selects which memory pages to free by page type. Page types are listed below:
-  * **Unreclaimable** – locked, kernel, reserved pages
-  * **Swappable** – anonymous memory pages
-  * **Syncable** – pages backed by a disk file
-  * **Discardable** – static pages, discarded pages
-
-All but the “unreclaimable” pages may be reclaimed by the PFRA. There are two main functions in the PFRA. These include the kswapd kernel thread and the “Low On Memory Reclaiming” function.
-
-==== 05. Kswapd ====
-
-The **kswapd** daemon is responsible for ensuring that memory stays free. It monitors the **pages_high** and **pages_low** watermarks in the kernel. If the amount of free memory is below **pages_low**, the **kswapd** process starts a scan to attempt to free 32 pages at a time. It repeats this process until the amount of free memory is above the **pages_high** watermark.
-
-The **kswapd** thread performs the following actions:
-  * If the page __is unmodified__, it places the page on the free list.
-  * If the page __is modified and backed by a file system__, it writes the contents of the page to disk.
-  * If the page __is modified and not backed up by any file system (anonymous)__, it writes the contents of the page to the swap device.
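
The watermark behaviour and the per-page decisions described above can be sketched as a toy model in Python. This is not kernel code: the function names, the return values and the example watermark numbers are all hypothetical, and every batch is assumed to free exactly 32 pages, which the real kswapd only attempts.

```python
def kswapd_scan(free_pages, pages_low, pages_high, batch=32):
    """Toy model of the watermark loop; returns (free_pages, batches_run).

    Assumption: each batch succeeds in freeing exactly `batch` pages.
    """
    batches = 0
    if free_pages >= pages_low:
        # Free memory is not below the low watermark: nothing to do.
        return free_pages, batches
    # Keep reclaiming until free memory is above the high watermark.
    while free_pages <= pages_high:
        free_pages += batch
        batches += 1
    return free_pages, batches


def reclaim_action(modified, file_backed):
    """Toy model of the per-page decision listed in the text."""
    if not modified:
        return "place on free list"
    if file_backed:
        return "write contents to disk"
    # Modified and anonymous (no backing file).
    return "write contents to swap device"


if __name__ == "__main__":
    # Made-up watermark values, for illustration only.
    print(kswapd_scan(free_pages=100, pages_low=128, pages_high=256))
    print(reclaim_action(modified=True, file_backed=False))
```

The gap between the two watermarks gives the loop hysteresis: once triggered, reclaim continues past **pages_low** all the way above **pages_high**, so kswapd is not woken again immediately.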
-
-==== 06. Kernel Paging with pdflush ====
-
-  * The **pdflush** daemon is responsible for synchronizing any pages associated with a file on a filesystem back to disk. In other words, when a file is modified in memory, the **pdflush** daemon writes it back to disk.
-  * The **pdflush** daemon starts synchronizing dirty pages back to the filesystem when 10% of the pages in memory are dirty. This is due to a kernel tuning parameter called **vm.dirty_background_ratio**.
-  * The **pdflush** daemon works independently of the PFRA under most circumstances. When the kernel invokes the LMR (Low on Memory Reclaiming) algorithm, the LMR specifically forces **pdflush** to flush dirty pages in addition to other page freeing routines.
-  * The **vmstat** utility reports on virtual memory usage in addition to CPU usage. The following fields in the **vmstat** output are relevant to virtual memory: **Swpd**, **Free**, **Buff**, **Cache**, **So**, **Si**, **Bo**, **Bi** (use //man vmstat// to read their description).
-
-The following **vmstat** output demonstrates heavy utilization of virtual memory during an I/O application spike. The following observations can be made based on this output:
-
-  * A large amount of disk blocks are paged in (//bi//) from the filesystem. This is evident in the fact that the cache of data in process address spaces (//cache//) grows.
-  * During this period, the amount of free memory (//free//) remains steady at 17MB even though data is paging in from the disk to consume free RAM.
-  * To maintain the free list, **kswapd** steals memory from the read/write buffers (//buff//) and assigns it to the free list. This is evident in the gradual decrease of the buffer cache (//buff//).
-  * The **kswapd** process then writes dirty pages to the swap device (//so//). This is evident in the fact that the amount of virtual memory utilized gradually increases (//swpd//).
-
-Conclusions:
-  * The fewer major page faults on a system, the better the response times, as the system is leveraging memory caches over disk caches.
-  * Low amounts of free memory are a good sign that caches are effectively used unless there are sustained writes to the swap device and disk.
-  * If a system reports any sustained activity on the swap device, it means there is a memory shortage on the system.
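
The last conclusion can be checked mechanically on captured **vmstat** output. The Python sketch below is a rough illustration under stated assumptions: the function names are hypothetical, "sustained" is arbitrarily taken to mean three consecutive samples with non-zero //si// or //so//, and the sample rows are made-up numbers, not measurements from the lab.

```python
def parse_vmstat(text):
    """Parse captured `vmstat` text into a list of dicts keyed by column name."""
    rows, columns = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "swpd" in fields:
            # Column header line (vmstat repeats it periodically).
            columns = fields
            continue
        is_numeric = all(f.isdigit() for f in fields)
        if columns and is_numeric and len(fields) == len(columns):
            rows.append(dict(zip(columns, map(int, fields))))
    return rows


def has_sustained_swapping(rows, min_consecutive=3):
    """True if si or so is non-zero for min_consecutive samples in a row.

    The threshold of 3 is an assumption chosen for this illustration.
    """
    streak = 0
    for row in rows:
        if row["si"] > 0 or row["so"] > 0:
            streak += 1
            if streak >= min_consecutive:
                return True
        else:
            streak = 0
    return False


if __name__ == "__main__":
    # Made-up sample in the usual vmstat layout, for illustration only.
    sample = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0  17000   8000  90000    0    0     5    10  100  200  1  1 98  0  0
 2  1   1200  17000   7000  95000    0  300   900    20  300  500  5 10 60 25  0
 2  1   1500  17000   6500  96000    0  280   950    25  310  520  5 11 58 26  0
 3  1   1800  17000   6000  97000    0  310   990    30  320  540  6 12 55 27  0
"""
    rows = parse_vmstat(sample)
    print(f"{len(rows)} samples, sustained swapping: {has_sustained_swapping(rows)}")
```

A single sample with non-zero //so// resets nothing and proves little; requiring a run of consecutive samples is one simple way to separate a brief spike from the sustained activity the conclusion refers to.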
  
 ===== Tasks =====
ep/labs/02.1664959795.txt.gz · Last modified: 2022/10/05 11:49 by radu.mantu
CC Attribution-Share Alike 3.0 Unported