Differences

This shows you the differences between two versions of the page.

ep:teme:01 [2022/11/16 21:18]
vlad.stefanescu [I. (10p) Prerequisites]
ep:teme:01 [2026/03/04 14:35] (current)
radu.mantu [Memory access tracing]
Line 1: Line 1:
 ====== Assignment ====== ====== Assignment ======
  
-===== 1. Context ​=====+===== 01. Overview ​=====
  
-In the last few years, the number of Internet users has seen unprecedented growth. Whether these users are human beings or machines (IoT devices, bots, other services, mobile clients etc.) they place a great burden on the systems they are requesting services from. As a consequence, the administrators of those systems had to adapt and come up with solutions for efficiently handling the increasing traffic. In this assignment, we will focus on one of them and that is **load balancing**.+The goal of this assignment is to implement a tool based on [[https://man.archlinux.org/man/perf_event_open.2.en|Linux Perf Events]] that is able to monitor main memory accesses performed by another process.
  
-A **load balancer** is a proxy meant to distribute traffic across a number of servers, usually called **services**. By adding a load balancer in a distributed system, the **capacity** and **reliability** of those services significantly increase. That is why in almost every modern **cloud architecture** there is at least one layer of load balancing.+For this assignment you will be allowed to **work in pairs**. Also, you will need to have an **Intel CPU** capable of recording **MEM_INST_RETIRED** events. Anything newer than Nehalem should do.
  
-===== 2. Architecture ​=====+===== 02. Requirements ​=====
  
-We propose a topology that should mimic a **cloud system** that acts as a global service, replicated across multiple **machines/​workers** and **regions**,​ which must serve the clients as efficiently as possible. Supposing that all the requests are coming from users found in the same country, the **latencies** expected from the cloud regions differ. Moreover, the **number of machines** available on each region vary and all these aspects can have an impact on the overall performance of the system.+==== Partner up ====
  
-That is why, a dedicated proxy that decides where to route those requests coming from the clients had to be added into the system. Obviously, this proxy is our **load balancer** and in this particular scenario it is divided into **2 main components**:+Select a partner for this assignment and submit your choice via [[https://forms.gle/unnN3f8pksSbg85g9|this form]]. \\ 
 +If you can't find a partner, try advertising on the [[https://​curs.upb.ro/​2025/​mod/​forum/​discuss.php?​d=3902|assignment forum]].
  
-  - A few **//​routers//​**,​ meant to **forward the requests** coming from the clients to an actual machine available on a certain region +<note important>​ 
-  - A **//command unit (c1)//** that is supposed ​to **identify ​the number ​of requests** that are about to hit our system and decide, based on this number, how to **efficiently utilize ​the routing units**+Only one student ​is required ​to complete ​the form on behalf ​of the team.\\ 
- +Only one student (not necessarily ​the same) will have to upload ​the assignment on moodle.\\ 
-You can have an overview on the proposed architecture by looking at the diagram below: +You are **required** to work with a partner on this assignment.
- +
-/*{{ :​ep:​teme:​load_balancer_architecture.jpg?​600 |}}*/ +
-{{ :​ep:​teme:​topologie_mininet_tema.png?​600 |}} +
- +
-In this assignment, you will be focusing on doing a **//​topology performance analysis//​** and on the **//Command Unit logic//**, the other components of the system being already implemented. +
- +
-===== 3. Mininet Topology ===== +
- +
-Mininet is a network emulator which creates a network of virtual hosts, switches, controllers,​ and links. Mininet hosts run standard Linux network software, and its switches support OpenFlow for highly flexible custom routing and Software-Defined Networking. +
- +
-The topology above was built using Mininet. In this manner, you will have to use the API that commands the servers and the client/command unit. The topology has three layers: +
- +
-  * First  layer - the network between c1 and the first router r0 - 10.10.200.0/​24 +
-  * Second layer - the networks between r0 and the region routers r1,r2,r3 - 10.10.x.0/​24 +
-  * Third  layer - the networks between region routers and the web servers - 10.10.10x.0/​24 ​  +
- +
-<note tip>The region names are there only for classification purposes and the '**x**'s in the IPs are replaced ​with the branch numbers, as depicted in the diagram above.</​note>​ +
- +
-===== 4. Environment ===== +
- +
-To make it easier for everyone and fail-proof, we'll be using the official Mininet latest release VM, which you can get from [[https://​github.com/​mininet/​mininet/​releases/​download/​2.3.0/​mininet-2.3.0-210211-ubuntu-20.04.1-legacy-server-amd64-ovf.zip|here]].  +
- +
-<note Useful links> +
-http://​mininet.org/​download/​ \\ +
-https://​github.com/​mininet/​mininet/​releases/​ \\ +
-https://​github.com/​mininet/​mininet/​releases/​download/​2.3.0/​mininet-2.3.0-210211-ubuntu-20.04.1-legacy-server-amd64-ovf.zip+
 </​note>​ </​note>​
  
-===== 5. Implementation =====+==== Usage ====
  
-First of all, you have to deploy ​the topology and measure its performance ​under different ​test cases and collect data to make an idea what are the limits ​of it.+Your application should be implemented in C/C++ and take as positional arguments the commandline invocation ​of the program ​under test. For example, ''​./​my_tracer curl http://​example.com''​ will launch the tracer program that will then fork() & exec() **curl** ​and start monitoring its memory transactions at the same time. In case you need to add flags to your application,​ you can separate them from the commandline ​of the child process with ''​%%--%%''​.
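The launch step described above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: ''child_argv_start()'' and ''spawn_child()'' are hypothetical helper names, and the rule "everything after the first ''%%--%%'' belongs to the child" is only one possible reading of the requirement.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the index in argv where the command line of
 * the program under test starts. If a "--" separator is present, the
 * tracer's own flags come before it and the child command right after it;
 * otherwise the child command starts at argv[1]. Returns -1 when no child
 * command was given. */
static int child_argv_start(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
        if (strcmp(argv[i], "--") == 0)
            return (i + 1 < argc) ? i + 1 : -1;
    return (argc > 1) ? 1 : -1;
}

/* Hypothetical helper: fork() and exec() the program under test.
 * Returns the child's PID in the parent, or -1 if fork() failed. */
static pid_t spawn_child(char **child_argv)
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(child_argv[0], child_argv);
        perror("execvp");   /* only reached if exec failed */
        _exit(127);
    }
    return pid;
}
```

Note that with this simple rule a child command that itself contains ''%%--%%'' must always be preceded by an explicit separator. Also, in a real tracer the counters should be attached before the child starts executing the target program; one option worth investigating is the ''enable_on_exec'' bit of ''struct perf_event_attr''.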
  
-Secondly, you have to work on the **//command unit//** by editing the client to implement some optimisations on the traffic flow of the topology. All components of the topology are written in **Python 3**. Having a number of requests **N** as input, try various strategies of calling the 3 regions servers available so that your clients experiment response times as low as possible. There are **no constraints** applied to how you read the number of requests, what Python library you use to call the forwarding unit or how you plot the results.+==== Memory access tracing ====
  
-<note important>​This assignment ​**must** be developed in **Python 3**!</note>+Once the child process is up and running, you will have to monitor the **read** and **write** operations //​separately//​. Specifically,​ you will have to determine **what address has been accessed** and **what instruction performed this access**. This can be achieved using [[https://​www.intel.com/​content/​www/​us/​en/​developer/​articles/​technical/​timed-process-event-based-sampling-tpebs.html|Intel Processor Event Based Sampling (PEBS)]], a mode of operation that will write detailed sample information in a physical memory ring buffer whenever the event counter triggers. You will not be required to interact with this system directly, but instead utilize the [[https://​man.archlinux.org/​man/perf_event_open.2.en#​MMAP_layout|sampled mode]] of Linux Perf Events.
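As a starting point, the sampled-mode setup could look like the sketch below. The raw event encodings are an assumption (umask and event code as commonly listed for Skylake-era cores) and must be checked against the event list of your own CPU; ''init_mem_attr()'' and ''perf_open()'' are hypothetical helper names. Reads and writes are monitored separately by opening one event for each encoding.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* ASSUMPTION: raw encodings (umask << 8 | event code). Verify them for
 * your microarchitecture before relying on them. */
#define RAW_MEM_LOADS  0x81d0ULL   /* MEM_INST_RETIRED.ALL_LOADS  */
#define RAW_MEM_STORES 0x82d0ULL   /* MEM_INST_RETIRED.ALL_STORES */

/* Fill in a perf_event_attr that samples one raw event every `period`
 * occurrences and records the instruction pointer, the accessed data
 * address, the thread ID and a timestamp for every sample. */
static void init_mem_attr(struct perf_event_attr *attr,
                          unsigned long long raw_config,
                          unsigned long long period)
{
    memset(attr, 0, sizeof(*attr));
    attr->type           = PERF_TYPE_RAW;
    attr->size           = sizeof(*attr);
    attr->config         = raw_config;
    attr->sample_period  = period;
    attr->sample_type    = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
                           PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR;
    attr->precise_ip     = 2;   /* request zero skid (PEBS-backed) */
    attr->disabled       = 1;   /* stay off until the child calls exec */
    attr->enable_on_exec = 1;
    attr->exclude_kernel = 1;
    attr->exclude_hv     = 1;
}

/* glibc provides no wrapper for perf_event_open(), so call it directly.
 * cpu = -1 follows the target process on any CPU; no group, no flags. */
static int perf_open(struct perf_event_attr *attr, pid_t pid)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, -1, -1, 0UL);
}
```

The returned file descriptor is then passed to ''mmap()'' to obtain the record ring buffer described in the linked man page section.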
  
-Because we are working with HTTP requests, the client in its current state is able to make a single request and print the result in a file (as you need to exclusively monitor the client in order to get outputs from it). You can build on that and modify it as you please to fit your needs.+==== Mapping addresses ​to objects ====
  
-However, we strongly suggest you work in a **virtual environment** where you install all your **pip dependencies**. By doing so, you will keep your global workspace free of useless packages and you can easily specify just those packages required to run your code: +Once this task is complete, your next objective is to map both the accessed address and the instruction's address to a memory mapped object (where appropriate). For instance, you will have to be able to distinguish between memory accesses performed by code belonging to **libc** or **libz**. Additionally, you must identify whether the accessed memory address belongs to a data segment of a memory mapped object, or the heap / stack instead. To solve this task, know that the Linux Perf system can generate more than PMC Event Records while in sampled mode. In fact, the kernel can be configured to report any **mmap()** that the program under test performs. This is how **perf record** can embed object information into the sample file in order for **perf report** to subsequently translate those samples into //"hot"// functions, even with ASLR enabled.
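Once the mappings are known, resolving an address comes down to a range lookup. The sketch below assumes you keep a table filled in from the mmap records reported by the kernel; ''struct mapping'' and ''find_mapping()'' are hypothetical names, and the linear search is only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping entry: one memory mapping of the traced
 * process, as learned from an mmap record. */
struct mapping {
    uint64_t    start;  /* first address of the mapping   */
    uint64_t    len;    /* length of the mapping in bytes */
    const char *name;   /* backing object or region label */
};

/* Return the mapping that contains addr, or NULL if the address does not
 * fall inside any known mapping. */
static const struct mapping *find_mapping(const struct mapping *maps,
                                          size_t count, uint64_t addr)
{
    for (size_t i = 0; i < count; i++)
        if (addr >= maps[i].start && addr - maps[i].start < maps[i].len)
            return &maps[i];
    return NULL;
}
```

The same lookup serves both the sampled data address and the sampled instruction pointer. A ''NULL'' result is worth reporting rather than discarding, since it may indicate a mapping your tracer has not seen yet.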
- +
-<​code>​pip freeze > requirements.txt</​code>​ +
- +
-Please note that we will definitely apply penalties if the **requirements.txt** file contains packages that are not used. Always imagine that you are in a real production environment :-). You can find out more about **virtual environments** [[https://docs.python.org/3/tutorial/venv.html|here]]. +
- +
-===== 6. Objectives and Evaluation ===== +
- +
-<note important>​All ​the necessary files required ​for the prerequisites can be found and cloned from {{https://github.com/alexmircea98/temaEP|here}}.</​note>​ +
- +
-==== I. (10p) Prerequisites ==== +
- +
-=== A. (3p) Mininet machine === +
- +
-Download and import the mininet machine. Its credentials are: +
- +
-Username: mininet +
- +
-Password: mininet +
- +
-=== B. (7p) Run the topology === +
- +
-Clone the repo from above and check it by running:+
  
 <​note>​ <​note>​
-<code bash> +It is possible for memory accesses to be performed by instructions located in non-file backed regions. For example, JIT-ed JavaScript code generated by **V8** for Chromium and **SpiderMonkey** for Firefox, or **LuaJIT** for Neovim plugins or World of Warcraft addons.
-$ sudo python3 topology.py ​-+
-usage: topology.py [-h] [-t] user +
- +
-positional arguments:​ +
-  user        your moodle username +
- +
-optional arguments:​ +
-  -h, --help  show this help message ​and exit +
-  -t, --test  set it if you want to run tests +
-</​code>​+
 </​note>​ </​note>​
  
-You will find that there are 2 arguments first is your moodle username, and the second one is an optional flag to run a test. What the test flag actually does is call the function inside test.py. Because we wanted to keep the topology link metrics hidden we had to make the topology a .pyc and give you the test as a means to create automated tests for the topo. +==== Plotting ====
  
-<note> +The final implementation task is to create a **dynamic** visualization interface that can show the amount of both memory reads and writes performed live, as well as the locations being accessed and the objects performing them. Note that you must provide a **fine-grained view** of each object. For example, if you decide to implement this feature as a histogram, you will have to create //multiple// buckets for each object. So if you create a micro-benchmark that follows a linear memory access pattern in heap, your visualization tool must show how each bucket representing the heap region gets filled, one by one.
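For the histogram variant, each sampled address has to be assigned to one of the buckets of its region. A minimal sketch, assuming equally sized buckets and a hypothetical helper named ''bucket_index()'':

```c
#include <stddef.h>
#include <stdint.h>

/* Map addr, which should lie in [start, start + len), to a bucket number
 * in [0, nbuckets). Buckets have equal size, rounded up so that the whole
 * region is covered. Returns -1 if the address is outside the region or
 * if the arguments make no sense. */
static long bucket_index(uint64_t start, uint64_t len, size_t nbuckets,
                         uint64_t addr)
{
    if (len == 0 || nbuckets == 0 || addr < start || addr - start >= len)
        return -1;

    /* ceil(len / nbuckets) without risking an overflow in the addition */
    uint64_t bucket_size = len / nbuckets + (len % nbuckets != 0);

    return (long)((addr - start) / bucket_size);
}
```

With a linear access pattern over the region, consecutive samples produce non-decreasing bucket numbers, which is the "filled one by one" behaviour the plot is expected to show.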
-<code bash> +
-Example of running directly with the CLI: +
-<code bash> +
-$ sudo python3 topology.py sandu.popescu+
  
-[...]+<note tip> 
-No test run, starting cli +You are free to implement this feature in any way you desire. E.g., you can pass the data to be plotted to a Python3 script that generates a [[https://matplotlib.org/stable/users/explain/figure/interactive.html|matplotlib interactive figure]]. Or you can generate an in-process frontend using [[https://github.com/ocornut/imgui|ImGui]] or [[https://www.man7.org/linux/man-pages/man3/ncurses.3x.html|ncurses]]. Or you can write an HTTP server that can accept state updates over the network and display the plots in your browser. These are just a few ideas; feel free to utilize whatever you're most comfortable with. 
-*** Starting CLI: +</​note>​
-containernet>​ +
-</code> +
-Example of running with a call to the test function in test.py, followed by the CLI: +
-<code bash> +
-$ sudo python3 topology.py sandu.popescu ​-t+
  
-[...] +<note important>​ 
-Running base test with only one server +Small bonus available if you can limit the displayed samples to a user-specified time window. In other words, show the memory access distribution for the past **N** seconds while continuously updating the plot. Whether a sample is part of the window or not should be decided based on the time it was taken, not when you consumed it from the record ring buffer. Perf also has an option for attaching a timestamp to each record.
-Done +
-*** Starting CLI: +
-stopping h1 +
-containernet>​ +
-</​code>​+
 </​note>​ </​note>​
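The time-window filter from the bonus task can be sketched as below. It assumes that every sample carries a nanosecond timestamp taken when the sample was recorded, and that "now" is read from the same clock as those timestamps; how you guarantee the latter is up to you (the ''use_clockid'' and ''clockid'' fields of ''struct perf_event_attr'' are worth a look). ''in_window()'' and ''count_in_window()'' are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Decide whether a sample taken at sample_ns belongs to the last
 * window_s seconds, measured back from now_ns. Both timestamps are in
 * nanoseconds and must come from the same clock. */
static int in_window(uint64_t sample_ns, uint64_t now_ns, uint64_t window_s)
{
    if (sample_ns >= now_ns)   /* not older than "now": always keep */
        return 1;
    return now_ns - sample_ns <= window_s * NSEC_PER_SEC;
}

/* Count how many of the n sample timestamps fall inside the window. */
static size_t count_in_window(const uint64_t *ts, size_t n,
                              uint64_t now_ns, uint64_t window_s)
{
    size_t live = 0;
    for (size_t i = 0; i < n; i++)
        live += (size_t)in_window(ts[i], now_ns, window_s);
    return live;
}
```

The decision depends only on the timestamp stored in the sample, so a sample that sat in the ring buffer for a while before being consumed is still judged by the moment it was taken.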
  
 +==== Documentation ====
  
- +Implementation aside, your last task is to test and document your project. Your documentation should be in PDF format and describe your design choices, what tasks you found most difficult, how you solved those problems, and how you tested your tracer. Naturally, this implies adding plots generated after tracing //multiple// benchmark programs. Explain how you chose these benchmarks and what observations you could make.
- +
-<note important>​Make sure you don't misspell ​your username. :​-)</​note>​ +
- +
-**The topology script will:** +
- +
-  - Create routers, switches and hosts +
-  - Add links between each node with custom metrics  +
-  - Add routing rules +
-  - Run the test if available +
-  - And then connect the CLI +
- +
- +
-Inside the test script there is an example of usage of the api to run commands on the hosts machines in an automated manner. +
- +
-Alternatively you can run commands from any node, specifying the node and then the command. +
-For example if you want to ping r1 router from c1 host you can run the following:​ +
- +
-<code bash> +
-No test run, starting cli +
-*** Starting CLI: +
-containernet>​ c1 ping r1 +
-</​code>​ +
- +
-==== (30p) II. Evaluation - System Limits Analysis ==== +
- +
-Before implementing your own solutions to make traffic more efficient, you should first analyze the **limits of the system**. You should find out the answer to questions such as the following: +
- +
-  * How many requests can be handled by a single machine? +
-  * What is the latency of each region? +
-  * What is the server path with the smallest response time? But the slowest? +
-  * What is the path that has the greatest loss percentage?​ +
-  * What is the latency introduced by the **//first router//** in our path? +
-  * Is there any bottleneck in the topology? How would you solve this issue? +
-  * What is your estimation regarding the latency introduced?​ +
-  * What downsides do you see in the current architecture design? +
- +
-Your observations ​should be written ​in the **Performance Evaluation Report** accompanied by **relevant charts** (if applicable). +
- +
-==== (50p) III. Implementation ==== +
- +
-=== (30p) A. Solution === +
- +
-Find methods to optimize traffic. You have to come up with 3-5 methods to optimize it and test them on the **//command unit/client//** (you can write them as part of the client). Your solution should try various ways of calling the exposed endpoints of the topology depending on the number of requests your system must serve. For instance, if you only have 10 requests, you might get away by just calling a certain endpoint, but if this number increases, then you might want to try something more complex. +
- +
-The number of requests your system should serve is not imposed, but you should definitely try a sufficiently large range of request batches in order to properly evaluate your policies. Choosing a relevant number is part of the task. :-) +
- +
-<note important>​You should have **at least 3** optimization methods!</​note>​ +
- +
-== Response Object == +
- +
-Since the hosts are running an HTTP server script on the servers, ​you should expect HTTP responses or adapt your client for this type of traffic. +
- +
-=== (20p) B. Efficient Policies Comparison === +
- +
-**Compare** your **efficient policies** for a relevant range of request batch sizes and write your observations ​in the **Performance Evaluation Report** file together with some **relevant charts** +
- +
-==== (10p) IV. Documentation ==== +
- +
-You should write a high quality **Performance Evaluation Report** document which: +
- +
-  * should explain your **implementation** and **evaluation** strategies +
-  * present the **results** +
-  * can have a **maximum of 3 pages** +
-  * should be readable, easy to understand and aesthetic +
-  * on the **first page** it should contain the following:  +
-     * your name +
-     * your group number +
-     * which parts of the assignment were completed +
-     * what grade do you consider that your assignment should receive +
- +
-=== (10p) Bonus === +
- +
-**Deploy** your solution in a **Docker image** and make sure it can be run with a **runtime argument** representing the number of requests your system should serve. The container you created should be able to communicate with the **Forwarding Unit**. +
- +
-===== 7. Assignment Upload =====+
  
 <note tip> <note tip>
-The **solution archive** (.zip) should only contain:+The goal of this documentation is to convince the reader of the soundness of your design and implementation. Try to pose and answer questions such as //What guarantee do we have that the sampling is uniform? Is it possible to have a burst of localized samples followed by a period of PMC inactivity?//​ or //How did we verify that both read and write accesses have been reported, and not just one type?//.
  
-  * the **Python modules** used in the implementation (mainly test.py and client.py or whatever other source files you used) +Such issues will arise naturally as you implement the assignment so don't give them much thought beforehand. But remember to address them in the end. Also, needless to say, don't limit yourselves to these examples.
-  * a **requirements.txt** file to easily install all the necessary **pip dependencies** +</​note> ​
-  * a **Performance Evaluation Report** in the form of a **PDF** file+
  
-</​note>​+===== Grading =====
  
-The assignment ​has to be uploaded ​**[[https://​curs.upb.ro/​2021/​mod/​assign/​view.php?​id=85256|here]]** by **23:55 on December 12th 2022**.  +The deadline for this assignment ​is **11 May**. Upload a **zip archive** containing the source code, Makefile, documentation and any **micro-benchmarks** used in testing (don't go and include **redis** in your submission). The archive should be uploaded to this [[https://​curs.upb.ro/​2025/​mod/​assign/​view.php?​id=135330|moodle assignment]].
-This is a **HARD deadline**.+
  
-<note important>​ +This assignment is worth **1.5p** of your final grade. The breakdown by task is as follows: 
-Questions regarding ​the assignment ​should be addressed ​**[[https://curs.upb.ro/2022/mod/forum/discuss.php?​d=1627|here]]**. +  * **Memory access tracing (30%):** If nothing else, the application can provably monitor memory accesses by printing the relevant information to //​stdout//​. 
-</note>+  * **Mapping addresses to objects (30%):** The application ​should be able to generate statistics for both accessed data regions and code regions performing the accesses. Reads and writes must be treated separately. 
 +  * **Plotting (10%):** Live illustration of the statistics mentioned in the previous task. Be creative and include even more data if you can.
 +  * **Documentation (30%):** Adequately explains the design and implementation. Can convincingly prove that both are sound. Describes the testing methodology and presents the results in a //concise// but thorough manner. In other words: //"Someone has to read this so be considerate and don't waste their time. Improves your chances of not pissing them off."//
  
 <note important>​ <note important>​
-If the submission does not include the Report / a Readme file, the assignment ​will be graded with ZERO! +The **first pair** that submits an assignment ​that receives ​**full marks** will automatically pass the exam with maximum grade.
-</​note>​ +
-<note important>​ +
-To emphasise this, we are writing it again in bold: +
- +
-**If the submission does not include the Report / Readme file, the assignment will be graded with ZERO!**+
 </​note>​ </​note>​
  
 +===== FAQ =====
  
 +:?:
ep/teme/01.1668626297.txt.gz · Last modified: 2022/11/16 21:18 by vlad.stefanescu
CC Attribution-Share Alike 3.0 Unported