The goal of this assignment is to implement a tool based on Linux Perf Events that is able to monitor main memory accesses performed by another process.
For this assignment you will be allowed to work in pairs. Also, you will need to have an Intel CPU capable of recording MEM_INST_RETIRED events. Anything newer than Nehalem should do.
Select a partner for this assignment and submit your choice via this form.
If you can't find a partner, try advertising on the assignment forum.
Your application should be implemented in C/C++ and take the command line of the program under test as its positional arguments. For example, ./my_tracer curl http://example.com will launch the tracer, which will then fork() and exec() curl and start monitoring its memory transactions at the same time. If you need to add flags to your own application, separate them from the command line of the child process with --.
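As a rough illustration of the argument handling and process launch described above, here is a minimal sketch. The helper names child_argv_start and spawn_child are hypothetical, and the rule "everything after the first -- belongs to the child" is an assumption about how you might resolve the split, not a requirement of the assignment.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the index in argv where the child's command
 * line starts. If "--" is present, the child command starts right after the
 * first occurrence; otherwise it starts at argv[1].
 * Returns -1 if no child command was given. */
int child_argv_start(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--") == 0)
            return (i + 1 < argc) ? i + 1 : -1;
    }
    return (argc > 1) ? 1 : -1;
}

/* Hypothetical helper: fork and exec the program under test.
 * Returns the child's PID in the parent, or -1 if fork() failed. */
pid_t spawn_child(char **child_argv)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: replace this process image with the program under test. */
        execvp(child_argv[0], child_argv);
        perror("execvp");   /* only reached if exec failed */
        _exit(127);
    }
    return pid;
}
```

A tracer's main() could call spawn_child(&argv[child_argv_start(argc, argv)]) after checking for -1. Note that this sketch starts the child immediately; a real tracer would likely need some synchronization (a pipe, for instance) so that monitoring is set up before the child calls exec. It also cannot tell a tracer separator from a -- that belongs to the child's own command line when no separator is given.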
Once the child process is up and running, you will have to monitor the read and write operations separately. Specifically, you will have to determine what address has been accessed and what instruction performed this access. This can be achieved using Intel Processor Event Based Sampling (PEBS), a mode of operation that will write detailed sample information in a physical memory ring buffer whenever the event counter triggers. You will not be required to interact with this system directly; instead, you will use the sampled mode of Linux Perf Events.
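To make the sampled mode more concrete, the following sketch fills in a struct perf_event_attr that asks for the instruction pointer and the accessed data address in every sample. The helper names are hypothetical. The raw event encoding (event select in the low byte, unit mask in the next byte) and the example codes in the comments are assumptions that hold for some Intel microarchitectures only; check the event list for your own CPU before relying on them.

```c
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: build a raw config value from an Intel event select
 * and unit mask. Example (assumption, Skylake-era naming):
 * MEM_INST_RETIRED.ALL_LOADS is event 0xD0 / umask 0x81, ALL_STORES is
 * event 0xD0 / umask 0x82. Other generations use different codes. */
unsigned long long intel_raw_config(unsigned event, unsigned umask)
{
    return (unsigned long long)(event & 0xff) |
           ((unsigned long long)(umask & 0xff) << 8);
}

/* Hypothetical helper: prepare attr for precise sampling of one raw event. */
void init_sampling_attr(struct perf_event_attr *attr,
                        unsigned long long raw_config,
                        unsigned long long period)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_RAW;
    attr->config = raw_config;
    attr->sample_period = period;          /* one sample every 'period' events */
    attr->sample_type = PERF_SAMPLE_IP |   /* address of the instruction */
                        PERF_SAMPLE_TID |  /* which thread performed it */
                        PERF_SAMPLE_ADDR;  /* data address that was accessed */
    attr->precise_ip = 2;                  /* request zero skid (PEBS on Intel) */
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
    attr->disabled = 1;                    /* start stopped ... */
    attr->enable_on_exec = 1;              /* ... and enable at the child's exec */
}

/* Open the event on one process, on any CPU. Returns a file descriptor,
 * or -1 with errno set. For enable_on_exec to be useful, this has to be
 * called after fork() but before the child calls exec. */
int open_sampling_event(struct perf_event_attr *attr, pid_t pid)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, -1, -1, 0UL);
}
```

Monitoring reads and writes separately would then mean opening two such events, one configured with the load event and one with the store event, and mapping a ring buffer for each with mmap() on the returned descriptor. Whether the call succeeds also depends on the system's perf_event_paranoid setting.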
Once this task is complete, your next objective is to map both the accessed address and the instruction's address to a memory mapped object (where appropriate). For instance, you will have to be able to distinguish between a memory access performed by code belonging to libc and one performed by code belonging to libz. Additionally, you must identify whether the accessed memory address belongs to a data segment of a memory mapped object, or to the heap or stack instead. To solve this task, note that the Linux Perf system can generate more than just PMC Event Records while in sampled mode. In fact, the kernel can be configured to report any mmap() that the program under test performs. This is how perf record can embed object information into the sample file so that perf report can subsequently translate those samples into “hot” functions, even with ASLR enabled.
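One possible way to organize the address-to-object mapping is sketched below. It assumes you have already collected the reported mappings into an array (for example from mmap records, which the kernel emits when the mmap and mmap_data bits of perf_event_attr are set). The struct layout, the function names, and the convention that the heap and stack are labeled "[heap]" and "[stack]" while file-backed objects carry an absolute path are all assumptions for illustration, modeled on the names seen in /proc/<pid>/maps.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for one mapping of the traced process. */
struct mapping {
    uint64_t start;      /* first address of the mapping */
    uint64_t len;        /* length in bytes */
    char     name[256];  /* file path or a pseudo-name such as "[heap]" */
};

enum region_kind {
    REGION_UNKNOWN,  /* address not covered by any known mapping */
    REGION_OBJECT,   /* file-backed object (executable or shared library) */
    REGION_HEAP,
    REGION_STACK,
    REGION_ANON      /* any other anonymous or special mapping */
};

/* Linear search for the mapping containing addr; NULL if there is none.
 * A sorted array with binary search would scale better for many mappings. */
const struct mapping *find_mapping(const struct mapping *maps, size_t n,
                                   uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= maps[i].start && addr - maps[i].start < maps[i].len)
            return &maps[i];
    }
    return NULL;
}

/* Classify a mapping by its name (naming convention is an assumption). */
enum region_kind classify(const struct mapping *m)
{
    if (m == NULL)
        return REGION_UNKNOWN;
    if (strcmp(m->name, "[heap]") == 0)
        return REGION_HEAP;
    if (strcmp(m->name, "[stack]") == 0)
        return REGION_STACK;
    /* A single leading '/' is treated as an absolute file path. */
    if (m->name[0] == '/' && m->name[1] != '/')
        return REGION_OBJECT;
    return REGION_ANON;
}
```

Each sample would be looked up twice: once with the instruction address, to find the object performing the access, and once with the data address, to find the region being accessed. The sketch ignores mappings that are later unmapped or replaced, which a complete tracer would have to handle.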
The final implementation task is to create a dynamic visualization interface that can show the amount of both memory reads and writes performed live, as well as the locations being accessed and the objects performing them. Note that you must provide a fine-grained view of each object. For example, if you decide to implement this feature as a histogram, you will have to create multiple buckets for each object. So if you create a micro-benchmark that follows a linear memory access pattern in heap, your visualization tool must show how each bucket representing the heap region gets filled, one by one.
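If you go the histogram route, the per-object bucketing mentioned above could look roughly like this. The bucket count of 16 and the function name are arbitrary choices for the sketch, not values prescribed by the assignment.

```c
#include <stdint.h>

#define BUCKETS_PER_REGION 16   /* arbitrary choice for this sketch */

/* Map an address to a bucket inside the region [start, start + len).
 * Returns an index in [0, BUCKETS_PER_REGION), or -1 if the address
 * lies outside the region or the region is empty. */
int bucket_index(uint64_t start, uint64_t len, uint64_t addr)
{
    if (len == 0 || addr < start || addr - start >= len)
        return -1;

    /* Round the bucket size up so that the last bucket absorbs any
     * remainder and the index never reaches BUCKETS_PER_REGION. */
    uint64_t bucket_size = (len + BUCKETS_PER_REGION - 1) / BUCKETS_PER_REGION;
    return (int)((addr - start) / bucket_size);
}
```

With a linear walk over a heap region, consecutive samples would increment bucket 0 first, then bucket 1, and so on, which is the fill pattern the visualization is expected to show. Keeping separate counter arrays for reads and writes per region is one way to display both at once.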
Implementation aside, your last task is to test and document your project. Your documentation should be in PDF format and describe your design choices, which tasks you found most difficult, how you solved those problems, and how you tested your tracer. Naturally, this means including plots generated by tracing multiple benchmark programs. Explain how you chose these benchmarks and what observations you were able to make.
Such issues will arise naturally as you implement the assignment so don't give them much thought beforehand. But remember to address them in the end. Also, needless to say, don't limit yourselves to these examples.
The deadline for this assignment is 11 May. Upload a zip archive containing the source code, Makefile, documentation and any micro-benchmarks used in testing (don't go and include redis in your submission). The archive should be uploaded to this Moodle assignment.
This assignment is worth 1.5p of your final grade. The breakdown by task is as follows: