Normally, static instrumentation has a steep learning curve and requires some prior knowledge of how compilers work. In this exercise we'll take a simpler approach. Instead of performing fine-grained instrumentation on a program's Abstract Syntax Tree (AST), we'll use a built-in gcc mechanism to only instrument functions on entry and exit.
By specifying the -finstrument-functions flag when compiling a source file, gcc will insert hooks into the functions defined in that Compilation Unit (CU). As a result, the following callbacks will be invoked right after entering and right before exiting each function, respectively. The definitions are taken straight from the online documentation.
void __cyg_profile_func_enter (void *this_fn, void *call_site);
void __cyg_profile_func_exit (void *this_fn, void *call_site);
If these functions are not defined or made available through a shared object, the program execution will not be impacted at all. If they are defined within the same CU where the instrumented functions are implemented, note that you should attach the no_instrument_function attribute to them to avoid infinite recursion. If your program is written in C++ instead of C, make sure that you declare them as extern "C". Otherwise, C++ name mangling will give your definitions symbol names that differ from the unmangled ones the compiler references when it injects the calls, so your callbacks would not be the ones that get invoked.
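As an illustration of the two requirements above, here is a minimal sketch of a pair of callbacks. The counters and the output format are hypothetical and are not taken from the lab skeleton; only the two function signatures come from the gcc documentation.

```cpp
#include <cstdio>

// Hypothetical counters, only here to make the hooks observable.
static unsigned long g_enter_count = 0;
static unsigned long g_exit_count  = 0;

extern "C" {

// The attribute is attached to the declarations so that gcc does not
// instrument the hooks themselves (which would recurse forever).
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site);

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site);

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    ++g_enter_count;
    std::fprintf(stderr, "enter: %p (called from %p)\n", this_fn, call_site);
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    ++g_exit_count;
    std::fprintf(stderr, "exit : %p (returning to %p)\n", this_fn, call_site);
}

} // extern "C"
```

If the callbacks live in a separate library that is compiled without -finstrument-functions, the attribute is not strictly needed, but it does no harm.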
If you wish to learn more about static instrumentation, we recommend this blog post, which is one of the few good resources available for gcc. Although llvm is more popular for developing instrumentation / optimization passes, its API is a mess that changes every few months. gcc is more stable and does not require you to recompile the whole compiler in order to use your passes transparently. In contrast, the current pass manager in llvm can only load passes implemented as shared object plugins via opt (i.e., the llvm optimizer), which only works on llvm bitcode.
In the code skeleton for this lab, you will find an example application that performs an HTTP GET request and displays the response. With this application, we've also included an example implementation of instrumentation callbacks. Try to compile both of these and run the instrumented TCP client.
$ make
$ export LD_LIBRARY_PATH=$(realpath bin)
$ ./bin/http-get 23.192.228.80 80
First of all, notice that we've exported LD_LIBRARY_PATH. The reason for this is that the instrumentation callbacks reside in a shared object that is linked to the main executable. At runtime, ld-linux (i.e., the dynamic loader) will need a search path for it.
Also, note that we've hardcoded example.com as the Host header in the HTTP GET query, so make sure to provide an IP address that actually serves example.com. If you want to, you can patch the application source to make it more generic :p
After running the application, you should see something along these lines:
[*] src/ins/tool.cpp:35 Enter: (null) --> main
[*] src/ins/tool.cpp:35 Enter: main --> tcp_connect(char*, unsigned short)
[*] src/ins/tool.cpp:44 Exit : tcp_connect(char*, unsigned short) --> main
[*] src/ins/tool.cpp:35 Enter: main --> send_query(int, char const*, unsigned long)
[*] src/ins/tool.cpp:44 Exit : send_query(int, char const*, unsigned long) --> main
In our instrumentation callbacks, we use dladdr() to determine the symbol name of the function containing the call site / call target. This only works because we've specified the -rdynamic flag in the Makefile. This in turn passes -export-dynamic to the linker, telling it to add all symbols to the dynamic symbol table. Normally this helps when trying to obtain a backtrace from within a program. In our case, it allows us to identify the functions involved in a call-based transition. Additionally, notice how some function names also include the argument list. The reason for this is that we've also demangled the C++ symbols on your behalf.
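The lookup described above can be sketched roughly as follows. This is not the code from tool.cpp; the helper names demangle() and symbol_for() are hypothetical. It only relies on dladdr() from <dlfcn.h> and abi::__cxa_demangle() from <cxxabi.h>, and it assumes a Linux system with glibc.

```cpp
#include <cstdlib>
#include <string>
#include <cxxabi.h>
#include <dlfcn.h>

// Turn a mangled C++ symbol into a readable one. Names that do not start
// with "_Z" (e.g. "main" or plain C functions) are returned unchanged.
static std::string demangle(const char *name)
{
    if (name[0] != '_' || name[1] != 'Z')
        return name;

    int status = 0;
    char *pretty = abi::__cxa_demangle(name, nullptr, nullptr, &status);
    if (status != 0 || pretty == nullptr)
        return name;

    std::string result(pretty);
    std::free(pretty);   // the buffer returned by __cxa_demangle is malloc'ed
    return result;
}

// Resolve an address (this_fn or call_site) to the nearest dynamic symbol.
// Without -rdynamic, symbols of the main executable are usually not found.
static std::string symbol_for(void *addr)
{
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_sname == nullptr)
        return "(unknown)";
    return demangle(info.dli_sname);
}
```

For instance, demangle("_Z11tcp_connectPct") yields the tcp_connect(char*, unsigned short) form seen in the sample output. On older glibc versions you may need to link with -ldl.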
Take a moment to analyze the source code and Makefile, then move on to the next task.
Time to get your hands dirty! Modify the instrumentation callbacks in a way that will allow you to measure the time spent in each function. In other words, measure the elapsed time between entering and exiting a function.
If one of these functions contains calls to other instrumented functions, deduct their elapsed time. For example, after calculating the time that elapsed from entering until exiting main(), subtract the elapsed times of tcp_connect(), send_query() and recv_response(). For functions such as printf() that were not instrumented at compile time, there's nothing to be done.
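One possible way to organise this bookkeeping is a shadow call stack, sketched below. The ExclusiveTimer type and its members are hypothetical, the timestamps are passed in explicitly so that the arithmetic is easy to follow, and the sketch assumes a single-threaded program.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

struct ExclusiveTimer {
    struct Frame {
        void         *fn;          // function currently executing
        std::uint64_t start;       // timestamp taken on entry
        std::uint64_t child_total; // total elapsed time of instrumented callees
    };

    std::vector<Frame> stack;                             // shadow call stack
    std::unordered_map<void *, std::uint64_t> self_ticks; // time per function

    void on_enter(void *fn, std::uint64_t now)
    {
        stack.push_back({fn, now, 0});
    }

    void on_exit(std::uint64_t now)
    {
        if (stack.empty())
            return;                // unmatched exit, nothing to account for

        Frame f = stack.back();
        stack.pop_back();

        std::uint64_t elapsed = now - f.start;
        self_ticks[f.fn] += elapsed - f.child_total;   // deduct the callees

        if (!stack.empty())
            stack.back().child_total += elapsed;       // charge the caller
    }
};
```

With main() entered at tick 0 and left at tick 100, and two callees that run for 30 and 20 ticks respectively, main() ends up with 100 - 30 - 20 = 50 ticks of its own. Note that code using STL containers like this should sit in a CU compiled without -finstrument-functions, otherwise the container methods get instrumented as well.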
Add a destructor to the instrumentation callback library in which you will display these statistics when the program terminates.
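With gcc, one way to get such a destructor is the destructor function attribute. The sketch below uses a hypothetical fixed-size table (g_stats) and a hypothetical dump_stats() helper. A plain array is used on purpose: a function marked with this attribute can run after the C++ static objects of the same library have already been destroyed, so reading from a global std::map at that point would not be safe.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical per-function record; the names point to storage that
// outlives the table (string literals in this sketch).
struct FnStat {
    const char   *name;
    std::uint64_t ticks;
};

static FnStat g_stats[64];
static int    g_nstats = 0;

// Print one line per recorded function and return the number of lines.
static int dump_stats(std::FILE *out)
{
    for (int i = 0; i < g_nstats; ++i)
        std::fprintf(out, "%-48s %llu\n", g_stats[i].name,
                     static_cast<unsigned long long>(g_stats[i].ticks));
    return g_nstats;
}

// Runs when the library is unloaded or the program terminates normally.
__attribute__((destructor))
static void tool_fini()
{
    dump_stats(stderr);
}
```

The destructor does not run if the process is killed by a signal or calls _exit(), so statistics are lost in those cases.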
To take the timestamps, you can use the helper provided in util.h. This is a wrapper over the RDTSC instruction and it's the most efficient method of calculating the elapsed time. You can find a usage example here. Keep in mind that your CPU increments this timestamp counter a fixed number of times per second. On most modern CPUs, this frequency matches the base frequency of the processor. You can find it expressed in kHz in /sys/devices/system/cpu/cpu0/cpufreq/base_frequency.
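Since the counter frequency is given in kHz, converting a tick difference into time is a small piece of arithmetic. The helpers below (read_khz() and ticks_to_ns()) are hypothetical names and are not part of util.h; the tick values themselves would come from the RDTSC wrapper.

```cpp
#include <cstdint>
#include <fstream>

// Read a frequency in kHz from a sysfs file such as
// /sys/devices/system/cpu/cpu0/cpufreq/base_frequency.
// Returns 0 if the file is missing or cannot be parsed.
static std::uint64_t read_khz(const char *path)
{
    std::ifstream in(path);
    std::uint64_t khz = 0;
    if (!(in >> khz))
        return 0;
    return khz;
}

// Convert a number of ticks into nanoseconds.
// f kHz means f ticks per millisecond, i.e. f ticks per 1,000,000 ns.
static std::uint64_t ticks_to_ns(std::uint64_t ticks, std::uint64_t khz)
{
    if (khz == 0)
        return 0;   // unknown frequency, no meaningful conversion

    // Split the division to avoid overflowing 64 bits on long runs.
    return (ticks / khz) * 1000000ULL + (ticks % khz) * 1000000ULL / khz;
}
```

For example, at 3,000,000 kHz (3 GHz), a difference of 1,500 ticks corresponds to 500 ns. The sysfs file is not available on every system (for instance, in many VMs), so the zero return value has to be handled.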
If you are working inside a VM or simply don't want to use rdtsc, use clock_gettime() instead. Choose a monotonic timer that fits your needs. Check each timer's resolution and explain why it matters.
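A minimal sketch of the clock_gettime() alternative, together with clock_getres() for querying a timer's resolution, could look like this. The helper names are hypothetical; the clock IDs you can pass depend on your system (CLOCK_MONOTONIC is defined by POSIX, while others such as CLOCK_MONOTONIC_RAW or CLOCK_MONOTONIC_COARSE are Linux-specific).

```cpp
#include <cstdint>
#include <time.h>

// Flatten a timespec into a single nanosecond count.
static std::uint64_t to_ns(const timespec &ts)
{
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ULL
         + static_cast<std::uint64_t>(ts.tv_nsec);
}

// Current value of the given clock, in nanoseconds (0 on error).
static std::uint64_t now_ns(clockid_t clk)
{
    timespec ts{};
    if (clock_gettime(clk, &ts) != 0)
        return 0;
    return to_ns(ts);
}

// Resolution of the given clock, in nanoseconds (0 on error).
static std::uint64_t resolution_ns(clockid_t clk)
{
    timespec ts{};
    if (clock_getres(clk, &ts) != 0)
        return 0;
    return to_ns(ts);
}
```

Comparing resolution_ns() across the candidate clocks gives you a lower bound on the durations you can meaningfully measure: a function that runs for less than one clock step may show up with an elapsed time of zero.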