Performance monitoring is the process of regularly checking a set of metrics and tracking the overall health of a specific system. Monitoring is tightly coupled with performance tuning, and a Linux system administrator should be proficient in both, as one of their main responsibilities is to identify bottlenecks and find solutions to overcome them. Pinpointing a bottleneck on a Linux system requires a deep understanding of how the various components of the operating system work (e.g. how processes are scheduled on the CPU, how memory is managed, how I/O interrupts are handled, the details of the network layer implementation, etc.). From a high level, the main subsystems to think of when tuning are CPU, Memory, I/O and Network.
These four subsystems depend heavily on each other, and tuning the whole system means keeping them in harmony. To quote a famous idiom, “a chain is no stronger than its weakest link”. Thus, when investigating a system performance issue, all the subsystems must be checked and analysed.
Being able to discover the bottleneck in a system also requires an understanding of what types of processes are running on it. The application stack of a system can be broken down into two categories:
Before going further with the CPU-specific metrics and tools, here is a methodical approach that can guide you when tuning the performance of a system:
Before looking at the numerous performance measurement tools present in the Linux operating system, it is important to understand some key concepts and metrics, along with their interpretation regarding the performance of the system.
The kernel contains a scheduler that is in charge of scheduling two types of resources: interrupts and threads. The scheduler assigns different priorities to these resources. The following list presents the priorities:
While executing a process, the necessary set of data is stored in registers on the processor and in its cache. This group of information is called a context. Each thread is allotted a time quantum to spend on the CPU; when that time expires, or the thread is preempted by a higher-priority task, a new ready-to-run process is scheduled. When the next process is scheduled to run, the context of the current one is saved and the context of the new one is restored to the registers; this operation is called a context switch. A high volume of context switching is undesirable because the CPU has to flush its registers and cache each time to make room for the new process, which leads to performance issues.
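As a concrete illustration, the kernel exposes the total number of context switches since boot in the ctxt line of /proc/stat. The following minimal Python sketch (an illustration for this lab, not one of the standard monitoring tools; the one-second interval and helper name are arbitrary choices) samples the counter twice to estimate the context-switch rate:

```python
import time

def read_context_switches():
    """Return the total number of context switches since boot,
    taken from the 'ctxt' line of /proc/stat."""
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("ctxt"):
                return int(line.split()[1])
    return 0

# Sample the counter twice, one second apart, to estimate the rate.
before = read_context_switches()
time.sleep(1)
after = read_context_switches()
print(f"context switches per second: {after - before}")
```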
Each CPU maintains its own run queue of threads. In an ideal scenario, the scheduler would be constantly executing threads. A thread can be in different states: runnable, meaning it is ready to be executed, or sleeping, meaning it is blocked while waiting for I/O. If the system has performance issues or is overloaded, the run queue starts to fill up and each thread takes longer to execute.
The same concept is also known as “load”. It is measured by the load average, a rolling average of the sum of the processes waiting to run and the processes in uninterruptible sleep (typically waiting for I/O). Unix systems traditionally present the load average as 1-minute, 5-minute and 15-minute averages.
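As a small illustration of where these numbers come from, Python can read them through os.getloadavg() or directly from /proc/loadavg, whose fourth field also shows the number of currently runnable entities over the total number of scheduling entities. The snippet below is a sketch for this lab, not part of the monitoring tools discussed later:

```python
import os

# 1-, 5- and 15-minute load averages, as reported by the kernel.
one, five, fifteen = os.getloadavg()
print(f"load average: {one:.2f} {five:.2f} {fifteen:.2f}")

# /proc/loadavg exposes the same averages, plus the number of currently
# runnable entities over the total number of scheduling entities.
with open("/proc/loadavg") as f:
    fields = f.read().split()
runnable, total = fields[3].split("/")
print(f"runnable: {runnable} out of {total} scheduling entities")
```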
CPU utilisation is a meaningful metric for observing how the running processes make use of the given processing resources. You can find the following categories in the vast majority of performance monitoring tools:
Linux distributions have various monitoring tools available. Some of the utilities cover several metrics in a single tool and provide well-formatted output that eases the understanding of system performance. Other tools specialize in more specific metrics and give us detailed information.
Some of the most important Linux CPU performance monitoring tools:
| Tool | Most useful function |
|---|---|
| vmstat | System activity |
| top | Process activity |
| uptime, w | Average system load |
| ps, pstree | Displays the processes |
| iostat | Average CPU load |
| sar | Collect and report system activity |
| mpstat | Multiprocessor usage |
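To make the utilisation categories above concrete, here is a minimal Python sketch (an illustration for this lab, not a replacement for the tools in the table) that derives the percentage of time spent in user, system, idle and I/O-wait from two samples of the aggregate cpu line in /proc/stat:

```python
import time

def cpu_times():
    """Read the aggregate 'cpu' line from /proc/stat.
    Fields: user, nice, system, idle, iowait, irq, softirq, steal, ..."""
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]
    return [int(x) for x in fields]

# Take two samples one second apart and look at the deltas.
t1 = cpu_times()
time.sleep(1)
t2 = cpu_times()
delta = [b - a for a, b in zip(t1, t2)]
total = sum(delta)

user, nice, system, idle, iowait = delta[:5]
print(f"user:   {100 * (user + nice) / total:.1f}%")
print(f"system: {100 * system / total:.1f}%")
print(f"idle:   {100 * idle / total:.1f}%")
print(f"iowait: {100 * iowait / total:.1f}%")
```

Taking the difference between two samples is necessary because /proc/stat reports cumulative tick counts since boot, not instantaneous percentages.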
Understanding how well a CPU is performing is a matter of interpreting the run queue, its utilisation, and the amount of context switching performed. Although performance should be judged against baseline statistics, in the absence of such statistics the following general performance expectations of a system can be used as a guideline:
The following two examples give interpretations of the outputs generated by vmstat.
The following observations can be made based on this output:
The following observations can be made based on this output:
These examples are from Darren Hoch’s Linux System and Performance Monitoring.
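The original vmstat outputs are not reproduced here, but you can generate similar data on your own machine. The sketch below is an assumption of this lab (vmstat must be installed and the column layout can vary slightly between versions); it captures a few samples and prints the columns most relevant to this section, namely the run queue (r), context switches (cs) and the CPU utilisation breakdown (us, sy, id, wa):

```python
import subprocess

# Run vmstat for 5 one-second samples (the first sample reports
# averages since boot).
out = subprocess.run(["vmstat", "1", "5"],
                     capture_output=True, text=True).stdout
lines = out.splitlines()

# Line 0 is the group banner, line 1 holds the column names.
columns = lines[1].split()
for line in lines[2:]:
    row = dict(zip(columns, line.split()))
    print(f"r={row['r']}  cs={row['cs']}  "
          f"us={row['us']}  sy={row['sy']}  id={row['id']}  wa={row['wa']}")
```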
For this lab, we will use Google Colab for exploring numpy and matplotlib. Please solve your tasks here by clicking “Open in Colaboratory”.
You can then export this python notebook as a PDF (File → Print) and upload it to Moodle.
Please take a minute to fill in the feedback form for this lab.