====== Tutorial 02 ======

The material for this tutorial was taken from Darren Hoch’s “Linux System and Performance Monitoring”. You can access it at: http://ufsdump.org/papers/oscon2009-linux-monitoring.pdf.

To calculate the rotational delay (RD) of a 10K RPM disk, perform the following steps:

  - Divide 10000 RPM by 60 seconds (10000/60 = 166 RPS)
  - Convert 1 of 166 to decimal (1/166 = 0.006 seconds per rotation)
  - Multiply the seconds per rotation by 1000 milliseconds (6 MS per rotation)
  - Divide the total in half (6/2 = 3 MS) (RD is considered half a revolution around a disk)
  - Add an average of 3 MS for seek time (3 MS + 3 MS = 6 MS)
  - Add 2 MS for latency (internal transfer) (6 MS + 2 MS = 8 MS)
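
The same arithmetic can be written as a short Python sketch (the 166 RPS, 3 MS seek and 2 MS internal-transfer figures are the averages used in the steps above):

<code python>
# Rough average service time for one random I/O on a 10K RPM disk,
# following the steps above.
rpm = 10000
rotations_per_second = rpm / 60                        # ~166 rotations per second
ms_per_rotation = (1 / rotations_per_second) * 1000    # ~6 MS per full rotation

rotational_delay = ms_per_rotation / 2   # RD: on average, half a revolution (~3 MS)
seek_time = 3                            # average seek time assumed above (MS)
internal_latency = 2                     # internal transfer latency (MS)

service_time = rotational_delay + seek_time + internal_latency
print(f"Average I/O service time: ~{service_time:.0f} MS")   # ~8 MS
</code>
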
Each time an application issues an I/O, it takes an average of 8 MS to service that I/O on a 10K RPM disk. Since this is a fixed cost, it is imperative that the disk be as efficient as possible with the time it spends reading and writing. The number of I/O requests is often measured in I/Os Per Second (IOPS). The 10K RPM disk has the ability to push 120 to 150 (burst) IOPS. To measure the effectiveness of each I/O, divide the amount of data read or written per second by the number of IOPS; this gives the average amount of data transferred per I/O.
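
The quoted 120 to 150 IOPS figure is roughly what a fixed 8 MS service time allows; a quick sanity check:

<code python>
# With a fixed cost of ~8 MS per random I/O, the sustained IOPS ceiling is roughly:
service_time_ms = 8
print(f"~{1000 / service_time_ms:.0f} IOPS")   # ~125, inside the 120-150 range quoted above
</code>
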
**Go through [[02#exercices|Ex00]]**

==== Random vs Sequential I/O ====

Sequential I/O - The **iostat** command provides information on IOPS and the amount of data processed during each I/O. Use the **-x** switch with **iostat** (//iostat -x 1//). Sequential workloads require large amounts of data to be read sequentially and at once. These include applications such as enterprise databases executing large queries and streaming media services capturing data. With sequential workloads, the KB per I/O ratio should be high. Sequential workload performance relies on the ability to move large amounts of data as fast as possible. If each I/O costs time, it is imperative to get as much data out of that I/O as possible.

**Go through [[02#exercices|Ex01]]**

Random I/O - Random access workloads do not depend as much on the size of the data. They depend primarily on the amount of IOPS a disk can push. Web and mail servers are examples of random access workloads. The I/O requests are rather small. Random access workloads rely on how many requests can be processed at once. Therefore, the amount of IOPS the disk can push becomes crucial.

To see the effect that swapping to disk is having on the system, check the swap partition on the drive using **iostat**.

{{ :ep:laboratoare:ep2_poz2.png?650 |}}

Both the swap device (///dev/sda1//) and the file system device (///dev/sda3//) are contending for I/O. Both have high numbers of write requests per second (//w/s//) and high wait times (//await//) relative to low service times (//svctm//). This indicates that there is contention between the two partitions, causing both to underperform.

===== 03 Introducing Network Monitoring =====

Out of all the subsystems to monitor, networking is the hardest to monitor. This is due primarily to the fact that the network is abstract. There are many factors that are beyond a system’s control when it comes to monitoring and performance. These factors include latency, collisions, congestion and packet corruption to name a few.

This section focuses on how to check the performance of Ethernet, IP and TCP.

==== Ethernet Configuration Settings ====

Unless explicitly changed, all Ethernet networks auto negotiate their speed. The benefit of this is largely historical, dating from when a single network could contain devices running at different speeds and duplexes.

Most enterprise Ethernet networks run at either 100 or 1000BaseTX. Use **ethtool** to ensure that a specific system is synced at this speed.

In the following example, a system with a 100BaseTX card is running auto negotiated in 10BaseT.

{{ :ep:laboratoare:ep2_poz4.png?450 |}}

The following command can be used to force the card into 100BaseTX: //# ethtool -s eth0 speed 100 duplex full autoneg off//.

==== Monitoring Network Throughput ====

It is impossible to control or tune the switches, wires, and routers that sit in between two host systems. The best way to test network throughput is to send traffic between two systems and measure statistics like latency and speed.

=== Using iptraf for Local Throughput ===

The **iptraf** utility (http://iptraf.seul.org) provides a dashboard of throughput per Ethernet interface. (Use: //# iptraf -d eth0//)

=== Using netperf for Endpoint Throughput ===

Unlike **iptraf**, which passively monitors traffic, the **netperf** utility enables a system administrator to perform controlled tests of network throughput. This is extremely helpful in determining the throughput from a client workstation to a heavily utilised server such as a file or web server. The **netperf** utility runs in a client/server mode.

To perform a basic controlled throughput test, the **netperf** server must be running on the server system (//server# netserver//).

There are multiple tests that the **netperf** utility may perform. The most basic test is a standard throughput test. The following test, initiated from the client, performs a 30 second test of TCP based throughput on a LAN. The output shows that the throughput on the network is around 89 mbps. The server (192.168.1.215) is on the same LAN. This is exceptional performance for a 100 mbps network.

{{ :ep:laboratoare:ep2_poz5.png?430 |}}

Another useful test using **netperf** is to monitor the number of TCP request and response transactions taking place per second. The test accomplishes this by creating a single TCP connection and then sending multiple request/response sequences over that connection (ack packets back and forth with a byte size of 1). This behavior is similar to applications such as an RDBMS executing multiple transactions or a mail server piping multiple messages over one connection.

The following example simulates TCP request/response over a duration of 30 seconds.

{{ :ep:laboratoare:ep2_poz6.png?450 |}}

In the previous output, the network supported a transaction rate of 4453 psh/ack per second using 1 byte payloads. This is somewhat unrealistic, because most requests, especially responses, are greater than 1 byte.

In a more realistic example, **netperf** uses a default size of 2K for requests and 32K for responses.

{{ :ep:laboratoare:ep2_poz7.png?470 |}}

The transaction rate drops significantly, to 222 transactions per second.
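
To put the two transaction rates in perspective, a quick back-of-the-envelope Python calculation (assuming payloads of exactly 1 byte, 2048 bytes and 32768 bytes, as in the tests above, and ignoring TCP/IP header overhead) shows how much payload each test actually moves:

<code python>
# Approximate payload throughput implied by the two request/response tests above.
tiny = 4453 * (1 + 1)          # 1-byte request + 1-byte response, per second
bulk = 222 * (2048 + 32768)    # 2K request + 32K response, per second

print(f"1-byte test:  ~{tiny / 1e3:.1f} KB/s of payload")        # ~8.9 KB/s
print(f"2K/32K test:  ~{bulk * 8 / 1e6:.1f} Mbit/s of payload")  # ~62 Mbit/s
</code>

The 1-byte test therefore mostly measures per-transaction overhead and latency, while the 2K/32K test moves a payload volume that accounts for a large share of the roughly 89 mbps seen in the earlier bulk throughput test.
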
=== Using iperf to Measure Network Efficiency ===

The **iperf** tool is similar to the **netperf** tool in that it checks connections between two endpoints. The difference with **iperf** is that it has more in-depth checks around TCP/UDP efficiency such as window sizes and QoS settings. The tool is designed for administrators who specifically want to tune TCP/IP stacks and then test the effectiveness of those stacks. The **iperf** tool is a single binary that can run in either server or client mode. The tool runs on port 5001 by default. In addition to TCP tests, **iperf** also has UDP tests to measure packet loss and jitter.

==== Individual Connections with tcptrace ====

The **tcptrace** utility provides detailed TCP based information about specific connections. The utility uses **libpcap** based files to perform an analysis of specific TCP sessions. The utility provides information that is at times difficult to catch in a TCP stream. This information includes:
  * TCP Retransmissions – the number of packets that needed to be sent again and the total data size
  * TCP Window Sizes – identify slow connections with small window sizes
  * Total throughput of the connection
  * Connection duration

For more information refer to pages 34-37 of Darren Hoch’s “Linux System and Performance Monitoring” - http://ufsdump.org/papers/oscon2009-linux-monitoring.pdf.

==== Conclusion ====

Takeaways for network performance monitoring:
  * Check to make sure all Ethernet interfaces are running at proper rates.
  * Check total throughput per network interface and be sure it is in line with network speeds.
  * Monitor network traffic types to ensure that the appropriate traffic has precedence on the system.

  * Run //iostat -x 1 5//
  * Considering the last two outputs provided by the previous command, calculate the efficiency of IOPS for each of them. Does the amount of data written per I/O increase or decrease?
Hint
  * Divide the kilobytes read (//rkB/s//) and written (//wkB/s//) per second by the reads per second (//r/s//) and the writes per second (//w/s//).
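
A minimal sketch of that division in Python, using made-up //iostat// values (substitute the //r/s//, //w/s//, //rkB/s// and //wkB/s// columns from your own output):

<code python>
# Hypothetical values read off one line of iostat -x output; replace with your own.
r_per_s, w_per_s = 0.0, 105.0          # r/s, w/s
rkB_per_s, wkB_per_s = 0.0, 42080.0    # rkB/s, wkB/s

# Average KB moved per I/O (guarding against a device that did no reads or writes).
read_kb_per_io = rkB_per_s / r_per_s if r_per_s else 0.0
write_kb_per_io = wkB_per_s / w_per_s if w_per_s else 0.0

print(f"read:  {read_kb_per_io:.1f} KB per I/O")
print(f"write: {write_kb_per_io:.1f} KB per I/O")   # ~400 KB per write with these numbers
</code>
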
==== Ex02 ====

  * Split the file into smaller chunks that would fit in memory (e.g. 4GB).
  * Use a classical sort algorithm for sorting these chunks.
  * Merge the sorted chunks two by two. Keep in memory only two numbers at a time (one from each chunk): starting from the beginning of each chunk, compare the two numbers, write the smaller one to the merged file, and then read the next number from the chunk whose number was just written (see the sketch after this list).
  * Repeat the last step until you obtain the original file sorted.
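
One way the chunk-and-merge approach above could look in Python is sketched below. The one-number-per-line file format, the chunk size and all file names are illustrative assumptions rather than part of the exercise statement:

<code python>
# Sketch of the external sort described above: split the input into chunks that
# fit in memory, sort each chunk with a normal in-memory sort, then repeatedly
# merge the sorted chunk files two by two until a single sorted file remains.
import os

CHUNK_LINES = 1_000_000   # how many numbers we are willing to hold in memory at once


def write_chunk(numbers, idx):
    """Write one sorted chunk to its own file and return the file name."""
    name = f"chunk_{idx:06d}.txt"
    with open(name, "w") as out:
        out.write("\n".join(map(str, numbers)) + "\n")
    return name


def split_and_sort(path):
    """Split 'path' into sorted chunk files; only one chunk is in memory at a time."""
    chunks, buf = [], []
    with open(path) as f:
        for line in f:
            buf.append(int(line))
            if len(buf) == CHUNK_LINES:
                chunks.append(write_chunk(sorted(buf), len(chunks)))
                buf = []
    if buf:
        chunks.append(write_chunk(sorted(buf), len(chunks)))
    return chunks


def merge_two(a, b, out_name):
    """Merge two sorted files, holding only one number from each file in memory."""
    with open(a) as fa, open(b) as fb, open(out_name, "w") as out:
        x, y = fa.readline(), fb.readline()
        while x and y:
            if int(x) <= int(y):
                out.write(x)
                x = fa.readline()
            else:
                out.write(y)
                y = fb.readline()
        # One input is exhausted; copy whatever is left of the other.
        for rest, fh in ((x, fa), (y, fb)):
            while rest:
                out.write(rest)
                rest = fh.readline()
    os.remove(a)
    os.remove(b)
    return out_name


def external_sort(path, result="sorted.txt"):
    chunks = split_and_sort(path)
    if not chunks:                           # empty input file: nothing to sort
        open(result, "w").close()
        return result
    round_no = 0
    while len(chunks) > 1:                   # merge pairwise until one file remains
        merged = []
        for i in range(0, len(chunks) - 1, 2):
            merged.append(merge_two(chunks[i], chunks[i + 1],
                                    f"merge_{round_no}_{i // 2}.txt"))
        if len(chunks) % 2:                  # an odd chunk out is carried to the next round
            merged.append(chunks[-1])
        chunks = merged
        round_no += 1
    os.rename(chunks[0], result)
    return result
</code>

Merging the chunks pairwise takes several passes over the data; a k-way merge (for example, with a heap holding one number per chunk) would reduce this to a single merging pass, at the cost of keeping one number from every chunk in memory at the same time.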