This is an old revision of the document!


Lab 08 - I/O Monitoring (Windows)

Since computers started to surface, for many people it was a mystery what was happening behind the screen and it seemed magical when it wasn't working and even more magical when it was working. Since Linux is open-source, all sorts of tools appeared over time to analyse problems when they came up. On Windows on the other hand, the system being closed made it harder for tools to appear.

The first tools were provided by Sysinternals. These were written by Mike Rusinovich, who chose to make public tools such as File monitor and Registry monitor, which were later combined into Process monitor. The tools were so good that even Microsoft's support teams were using them. Seeing their usefulness and appreciating the know-how of their operating system, Microsoft decided to buy Sysinternals, so now the original website www.sysinternals.com redirects to https://technet.microsoft.com/en-us/sysinternals (outside Romania it probably redirects to a different link due to localization reasons that consider the language of the country where redirection is made).

On this website can be found some of the tools that will be used in this tutorial - Process Monitor, Process Explorer, VMMap, Autoruns.

Starting with Windows 7, Microsoft has begun to invest more and more in the performance of the system and in ways to monitor the system's performance. Some tools already existed since Windows 2000, but they were only used internally.

From here on out the tutorial will present the Microsoft tools.

01 Task Manager

Task Manager is present on windows systems since they had a graphical interface, which is for more than 20 years. However, this tool evolved significantly starting with Windows 8.1, providing more meaningful information. The tutorial is presenting the Task manager tool from Windows 10, since it represents the most evolved version.

To start Task manager use the shortcut: Ctrl + Shift + Esc.

The tools has several tabs:

Processes Tab

Shows all the running processes and their current resource usage in terms of CPU, Memory, Disk and Network.

Performance Tab

Shows the usage level of the computer's main resources in the last 60 seconds.

App history Tab

This tab was first added to Windows 8, and it shows the resource consumption of metro applications. Metro applications are touch-screen-friendly applications written especially for Microsoft's WinRT programming interfaces.

Start-up Tab

This tab shows all the applications that start at start-up, (or at least in Microsoft's vision - this will be further detailed in the Autoruns section), and their impact on the boot time. It is helpful to check this tab in case your computer takes a long to to start up.

Users Tab

Shows the resource consumption of every logged in user. The screenshot below shows that there is only one user logged in.

Details Tab

This tab shows details for each process - pid, status, the user under which it runs. Right-clicking the column headers bar, offers the possibility to add or remove columns. In the screenshot presented below the following columns were added: Handles, Threads, Image Path Name and Command Line. These new columns are very useful: the first one (Handles) when investigating a handle leak, the second one (Threads) in the case of investigating processes that create too many threads, the third one (Image Path Name) to find out the path from where the process was started, and the last one (Command Line) to find out the parameters with which it was started.

Services Tab

Shows the service status. A Windows service can be considered similar to a Linux daemon: a process without a visual interface, offering services to user-created processes. For more details you can can have a look here: https://en.wikipedia.org/wiki/Windows_service.

So far, Task Manager can be used to identify which process uses a lot of RAM, CPU, accesses the disk many times or generates a lot of traffic on the network at a certain moment (Services tab). However, it does not offer information if in the long run, that same process is the one that generated the slowdown of the system. It does offer some information for longer periods of time, in the Start-up tab, which shows what process had higher impact at start-up, but does not specify the area that was impacted (disk space, RAM, CPU).

This led Microsoft to develop Windows Assessment and Deployment Kit (Windows ADK) https://www.microsoft.com/en-US/download/details.aspx?id=39982 You should download and install at least the Windows Performance Toolkit module. This module includes:

  • Windows Performance Recorder
  • Windows Performance Analyzer
  • Xperf

The part of the tutorial is focusing on how to determine the sources that cause the following problems: intensive disk usage, intensive RAM usage, intensive CPU usage, and intensive networking.

02 I/O Monitoring

This second part of the tutorial looks into analysing performance issues due to intensive disk use.

Windows Performance Recorder

Installing Windows ADK will install Windows Performance Recorder. Check by clicking the windows button and typing “windows performance recorder”.

Start Windows Performance Recorder by pressing Enter.

Click the More options button to get the list shown in the screenshot right below.

Make sure that you select the same check boxes as in the screenshot, but do not click start just yet. Create a new directory and copy the Logs.exe and GoodLog.exe files (from the resources attached to this tutorial that can be found at the end of this page) into this directory. The behaviour of these two executables is similar to logging applications that write logs to the disk. Open a terminal and change the path to the directory where you copied the files.

Start Windows Performance Recorder and right after run GoodLog.exe and then Logs.exe. Once the two applications finish running, click the Save button in Windows Performance Recorder.

After the capture is saved, the Open option will become available in Windows Performance Analyzer. When clicking the Open button it should open a window such as the one below.

Double clicking on Storage should display the following window.

In the upper-left corner of the newly opened window it can select Disk Usage, Utilization by Disk. Click on Utilization by Disk and select: Utilization by Process, Path Name, and Stack. This will generate the following output.

The graph looks interesting. Processes can be selected for observing their activity on the disk. It can be noticed that our processes are not shown. Run Logs.exe again while keeping Task Manager on.

This shows that there is activity on the disk. The question is why doesn't Windows Performance Analyzer show it. The way Windows Performance Recorder records activity is based on events generated by the Windows kernel. It registers to track the events, listens to them, and during the recording period it constantly samples which process uses which resource at the time of sampling. It sums up the number of time that a process was caught doing something. In our case, the two processes want to write to the disk, but they are not the ones that get to do the actual writing. They tell the system that they want to write, and the System process schedules the writing. The reason for this is targeting a more efficient disk writing, as the System process is trying to minimise the impact to the disk. This is why our process's writing is passed over to the System process.

This being the case, we are introducing a new tool that was mentioned at the beginning of the tutorial: Process Monitor.

Process Monitor

Process Monitor can be downloaded from:

https://technet.microsoft.com/en-us/sysinternals/processmonitor.aspx

Before running Process Monitor, which is a small tool without an installer, it is preferable to have the latest dbghelp.dll library. Windows 10 already has the latest version. If you are using another Windows, you can get the latest version by installing:

https://developer.microsoft.com/en-us/windows/downloads/windows-8-1-sdk

In the installation process choose to install at least the following:

After the installation is complete, a newer version of dbghelp.dll can be found in C:\Program Files(x86)\WindowsKits\10\Debuggers. Keep this in mind as it will be useful later in the tutorial.

Looking at the logs created by the two apps - bad.log, good.log - they are identical, but Logs.exe has a significantly longer running time compared to GoodLog.exe. Start Process Monitor.

If the 4 buttons in the black area on the upper part of the window are selected, Process Monitor will display the activity (in this order) for: registry, files, networking, process and thread activity. By unchecking them, the corresponding events will be no longer displayed.

In the menu bar there is the Filter field. If selected, it will trigger a dropdown menu that contains another Filter field. If this second Filter field is selected, it will open the window shown below. Replicate this on your computer.

From the two dropdown menus in the upper part of the context window, select ”Process Name” instead of ”Architecture” and ”is” instead of ”contains”. In the text filed add Logs.exe, click the Add button and then the OK button.

Open the terminal and run Logs.exe. After the program is done running, save the Process Monitor capture.

Use Ctrl + X to reset all the events captured in Process Monitor. Go to FilterFilter area, double-click on the filter that was just added and change Logs.exe with GoodLog.exe, then click Add and Ok.

Start GoodLog.exe and save the capture once the program finishes running.

Scroll down in the two capture-logs until you notice the activity for bad.log respectively good.log.

Notice the difference. On the left-hand side it is shown the faster logging process, and on the right-hand side the slower one. Look in the red highlighted area to see the difference. On the left-hand side the logging file is opened, followed by continuous writing, while on the right-hand side the file is opened and closed for every writing operation which explains the significant slowdown.

To recap, Task Manager shows what processes use the disk intensively at the current time, Windows Performance Recorder / Windows Performance Analyzer show who used the disc during a longer time period, although they were showing the activity as belonging to the System process instead of our process. Using Process Monitor we could identify our processes' entire activity and we could determine why one is slower than the other. But what if we could find out which line in the code causes the problem?

Go back to Process Monitor. Use the window of the badly written logging program (Logs.exe). Go to OptionsConfigure Symbols, which will open the window shown below.

If you have an older version of dbghelp.dll in C:\windows\system32/dbghelp.dll compared to the one that you have downloaded in C:\ProgramFiles(x86)\WindowsKits\10\Debuggers, then change the path to point to the newer version. Otherwise it makes no difference.

In the ”Symbol paths” text box there is a path like this:

srv*https://msdl.microsoft.com/download/symbols

This is the path to the Microsoft servers which are used to load the Microsoft binaries. Let's explain - any application that runs on a machine, besides its executable and its libraries, it loads the Microsoft core libraries such as kernel32.dll, ntdll.dll and others. To rebuild the call stack for a specific event (similar to crash dumps), Process Monitor needs pdbs created during the compilation of the binaries, as they are a database containing the memory address where each loaded function can be found. In Microsoft's case, these symbols can be used from the mentioned path, but in our case they can used from where we have the binaries as there we should also have pdbs (e.g. d:\Logs\), so change the path to:

srv*https://msdl.microsoft.com/download/symbols;d:\Logs\

The same directory containing the binary also contains the source files, so for the ”Source code paths” add the path to the .sln of the two projects (in my case d:\Logs). After this you are ready to click Ok.

In the log (D:\Logs\bad.log) go to CreateFile. Double-click to open the Event Properties window. Choose the Stack tab, scroll down and you can notice that in the main function of main_bad_log.cpp, at line 12 the opening takes place. Click the ”Source” button to view the source code containing the issue.

In “Task.7z” you have another example of two executables: good.exe and bad.exe. They both have the same outcome, the only difference being their running time (one of them is significantly slower). Identify the problem.

So far you have seen how to monitor the disk activity, how to identify who is generating it and how to figure out the issue by looking at the pdbs and the code.

Let's consider a new problem: investigating a handle leak. A handle leak consists of a process that opens files and does not close them. On modern computers if this action is performed millions of times, the system may become unresponsive and will either experience an overall slowdown or the application that causes this will eventually crash. You may think that millions of handles are impossible to reach, so it is not worth paying attention to this problem. However, imagine that there are services running on servers for years. As an example, having a handle leak every 2 seconds amounts for over 10 million handle leaks in a year. How should such problems be investigated?

Open up a terminal and run HandleLeak.exe. Check out the ”Details” tab in Task Manager after adding the ”Handles” column.

It can be noticed that the number of handles keeps growing. This is clearly a problem, but how do we investigate it? We will consider a new tool, Process Explorer.

Process Explorer

https://technet.microsoft.com/en-us/sysinternals/processexplorer.aspx

Run it as administrator. It is similar to Task Manager. Select the process that you are interested in, namely HandleLeak, and press Ctrl + H. The output should be similar to the one in the screenshot below.

Ctrl + H opens a window under the ”Process” section that displays all open handles along with information about them. Thus it will display file handles, registry handles, threads handles, and so on. There is another view (Ctrl + D) that displays all the loaded dlls.

So it can be noticed that the leaks are on the following file: D:\Logs\HandleLeak\leak.txt. This is very useful information, but it would be better to find out who is responsible for the leak in code. Run Process Monitor with a filter on HandleLeak.exe and to notice the stack where the leakage is happens.

Feedback

Please take a minute to fill in the feedback form for this lab.

Resources

ep/labs/08.1538820611.txt.gz · Last modified: 2018/10/06 13:10 by emilian.radoi
CC Attribution-Share Alike 3.0 Unported
www.chimeric.de Valid CSS Driven by DokuWiki do yourself a favour and use a real browser - get firefox!! Recent changes RSS feed Valid XHTML 1.0