Applications with great performance are a hallmark of C++ programming, and one of the best tools for achieving great application performance is a code profiler. Take a look at the Visual Studio Team System (VSTS) profiler, and see how it can be used to find code bottlenecks and improve performance.
The Visual Studio Team System (VSTS) Profiler ships as part of the Developer Edition product, and is now in its second version with the release of VSTS Team System 2008. The initial release of the VSTS Profiler was the first profiler that shipped as part of the Visual Studio/VSTS product family since the Visual C++ 6 profiler, and although the profiler had good fundamentals, the VSTS 2005 release was very much a Version 1 product, and could be difficult to use for some profiling tasks. With VSTS 2008, the Profiler is much improved, and well worth considering as the first-choice profiler in a developer's toolkit.
Many commercial profilers operate in a single mode in which the amount of time spent executing each line of code within an application is measured. The problem with this approach is that the measurement can have a very large performance impact on the application being profiled, and wading through the huge amount of data collected can be an overwhelming task. The VSTS profiler takes a different approach, offering two distinct profiling modes: sampling and instrumentation. During sampling profiling, the execution of the application is periodically paused and the functions that are executing are recorded. Sampling profiling has a low impact, and allows the area of an application that is causing the performance problem to be identified. Once a specific binary with performance problems has been identified, the mode can be switched to instrumentation, in which specific binaries are re-compiled with instrumentation probes inserted; this allows every execution of a piece of code to be recorded and reported on.
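To make the sampling workflow concrete, consider a hypothetical demo target like the one below (the function names are invented for this example). A sampling run against it would attribute the bulk of the samples to the concatenation loop, because repeated `report = report + row` copies the whole string on almost every iteration; the second function shows the kind of fix a profile report typically points to.

```cpp
#include <string>
#include <vector>

// Hot spot a sampling profiler would surface: each pass copies the entire
// report so far, giving O(n^2) behavior on the total output size.
std::string BuildReport(const std::vector<std::string>& rows)
{
    std::string report;
    for (const auto& row : rows)
        report = report + row + "\n";   // full copy of report on every pass
    return report;
}

// The usual fix: size the buffer once, then append in place.
std::string BuildReportFast(const std::vector<std::string>& rows)
{
    std::size_t total = 0;
    for (const auto& row : rows)
        total += row.size() + 1;

    std::string report;
    report.reserve(total);              // single allocation up front
    for (const auto& row : rows)
    {
        report += row;
        report += '\n';
    }
    return report;
}
```

Both functions produce identical output; only the allocation behavior differs, which is exactly the kind of distinction that shows up in a profile but not in a code review of the results.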
Getting Started with Profiling
Starting a profiling session is quite simple: With VSTS Developer installed, a menu item called Analyze will be available as shown in Figure 1, and the Launch Performance Wizard... guides the user through setting up a profiling session.
Figure 1: Launching the Performance Wizard.
The first screen of the performance wizard prompts the user to select the application to profile. In addition to a list of the projects in the current solution, it is possible to select an executable, a DLL, or an ASP.NET application to profile, as shown in Figure 2.
Figure 2: Profiling Target.
The second screen of the wizard allows the profiling mode to be specified, and provides a reasonably detailed description of the two modes.
Figure 3: Profiling Mode.
The final screen of the wizard is simply a summary screen, and once the wizard has been completed, a performance session is added to the project. The performance session is displayed through the Performance Explorer window, which shows the profiling mode, the profiling targets, and the results of any profiling sessions that have been conducted. A profiling session can be started by clicking the third button (the one with a green arrow) at the top of the Performance Explorer toolbar.
It also is possible to add new targets to a profiling session from the Performance Explorer window, which is particularly useful for a large project that typically will have many DLLs. Sampling profiling can be used to detect which DLL the performance problem is in, and instrumentation can then be used to drill into the methods of the problem project. Profiling can be turned on and off for each target via a context menu. It would be usual to have sampling turned on for each target, but once instrumentation profiling is selected, it is advisable to reduce the number of targets it is applied to.
Figure 4: Performance Wizard.
Analyzing Performance Results
It is difficult to demonstrate the benefit of sampling profiling in a demonstration application that may only contain a couple of hundred lines of code, but for real-world applications that can contain dozens of DLLs and hundreds of thousands of lines of code, using sampling profiling to narrow down a performance problem is critical. Instrumentation profiling collects a huge amount of data, and it is easy to end up with a profile report spanning tens of gigabytes that is very slow to process from both a software and a developer perspective.
Once a specific group of projects has been identified as the source of performance problems, instrumentation profiling makes tracking down the cause relatively straightforward. The results of a profiling session, as shown in Figure 4, comprise information categorized in a number of different views accessible via a dropdown menu, as shown in Figure 5.
Figure 5: Performance Results.
For blatant performance problems, the Performance Summary screen often can identify the culprit with the % of time in method figures. For more subtle performance problems, switching to the other pages in the performance report will be required. The Call Tree View presents the call tree of an application along with the number of times each method has been called. Many other columns can be added, including the actual time spent in methods (both inclusive and exclusive timings) and the function line number (for sampling profiling only). The other two views that are helpful for finding performance problems are the Caller/Callee View and the Functions View.
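The inclusive/exclusive distinction is worth pinning down, since it drives how the Call Tree View is read. The sketch below (with invented function names and illustrative sleep-based "work") measures by hand the same split the profiler computes automatically: a parent's inclusive time covers its own body plus everything it calls, while its exclusive time covers only the body.

```cpp
#include <chrono>
#include <thread>
#include <utility>

using Clock = std::chrono::steady_clock;

// Stands in for a callee doing real work.
void Child()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
}

// Returns {inclusive_ms, exclusive_ms} for this function, measured manually.
// Inclusive = whole body including Child(); exclusive = inclusive minus the
// time spent inside Child().
std::pair<double, double> Parent()
{
    auto start = Clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(10)); // own work
    auto childStart = Clock::now();
    Child();
    auto childEnd = Clock::now();
    auto end = Clock::now();

    std::chrono::duration<double, std::milli> inclusive = end - start;
    std::chrono::duration<double, std::milli> child = childEnd - childStart;
    return { inclusive.count(), inclusive.count() - child.count() };
}
```

A method with high inclusive but low exclusive time is a dispatcher; the real work (and the optimization target) is somewhere beneath it in the call tree.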
Managed Code Profiling
In addition to the native code profiling capabilities covered so far, the VSTS Profiler is equally capable of profiling C#, Visual Basic .NET, and C++/CLI managed code. The basics of managed code profiling are the same as for native code, but a number of options deal with managed code-specific problems. The most significant difference is the ability to collect information about the allocation and lifetime of managed objects. Because the lifetime of managed objects is controlled by the .NET garbage collector, diagnosing problems related to excessive memory consumption can be more difficult in .NET projects. Turning on the collection of .NET memory statistics is accomplished by bringing up the Properties of a Performance Session from the Performance Explorer window, and selecting the options shown in Figure 6.
Figure 6: Enabling Managed Memory Tracking.
The results of profiling a C++/CLI Windows Forms application are shown in Figures 7 and 8. The results in Figure 7 are available when the Collect .NET object allocation information checkbox is selected, and those in Figure 8 are available when Also collect .NET object lifetime information is selected. As the figures show, the information is displayed by type, and while this can be useful for diagnosing general issues with over-allocation of a type, per-type allocation and lifetime information is a lot less useful than the per-object statistics that tools like SciTech's .NET Memory Profiler provide.
Figure 7: Selecting the Collect .NET object allocation information checkbox.
Figure 8: Choosing Also collect .NET object lifetime information.
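The pattern the object-allocation view is designed to catch is allocation churn: a short-lived object created on every iteration of a hot loop. The sketch below shows the shape of the problem in native C++ terms (invented function names; in managed code the garbage collector does the freeing, but the per-iteration allocation rate is what drives GC pressure and shows up in the per-type counts).

```cpp
#include <numeric>
#include <vector>

// Churny version: a fresh heap-backed buffer is allocated and destroyed on
// every pass through the outer loop.
long long SumSquaresChurny(const std::vector<int>& data, int passes)
{
    long long total = 0;
    for (int p = 0; p < passes; ++p)
    {
        std::vector<long long> squares;        // fresh allocation each pass
        for (int v : data)
            squares.push_back(static_cast<long long>(v) * v);
        total += std::accumulate(squares.begin(), squares.end(), 0LL);
    }
    return total;
}

// Reuse version: one buffer hoisted out of the loop; clear() keeps the
// capacity, so later passes allocate nothing.
long long SumSquaresReuse(const std::vector<int>& data, int passes)
{
    long long total = 0;
    std::vector<long long> squares;
    squares.reserve(data.size());
    for (int p = 0; p < passes; ++p)
    {
        squares.clear();                       // capacity retained
        for (int v : data)
            squares.push_back(static_cast<long long>(v) * v);
        total += std::accumulate(squares.begin(), squares.end(), 0LL);
    }
    return total;
}
```

In the profiler's per-type view, the churny variant shows up as a large instance count for a single type concentrated in one call path, which is the cue to hoist the allocation.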
Improvements in the performance and features of the VSTS profiler that come with the VSTS 2008 release make the profiler a very capable performance-tuning tool for the C++ (and other VSTS language) developer. The use of two profiling modes makes the profiler capable of scaling to the largest of C++ code bases, and finding the most obscure performance problem can be accomplished using the wealth of information produced by instrumentation profiling. Developers who haven't settled on a preferred profiler will be well served by giving the VSTS Developer Edition Profiler a spin and seeing how well it fits their performance tuning needs.
About the Author
Nick Wienholt is an independent Windows and .NET consultant based in Sydney. He is the author of Maximizing .NET Performance and co-author of A Programmer's Introduction to C# 2.0 from Apress, and specialises in system-level software architecture and development, with a particular focus on performance, security, interoperability, and debugging.
Nick is a keen and active participant in the .NET community. He is the co-founder of the Sydney Deep .NET User group and writes technical articles for Australian Developer Journal, ZDNet, Pinnacle Publishing, CodeGuru, MSDN Magazine (Australia and New Zealand Edition) and the Microsoft Developer Network. An archive of Nick's SDNUG presentations, articles, and .NET blog is available at www.dotnetperformance.com.
In recognition of his work in the .NET area, he was awarded the Microsoft Most Valued Professional Award from 2002 through 2007.