Since I've been doing some tests lately involving how applications use the video card, I thought I'd write down the process of gathering this data and presenting it. After all, any sufficiently "scientific" test should be repeatable by others, and being repeatable means knowing what to do!
What data am I gathering?
CPU utilization
This shows how the application is using the CPU. The higher the overall usage, the more likely the CPU is to bottleneck the GPU. I may omit this altogether if I'm not interested in CPU utilization.
GPU engine usage
A "GPU engine" is Microsoft's term for a part of the GPU that handles a certain task. Which engines are available depends on the GPU manufacturer. The two I'm primarily interested in are the graphics and compute engines, because these show how the application is using the execution portions of the GPU. This data is informational only, i.e., there is no "lower/higher is better" value.
VRAM usage
Since Windows Vista, Windows has virtualized video memory at the system level. This allows me to look at three elements: Committed VRAM (how much the application asked to reserve), Dedicated VRAM usage (how much is used on the video card itself), and Shared VRAM usage (how much system memory is being used as VRAM). Like GPU engine usage, this data is informational only.
Frame time
This is the amount of time between frames. As long as VSync or frame limiting is not used, this should represent how long it took to render the frame. The inverse of this is what I call "instantaneous FPS," which is the frame rate you would get if every frame took as long as the current one (for example, a 20 ms frame time works out to 50 FPS). I call this "instantaneous" since a true FPS figure would require counting all of the frames in a second.
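To show why instantaneous FPS is worth looking at alongside the usual average, here is a minimal Python sketch. The frame times are made up for illustration (59 frames at 10 ms plus a single 410 ms hiccup), and `instantaneous_fps` is just a name I picked for the 1000 / frame time calculation.

```python
def instantaneous_fps(frame_time_ms):
    """Frame rate implied by a single frame time given in milliseconds."""
    return 1000.0 / frame_time_ms


# Hypothetical one-second run: 59 frames at 10 ms each plus one 410 ms hiccup.
frame_times_ms = [10.0] * 59 + [410.0]

# Average FPS: number of frames divided by the total elapsed time in seconds.
total_seconds = sum(frame_times_ms) / 1000.0
average_fps = len(frame_times_ms) / total_seconds

# Worst instantaneous FPS: the frame rate implied by the slowest frame.
worst_fps = min(instantaneous_fps(t) for t in frame_times_ms)

print(f"average FPS: {average_fps:.1f}")
print(f"worst instantaneous FPS: {worst_fps:.1f}")
```

The average comes out to 60 FPS even though one frame took long enough to imply roughly 2.4 FPS, which is the kind of hiccup a per-second counter hides.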
What data am I not gathering?
I'm not looking at temperatures, clock speeds, or fan speeds. These are aspects of the hardware that don't reflect how the application is using it.
What tools am I using?
Performance Monitor (PerfMon)
PerfMon gathers CPU utilization, GPU engine usage, and VRAM usage. Other tools like GPU-Z and MSI Afterburner cannot gather this data, at least not the specific aspects I'm looking for. PerfMon can also gather data per application, meaning the data I gather comes specifically from the application in question rather than from the system as a whole.
FRAPS
While FRAPS is old (the last update was in 2013) and the overlay no longer seems to work in DX12 applications, its benchmark functionality still works. This allows me to gather data about frame times. Note that FRAPS counts a frame each time one of the display buffers flips. This is a limitation when VSync is enabled but the application is not triple buffered, or when frame rate limiting is used, because the recorded time then no longer reflects how long the frame took to render.
How do I use these tools?
PerfMon takes some setting up:
- Open it by going to Control Panel -> All items -> Administrative Tools -> Performance Monitor. Open it as an Administrator; otherwise you won't be able to do the other steps.
- Select "Data Collector Sets" in the left pane
- Right click "User Defined" in the right pane and select New -> Data Collector Set
- In the wizard that pops up, name the Data Collector Set and choose "Create manually (Advanced)"
- On the next page, select "Create data logs" and check off "Performance counter"
- On the next page, click on the "Add..." button, then select the following:
  - GPU Engine -> Utilization for All Instances
  - GPU Memory -> Committed, Dedicated, and Shared memory for All Instances
  - If doing CPU utilization, select Processor -> "% Processor Time" for All Instances
- The next page will ask where you want to save these logs
When you want to start the data collection, select the Data Collector Set you created and press the green triangle on the toolbar at the top. To stop collecting data, press the black square.
- Note: PerfMon only gathers GPU data for apps that are already running and using the GPU when the collection starts. If you start collecting before the app is running, it won't gather data for that app.
- To open the log, go to the location you chose for saving the data and double click on the log file.
- The data collected for each app is identified by process ID. Unless you looked up the app's process ID ahead of time, the best way I've found to find it is to plot all of the 3D or graphics engines and see which process ID has the activity that looks like the app's. Then I sort by name and remove the data from the other process IDs.
- Once the data has been filtered, right click on the graph and select "Save Data"
- Save it as a "Text File - Comma Separated Values (CSV)"
Once you have the data in CSV format, you should be able to manipulate it using spreadsheet apps like Microsoft Excel or OpenOffice/LibreOffice Calc.
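If you would rather script the process ID hunt and the filtering than do them by hand, the idea can be sketched in Python. This is only a sketch under assumptions: it assumes the CSV has a timestamp in the first column and that the GPU Engine column names embed the process ID and engine type in a form like `pid_4321_..._engtype_3D`, so check the header of your own export first. The machine name, process IDs, and values below are made up, and `mean_utilization_by_pid` and `keep_pid_columns` are hypothetical helper names.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed instance naming: "pid_<process ID>_luid_..._engtype_<engine>".
PID_PATTERN = re.compile(r"pid_(\d+)_")


def mean_utilization_by_pid(csv_text, engine_type="3D"):
    """Average, per process ID, the summed utilization of one engine type."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    columns = defaultdict(list)  # process ID -> column indexes
    for index, name in enumerate(header):
        match = PID_PATTERN.search(name)
        if match and f"engtype_{engine_type})" in name:
            columns[int(match.group(1))].append(index)
    means = {}
    for pid, indexes in columns.items():
        # Blank cells (no sample recorded) are counted as 0 here.
        totals = [
            sum(float(row[i]) if row[i].strip() else 0.0 for i in indexes)
            for row in samples
        ]
        means[pid] = sum(totals) / len(totals) if totals else 0.0
    return means


def keep_pid_columns(csv_text, pid):
    """Keep the timestamp column plus every column belonging to one process ID."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    keep = [0] + [i for i, name in enumerate(rows[0]) if f"pid_{pid}_" in name]
    return [[row[i] for i in keep] for row in rows]


# Hypothetical export: two processes, made-up machine name and values.
header = [
    "(PDH-CSV 4.0)",
    r"\\PC\GPU Engine(pid_4321_luid_0x0_0x1_phys_0_eng_0_engtype_3D)\Utilization Percentage",
    r"\\PC\GPU Engine(pid_4321_luid_0x0_0x1_phys_0_eng_1_engtype_Compute)\Utilization Percentage",
    r"\\PC\GPU Engine(pid_888_luid_0x0_0x1_phys_0_eng_0_engtype_3D)\Utilization Percentage",
]
data_rows = [
    ["01/01/2024 12:00:00.000", "80.0", "5.0", "1.0"],
    ["01/01/2024 12:00:01.000", "90.0", "7.0", " "],
]
buffer = io.StringIO()
csv.writer(buffer).writerows([header] + data_rows)
csv_text = buffer.getvalue()

means = mean_utilization_by_pid(csv_text)
app_pid = max(means, key=means.get)  # the busiest user of the 3D engine
filtered = keep_pid_columns(csv_text, app_pid)
print(app_pid, means[app_pid], len(filtered[0]))
```

Picking the process ID with the highest average 3D engine usage is a guess that the app under test is the heaviest GPU user during the run. It is the scripted version of eyeballing the plot, so it is worth confirming against the process ID shown in Task Manager.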
FRAPS requires pressing F11, or whatever the benchmark hotkey is set to, to start recording, then pressing it again to stop. FRAPS saves the data as CSV. The items of interest are the frame times and MinMaxAvg data. The frame times do require additional work, as FRAPS records each frame's timestamp in milliseconds from the start of the run rather than the time between frames.
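The extra work is just subtracting each timestamp from the one after it. Here is a minimal Python sketch, assuming the frame times file has one header row followed by "frame number, timestamp in ms" rows. The sample values are made up, and `frame_times_from_timestamps` is a hypothetical helper name.

```python
import csv
import io


def frame_times_from_timestamps(csv_text):
    """Turn cumulative timestamps (ms) into per-frame times (ms)."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    stamps = [float(row[1]) for row in reader if len(row) >= 2]
    # Each frame time is the gap between two consecutive timestamps.
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# Made-up sample in the assumed layout: frame number, ms since the run started.
sample = (
    "Frame, Time (ms)\n"
    "    1,     0.000\n"
    "    2,    16.500\n"
    "    3,    33.000\n"
    "    4,    83.000\n"
)

frame_times = frame_times_from_timestamps(sample)
print(frame_times)
```

Note that N timestamps give N - 1 frame times, so the result is always one entry shorter than the log.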
What other tools did I consider and why weren't they used?
EVGA Precision X
Polls system-wide stats. Also, while it has a frame rate counter, it samples the frame rate over the polling period, which can mask hiccups (and it's likely based on the inverse of FPS). While higher sampling rates can be used, I noticed this adds a significant load to the GPU.
Polls system wide stats.
Polls system wide stats. May also have the same issues as EVGA Precision X.