This came up in a conversation about how Windows reports processor utilization, whether for the CPU or the GPU. You may be tempted to think that processor utilization is, or should be, a measure of how much of the processor is actually being used over time, rather than something based on idle time, which is defined as the time the CPU spends running the OS's idle task. I touched on this in a previous blog post, but I don't think the reasoning I gave painted the entire picture. Basically I said:
The other thing is Windows can't tell if a thread is doing useful work or not unless it somehow observes what the thread is doing. Except the problem there is that requires CPU time. So it has to interrupt a thread (not necessarily the one it's observing) to do this. And how often do you observe what a thread is doing? Adding this feature just to get a more accurate representation of CPU utilization is likely going to decrease general performance due to overhead.
That was looking at it from the point of view that utilization should count only the time the CPU spends executing something "useful", whatever that means. So let's expand on why the OS derives processor utilization from idle time instead of something that may seem to make more sense like, well, CPU resource utilization.
Backing up: What is the "idle process"?
The idle process is a special process that the OS runs. It's always there, and it runs whenever the OS has nothing else scheduled for the CPU to do. That is, none of the other apps have anything they want to do because they're all waiting on something. So processor utilization is the percentage of time over some sampling period that the idle process was not running. If the idle process ran for 75% of the last second, utilization for that second is 25%.
Of note, the idle process isn't something that runs only NOPs (no operation) forever or goes right to sleep, but often takes care of background tasks. In portable systems, it may put the system into a low power mode once certain conditions are met.
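The idle-time definition above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Windows implements it: the function names and the (idle_ticks, total_ticks) sample format are assumptions made up for this example.

```python
def utilization_percent(idle_time, total_time):
    """Utilization is the share of the period NOT spent in the idle process."""
    if total_time <= 0:
        raise ValueError("sampling period must be positive")
    return 100.0 * (1.0 - idle_time / total_time)


def utilization_between(sample_a, sample_b):
    """Each sample is a hypothetical (idle_ticks, total_ticks) pair of
    cumulative counters read at two points in time."""
    idle_delta = sample_b[0] - sample_a[0]
    total_delta = sample_b[1] - sample_a[1]
    return utilization_percent(idle_delta, total_delta)


# Between the two samples, 1000 ticks elapsed and 750 of them were idle.
print(utilization_between((1000, 4000), (1750, 5000)))
```

Note that nothing here looks at what the non-idle time was spent on. The OS only needs two counters per sampling period.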
How do you even define "utilization" from a resource usage standpoint?
Let's take a look at the block diagram for AMD's Zen microarchitecture. (I don't know why I keep using it, but it's a nice diagram. Source: https://en.wikichip.org/wiki/File:zen_block_diagram.svg)
There are a lot of things here you could count as being used:
- How much of the cache is not dirty?
- How many instructions is the decoder working on?
- How many uOPs are being queued up?
- How full is that retire queue?
- How many schedulers are being pinged with something to do?
- Those schedulers have a queue too, how full are they?
- How many execution units are being used?
- etc. etc. etc.
So if you wanted to know how much of the CPU's resources are being used, there are a lot of things you'd have to keep tabs on. If you really wanted the details of each of the above, the CPU would need dedicated silicon to track them. Even though this silicon likely wouldn't take up a lot of space, I'm pretty sure people would rather it be spent on something else.
However, there's another problem: you'll almost never reach 100% utilization. One reason is that unless you write completely linear code from start to finish, you'll have gaps due to branching. Another is that mixing integer and floating-point work together is a hassle, so code typically ends up integer only in some places and floating point only in others, and you're unlikely to saturate both the integer and FP sides of the CPU at once. On the opposite end, you'll never truly reach 0% utilization either. 0% means the CPU isn't running, period, and the CPU is always running something.
It's about the perspective
It's important to note that when displaying information from a system, you have to take into account who's viewing it and when it's actually necessary to show it. Take, for example, a typical car. The car measures and keeps track of many more parameters than its instrument panel shows: oil pressure, battery voltage, tire pressure, oxygen content in the intake, whether or not the brakes are locking, whether or not the tires are slipping, etc. But none of this is shown as a value on the instrument panel. The most you get is a warning light that comes on only when there's a problem. Heck, my car doesn't even have a gauge for engine temperature, which is something many cars used to have. For the average Joe, if all of these values were somewhere on the instrument panel, it would be information overload. Most of them have no meaning to the driver unless there's an actual problem. Worse yet, the driver may misinterpret some of the values as dangerous even though there's nothing wrong with the car. For instance, 220°F sounds pretty hot, but it's a normal temperature for oil once the car is warmed up.
Similarly with a computer, all of this information may be useful to know, but for most people using the computer, these values aren't really important. This doesn't apply only to the user's perspective, but to the software's as well. What would happen if the average Joe saw that the execution units on the CPU weren't at 100%? What does "uOP queue" even mean? Certain values aren't necessarily bad per se, and even if users knew what these values meant, they couldn't do anything about them anyway.
So the two major perspectives I want to go over, as far as CPU utilization is concerned, are the operating system's and the application's.
The Operating System's job is to service applications
The job of the operating system is to manage the resources available to the computer so that the applications running on it can share those resources. To do so, it has to look at everything from the system's point of view, which means the hardware and software as a whole, not just a single piece. Once an application has a resource, the OS largely doesn't care what the application does with it. It assumes the application will either get done whatever it wanted to do or tell the OS that it doesn't need the resource anymore. To put it another way, a manager has employees to look after, but the manager's job is to assign tasks to them. The manager largely doesn't care how a task gets done (assuming there are no stringent company requirements) as long as it gets done.
The application just wants to use the CPU, so it cares how it's used
Most modern operating systems allocate the CPU to applications by giving each application a slice of time to run. Once the slice is up or the application releases control of the CPU, the OS takes over and finds something else to put on the CPU. So from the application's viewpoint, it's important to get as much out of the CPU as it can during the time it has, for the tasks it needs to complete.
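To make the time-slice idea concrete, here is a toy round-robin scheduler in Python. It's a deliberately simplified sketch, not a model of the Windows scheduler: real schedulers deal with priorities, blocking, and multiple cores. The function name, the task names, and the "work units" are all made up for illustration.

```python
from collections import deque


def round_robin(tasks, time_slice):
    """Run each task for at most `time_slice` units per turn.

    `tasks` maps a task name to its remaining work units.
    Returns the order in which tasks held the CPU as (name, units_run) pairs.
    """
    queue = deque(tasks.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        # A task gives up the CPU early if it finishes before its slice ends.
        ran = min(time_slice, remaining)
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # Slice expired with work left: go to the back of the line.
            queue.append((name, remaining))
    return timeline


print(round_robin({"A": 5, "B": 2}, time_slice=3))
# prints [('A', 3), ('B', 2), ('A', 2)]
```

The scheduler only tracks who had the CPU and for how long. What each task did during its slice never enters the picture.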
However, if the OS is only going to report utilization based on how much time was spent running the idle process, how can one figure out how an app is using the CPU? After all, that's useful to know when you're hunting for performance choke points. There is a way: profilers. However, profilers are rather intrusive, since they interact with many intimate parts of the software and possibly the CPU. As I alluded to in my previous post, profiling the application's CPU usage all the time to get more accurate results can cause a significant performance degradation, all for information that doesn't add much value once the application has shipped.
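As a rough illustration of why observing a program isn't free, the sketch below uses Python's `sys.setprofile` hook to count how many times the interpreter has to stop and call into an observer while a small function runs. This is interpreter-level instrumentation, not how a native profiler on Windows works, and the helper names (`busy_work`, `square`, `count_profile_events`) are made up for this example.

```python
import sys


def square(x):
    return x * x


def busy_work(n):
    total = 0
    for i in range(n):
        total += square(i)
    return total


def count_profile_events(func, *args):
    """Run func(*args) with a profiling hook installed and count hook calls."""
    events = 0

    def hook(frame, event, arg):
        # Every function call and return lands here, which is extra work
        # the program would not be doing if nobody were watching.
        nonlocal events
        events += 1

    sys.setprofile(hook)
    try:
        result = func(*args)
    finally:
        sys.setprofile(None)
    return result, events


result, events = count_profile_events(busy_work, 1000)
print(result, events)
```

Each of the 1000 calls to `square` triggers the hook at least twice (once on call, once on return), so the amount of observation work grows with the amount of work being observed.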
On the flip side, does the application care how other applications are using the CPU? Not really. After all, if my application is on the CPU, it should be the only thing on the CPU. Even in a simultaneous multithreading environment, I might care what the other app is doing, but even if I did know what was going on, my app can't do anything about it.
The information can also be misinterpreted
Going back to the car example I made earlier, even if the OS could provide processor resource usage, this information could be misinterpreted by the average end-user. What if we had two processors, one with, say, a smaller instruction queue than the other? We give both processors the same task and, of course, they report back different utilization values. Someone who doesn't know better could take this to mean that one processor works harder than the other because its instruction queue is fuller percentage-wise, while someone else could say that the same processor is worse for the same reason. It doesn't really matter. If they both complete the task in the same amount of time, they're equally good.
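The two-processor scenario above comes down to simple arithmetic. In this sketch every number is hypothetical: two imaginary CPUs with 32-entry and 48-entry instruction queues, running a workload that keeps 24 entries in flight on either one.

```python
def occupancy_percent(entries_in_use, queue_capacity):
    """Share of a queue's entries that are occupied, as a percentage."""
    return 100.0 * entries_in_use / queue_capacity


# Hypothetical workload: 24 instructions in flight on both CPUs.
in_flight = 24

cpu_x = occupancy_percent(in_flight, queue_capacity=32)  # smaller queue
cpu_y = occupancy_percent(in_flight, queue_capacity=48)  # larger queue

print(f"CPU X: {cpu_x}%  CPU Y: {cpu_y}%")
# prints CPU X: 75.0%  CPU Y: 50.0%
```

Same work, same time to finish, yet one CPU reports 75% and the other 50%. The percentage describes the queue, not how well the task is going.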
What do most end-users really care about?
For most end-users, you have to ask yourself what information they would really care about. For instance, what's a better metric when troubleshooting system issues? Knowing that your uOP queue is 80% full and your integer schedulers have 10 entries each or knowing that App A is hogging up most of the CPU time and that's probably slowing everyone else down?
Or, for an example a user would more commonly encounter: Wi-Fi. The typical user doesn't care what protocol is being used, what the link speed is, what the exact signal strength is, etc. The typical user only cares about two things: 1. How strong is the signal on a scale that I can understand? and 2. Am I connected?