time(1) and CPU frequency scaling
2021-07-14
This is not new, but I only fully realized it today. Yes, I’m slow.
I often do this:
$ time -p some_long_running_command
And then I look at the user field to find out how much “CPU time in
user space” the command has used.
Let’s have a look at an example:
$ time -p echo 'scale=4000; a(1)*4' | bc -l >/dev/null
real 8.37
user 8.36
sys 0.00
This calculates a couple thousand digits of Pi. According to time(1),
the CPU needed 8.36 seconds to do this and it took 8.37 seconds in
wallclock time. sys is zero, so it spent pretty much no time in kernel
mode. Okay. (Note that time here must be the shell’s time keyword,
because it reports the CPU time of the entire pipeline – including bc –
and not just that of echo.)
To give you some more context: We use config management at work and I wanted to know how long it takes to compute the configuration for a particular machine. I wanted to do this as a daily cronjob and then collect and plot those values over time. The goal was to see if/when the code we wrote got slower. In other words, this was supposed to run completely as a background task and it was supposed to be unaffected by other processes running on the same system. I wanted to know how much “work” the CPU had to do to compute the configuration. If we write suboptimal code, the CPU has to do more “work” and thus wastes time.
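Just to illustrate the idea – compute_config, some_machine, and the log
file path are all made up here, the real tool is something else – the
daily job could have looked roughly like this:

#!/bin/sh
# Hypothetical daily measurement script. date's output and time(1)'s
# report (which goes to stderr) both end up in the log for later
# plotting; the command's own output is thrown away.
exec >>/var/log/config-timings.log 2>&1
date +%F
time -p compute_config some_machine >/dev/null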
So, let’s use the above bc as a substitute for my actual task. It’s
just something that computes something else. Good.
Let’s have a look at what happens when I spawn a couple of while true;
do true; done loops in the background – this is just to simulate some
background load. I always expected this to not change the result of
user, because user is supposed to be the CPU time of my process. So, no
matter what, it should stay the same. real, on the other hand, might
increase. real doesn’t really matter in my case and can be ignored.
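If you want to reproduce that load, it’s basically this (a sketch: one
busy loop per CPU core, killed again after measuring):

# Spawn one busy loop per CPU core to simulate background load.
for i in $(seq "$(nproc)")
do
    while true; do true; done &
done

# ... run the measurement in another terminal ...

# Kill the busy loops again.
kill $(jobs -p)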
To my surprise, this happened:
$ time -p echo 'scale=4000; a(1)*4' | bc -l >/dev/null
real 11.75
user 11.55
sys 0.00
real has increased, but so has user. Why?
After a couple of wild guesses, I realized that dynamic CPU frequency scaling is the reason.
Running watch -n 0.1 grep MHz /proc/cpuinfo in another terminal shows
that my CPU (an Intel i7-8665U) runs at 2.1 GHz when idling, or maybe
less. As soon as I start bc, it jumps to 4.3 GHz on one core. When I
leave it at that, bc finishes after about 8.5 seconds.
But with all the while true; do true; done loops in the background, the
CPU cannot sustain 4.3 GHz. It stays at about 3.3 GHz. That’s a drop of
about 25%, which more or less equals the increase of the user value.
Depending on what I do, the CPU can only sustain about 2.3 GHz on all cores. That is … not a lot.
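By the way, if your kernel exposes the cpufreq interface in sysfs, you
can read the current frequencies from there as well – the values are in
kHz:

$ watch -n 0.1 'cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq'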
The root cause is that the user field of time(1) reports the time
spent on the CPU. That’s not what I wanted to know, though. I really
wanted to know: “How much work does the CPU have to do for this task?”
But what time(1) tells me is: “For how long did the CPU work on this
task?” Slightly different thing.
With dynamic CPU frequency scaling, it’s like constantly replacing the
CPU with a different model. Of course, a different CPU, which might be
faster or slower, might take a different amount of time to run the same
task. It’s perfectly reasonable for time(1) to report different times
then.
How do you measure the “work” a CPU has to do? There’s no easy answer
here. One idea might be to count the number of instructions (like
mov, cmp, and so on) the CPU has to execute, but not every instruction
takes the same amount of time … well, not really “time”, but CPU
cycles. And now we’re getting somewhere.
This yields pretty much the same result, no matter the background load and no matter the CPU frequency:
$ perf stat -e cycles:u sh -c "echo 'scale=4000; a(1)*4' | bc -l >/dev/null"
Performance counter stats for 'sh -c ...'
35,246,938,315 cycles:u <---- the interesting value
8.395594299 seconds time elapsed
8.366890000 seconds user
0.010255000 seconds sys
It measures the number of CPU cycles spent in user space (hence the
:u).
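As a rough cross-check – using the approximate frequencies from above,
so the numbers can’t match exactly – multiplying the user time by the
clock frequency lands in the same ballpark in both cases:

 8.36 s × 4.3 GHz ≈ 36.0 billion cycles    (idle system)
11.55 s × 3.3 GHz ≈ 38.1 billion cycles    (with background load)

Both are reasonably close to the 35.2 billion cycles that perf reports.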
This is not a complete picture. There are many variables, the computer is more than just the CPU, and that’s why the original idea of measuring time as a cumulative metric is not a bad idea, actually. You just have to make sure that your CPU frequency doesn’t jump like crazy (well, I had to do that in my case).
One way to stabilize it is to disable “turbo” mode temporarily:
cd /sys/devices/system/cpu/
for i in $(seq $(nproc))
do
    cpuid=$((i - 1))
    sudo sh -c "echo 1900000 >cpu${cpuid}/cpufreq/scaling_max_freq"
done
This limits my CPU to 1.9 GHz. Timing is stable now. Of course, everything is slow as hell.
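To undo the limit later, something like this should work – assuming the
limit was at the hardware default before (cpuinfo_max_freq holds each
core’s hardware maximum):

cd /sys/devices/system/cpu/
for dir in cpu[0-9]*/cpufreq
do
    # Copy the hardware maximum back into the scaling limit.
    sudo sh -c "cat $dir/cpuinfo_max_freq >$dir/scaling_max_freq"
done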
I later found out that this particular CPU exposes this issue exceptionally well. It’s a laptop. It’s a low-power CPU. It tries hard to conserve power. And maybe the cooling system just isn’t good enough.
My other machines hardly expose this issue at all. My i7-3770 from 2013 jumps to 3.9 GHz in “turbo” mode and might drop to 3.7 GHz when there’s a lot of background load. The Celeron N3050 in my Intel NUC jumps to 2.1 GHz in “turbo” mode – and then just stays there, no matter the background load. On those two machines, the problem doesn’t really manifest.
Is it wrong to measure processes using time(1)? In my particular case,
it was wrong. But it really depends on what you want to know. You can
also look at it like this: The fact that the user value increases and
the frequency drops so much actually tells you that this might be a bit
too much load for your machine.
At any rate, it’s important to know that the output of time(1) is not
as “stable” as you might think.
The discussion of this blog post on Hacker News mentions a blog post by Raymond Chen: What should the CPU usage be of a fully-loaded CPU that has been throttled? Yeah, that’s a really good question. :-)