Demonstrations of cpudist.
This program summarizes task on-CPU time as a histogram, showing how long tasks
spent on the CPU before being descheduled. This provides valuable information
that can indicate oversubscription (too many tasks for too few processors),
overhead due to excessive context switching (e.g. a common shared lock for
multiple threads), uneven workload distribution, too-granular tasks, and more.
Alternatively, the same options are available for summarizing task off-CPU
time, which helps understand how often threads are being descheduled and how
long they spend waiting for I/O, locks, timers, and other causes of suspension.
# ./cpudist.py
Tracing on-CPU time... Hit Ctrl-C to end.
^C
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 1 | |
4 -> 7 : 1 | |
8 -> 15 : 13 |** |
16 -> 31 : 187 |****************************************|
32 -> 63 : 89 |******************* |
64 -> 127 : 26 |***** |
128 -> 255 : 0 | |
256 -> 511 : 1 | |
This is from a mostly idle system. Tasks wake up occasionally and run for only
a few dozen microseconds, and then get descheduled.
Here's some output from a system that is heavily loaded by threads that perform
computation but also compete for a lock:
# ./cpudist.py
Tracing on-CPU time... Hit Ctrl-C to end.
^C
usecs : count distribution
0 -> 1 : 51 |* |
2 -> 3 : 395 |*********** |
4 -> 7 : 259 |******* |
8 -> 15 : 61 |* |
16 -> 31 : 75 |** |
32 -> 63 : 31 | |
64 -> 127 : 7 | |
128 -> 255 : 5 | |
256 -> 511 : 3 | |
512 -> 1023 : 5 | |
1024 -> 2047 : 6 | |
2048 -> 4095 : 4 | |
4096 -> 8191 : 1361 |****************************************|
8192 -> 16383 : 523 |*************** |
16384 -> 32767 : 3 | |
A bimodal distribution is now clearly visible. Most of the time, tasks were
able to run for 4-16ms before being descheduled (this is likely the quantum
length). Occasionally, tasks had to be descheduled a lot earlier -- possibly
because they competed for a shared lock.
If necessary, you can restrict the output to include only threads from a
particular process -- this helps reduce noise:
# ./cpudist.py -p $(pidof parprimes)
Tracing on-CPU time... Hit Ctrl-C to end.
^C
usecs : count distribution
0 -> 1 : 3 | |
2 -> 3 : 17 | |
4 -> 7 : 39 | |
8 -> 15 : 52 |* |
16 -> 31 : 43 | |
32 -> 63 : 12 | |
64 -> 127 : 13 | |
128 -> 255 : 0 | |
256 -> 511 : 1 | |
512 -> 1023 : 11 | |
1024 -> 2047 : 15 | |
2048 -> 4095 : 41 | |
4096 -> 8191 : 1134 |************************ |
8192 -> 16383 : 1883 |****************************************|
16384 -> 32767 : 65 |* |
You can also ask for output at predefined intervals, and include timestamps for
easier interpretation. While we're at it, the -P switch will print a histogram
separately for each process:
# ./cpudist.py -TP 5 3
Tracing on-CPU time... Hit Ctrl-C to end.
03:46:51
pid = 0
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 1 |** |
4 -> 7 : 17 |********************************** |
8 -> 15 : 11 |********************** |
16 -> 31 : 20 |****************************************|
32 -> 63 : 15 |****************************** |
64 -> 127 : 9 |****************** |
128 -> 255 : 6 |************ |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 1 |** |
pid = 5068
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 1 |************* |
4 -> 7 : 3 |****************************************|
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 1 |************* |
03:46:56
pid = 0
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 1 |** |
4 -> 7 : 19 |****************************************|
8 -> 15 : 11 |*********************** |
16 -> 31 : 9 |****************** |
32 -> 63 : 3 |****** |
64 -> 127 : 1 |** |
128 -> 255 : 3 |****** |
256 -> 511 : 0 | |
512 -> 1023 : 1 |** |
pid = 5068
usecs : count distribution
0 -> 1 : 1 |******************** |
2 -> 3 : 0 | |
4 -> 7 : 2 |****************************************|
03:47:01
pid = 0
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 12 |******************************** |
8 -> 15 : 15 |****************************************|
16 -> 31 : 15 |****************************************|
32 -> 63 : 0 | |
64 -> 127 : 3 |******** |
128 -> 255 : 1 |** |
256 -> 511 : 0 | |
512 -> 1023 : 1 |** |
pid = 5068
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 1 |****** |
4 -> 7 : 6 |****************************************|
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 2 |************* |
This histogram was obtained while executing `dd if=/dev/zero of=/dev/null` with
fairly large block sizes.
You could also ask for an off-CPU report using the -O switch. Here's a
histogram of task block times while the system is heavily loaded:
# ./cpudist -O -p $(parprimes)
Tracing off-CPU time... Hit Ctrl-C to end.
^C
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 1 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 3 | |
64 -> 127 : 1 | |
128 -> 255 : 1 | |
256 -> 511 : 0 | |
512 -> 1023 : 2 | |
1024 -> 2047 : 4 | |
2048 -> 4095 : 3 | |
4096 -> 8191 : 70 |*** |
8192 -> 16383 : 867 |****************************************|
16384 -> 32767 : 141 |****** |
32768 -> 65535 : 8 | |
65536 -> 131071 : 0 | |
131072 -> 262143 : 1 | |
262144 -> 524287 : 2 | |
524288 -> 1048575 : 3 | |
As you can see, threads are switching out for relatively long intervals, even
though we know the workload doesn't have any significant blocking. This can be
a result of over-subscription -- too many threads contending over too few CPUs.
Indeed, there are four available CPUs and more than four runnable threads:
# nproc
4
# cat /proc/loadavg
0.04 0.11 0.06 9/147 7494
(This shows we have 9 threads runnable out of 147 total. This is more than 4,
the number of available CPUs.)
Finally, let's ask for a per-thread report and values in milliseconds instead
of microseconds:
# ./cpudist.py -p $(pidof parprimes) -mL
Tracing on-CPU time... Hit Ctrl-C to end.
tid = 5092
msecs : count distribution
0 -> 1 : 3 | |
2 -> 3 : 4 | |
4 -> 7 : 4 | |
8 -> 15 : 535 |****************************************|
16 -> 31 : 14 |* |
tid = 5093
msecs : count distribution
0 -> 1 : 8 | |
2 -> 3 : 6 | |
4 -> 7 : 4 | |
8 -> 15 : 534 |****************************************|
16 -> 31 : 12 | |
tid = 5094
msecs : count distribution
0 -> 1 : 38 |*** |
2 -> 3 : 5 | |
4 -> 7 : 5 | |
8 -> 15 : 476 |****************************************|
16 -> 31 : 25 |** |
tid = 5095
msecs : count distribution
0 -> 1 : 31 |** |
2 -> 3 : 6 | |
4 -> 7 : 10 | |
8 -> 15 : 478 |****************************************|
16 -> 31 : 20 |* |
tid = 5096
msecs : count distribution
0 -> 1 : 21 |* |
2 -> 3 : 5 | |
4 -> 7 : 4 | |
8 -> 15 : 523 |****************************************|
16 -> 31 : 16 |* |
tid = 5097
msecs : count distribution
0 -> 1 : 11 | |
2 -> 3 : 7 | |
4 -> 7 : 7 | |
8 -> 15 : 502 |****************************************|
16 -> 31 : 23 |* |
tid = 5098
msecs : count distribution
0 -> 1 : 21 |* |
2 -> 3 : 5 | |
4 -> 7 : 3 | |
8 -> 15 : 494 |****************************************|
16 -> 31 : 28 |** |
tid = 5099
msecs : count distribution
0 -> 1 : 15 |* |
2 -> 3 : 4 | |
4 -> 7 : 6 | |
8 -> 15 : 521 |****************************************|
16 -> 31 : 12 | |
It looks like all threads are more-or-less equally busy, and are typically
switched out after running for 8-15 milliseconds (again, this is the typical
quantum length).
USAGE message:
# ./cpudist.py -h
usage: cpudist.py [-h] [-O] [-T] [-m] [-P] [-L] [-p PID] [interval] [count]
Summarize on-CPU time per task as a histogram.
positional arguments:
interval output interval, in seconds
count number of outputs
optional arguments:
-h, --help show this help message and exit
-O, --offcpu measure off-CPU time
-T, --timestamp include timestamp on output
-m, --milliseconds millisecond histogram
-P, --pids print a histogram per process ID
-L, --tids print a histogram per thread ID
-p PID, --pid PID trace this PID only
examples:
cpudist # summarize on-CPU time as a histogram
cpudist -O # summarize off-CPU time as a histogram
cpudist 1 10 # print 1 second summaries, 10 times
cpudist -mT 1 # 1s summaries, milliseconds, and timestamps
cpudist -P # show each PID separately
cpudist -p 185 # trace PID 185 only