Linux Sampler

From the NOTES section of the manual page for ps on Linux :-

CPU usage is currently expressed as the percentage of time spent running during the entire lifetime of a process.
This is not ideal, and it does not conform to the standards that ps otherwise conforms to.

This means that recent changes in CPU utilisation by a process can become insignificant for long-running processes. In short, Linux ps smooths out CPU usage over the lifetime of the process.

Linux Sampler is inspired by information found on Unix & Linux Stack Exchange.

It uses the /proc filesystem. It works by querying the uptime of the system, and process utime and stime values. An interval later it does this again. We can therefore see how much CPU each process has consumed. As we also know the change in uptime, we know what proportion of a CPU each process has been consuming. This process repeats.
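The sampling arithmetic described above can be sketched in Python as follows. This is only an illustration, not the sampler's own code: the function names are hypothetical, and it assumes the values come from /proc/uptime and from the utime and stime fields (fields 14 and 15) of /proc/<pid>/stat, which are counted in clock ticks.

```python
def parse_uptime(text):
    """Return system uptime in seconds from the contents of /proc/uptime."""
    return float(text.split()[0])


def parse_cpu_ticks(stat_text):
    """Return utime + stime (in clock ticks) from the contents of /proc/<pid>/stat."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces, so only split what follows the last ')'.
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so utime (field 14) is rest[11]
    # and stime (field 15) is rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_fraction(ticks_before, ticks_after, uptime_before, uptime_after, ticks_per_sec):
    """Return the fraction of one CPU consumed between two samples."""
    elapsed = uptime_after - uptime_before
    if elapsed <= 0:
        return 0.0
    return (ticks_after - ticks_before) / ticks_per_sec / elapsed


def read_sample(pid):
    """Read (uptime_seconds, cpu_ticks) for one process from /proc (Linux only)."""
    with open("/proc/uptime") as f:
        uptime = parse_uptime(f.read())
    with open(f"/proc/{pid}/stat") as f:
        ticks = parse_cpu_ticks(f.read())
    return uptime, ticks
```

On a real system, ticks_per_sec would normally be os.sysconf("SC_CLK_TCK"). Calling read_sample twice, an interval apart, gives the four values that cpu_fraction needs.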

As well as CPU, the resident set size (i.e. the physical memory footprint) of each process is determined. This can be expressed in B, KB, MB, GB, TB, or as a percentage of total system physical memory.
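As a rough illustration of the unit handling, the sketch below converts a size in bytes into a chosen unit or into a percentage of total memory. The function names are hypothetical. It accepts unit names of the form [KMGTPE]?i?B, and it assumes that the "i" forms (such as MiB) are powers of 1024 and the plain forms (such as MB) are powers of 1000; the sampler itself may treat them differently.

```python
import re

# Optional size prefix, optional "i" for binary multiples, then "B".
UNIT_RE = re.compile(r"([KMGTPE]?)(i?)B")


def unit_divisor(unit):
    """Return the number of bytes in one of the given unit."""
    match = UNIT_RE.fullmatch(unit)
    if match is None:
        raise ValueError("unknown memory unit: " + unit)
    prefix, binary = match.groups()
    if not prefix:
        return 1
    base = 1024 if binary else 1000  # assumption: KiB = 1024 B, KB = 1000 B
    return base ** ("KMGTPE".index(prefix) + 1)


def format_memory(rss_bytes, unit, total_bytes):
    """Express rss_bytes in the given unit, or as a percentage if unit is '%'."""
    if unit == "%":
        return 100.0 * rss_bytes / total_bytes
    return rss_bytes / unit_divisor(unit)
```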

The CPU and memory utilisation figures are aggregated and displayed per UNIX userid. Userids are displayed by name if possible, or as the numeric UID otherwise.
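The per-userid aggregation might look something like the following sketch. The function names and the (uid, cpu, mem) tuple layout are assumptions made for illustration. The name lookup uses Python's standard pwd module and falls back to the numeric UID when no password entry exists.

```python
import pwd
from collections import defaultdict


def user_name(uid):
    """Return the login name for uid, or the UID as a string if it has none."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def aggregate_by_user(samples, name_of=user_name):
    """Sum (uid, cpu, mem) samples into {user: [cpu_total, mem_total]}."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for uid, cpu, mem in samples:
        entry = totals[name_of(uid)]
        entry[0] += cpu
        entry[1] += mem
    return dict(totals)
```

The owner of a process could be taken from the owner of its /proc/<pid> directory, for example with os.stat("/proc/%d" % pid).st_uid, although the sampler may obtain it another way.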

Databases typically want you to configure a number of huge pages, so the sampler looks to see how many of these pages are used and how many are free. If any huge pages are found, entries are output for hp_used and hp_free dummy userids.
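A minimal sketch of the huge page check is shown below. It assumes the counts are taken from the HugePages_Total and HugePages_Free lines of /proc/meminfo, and that "used" means total minus free; the source does not say exactly how the sampler derives them, and the function name is hypothetical.

```python
def parse_huge_pages(meminfo_text):
    """Return (used, free) huge page counts from the contents of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith("HugePages_"):
            fields[key] = int(value.split()[0])
    total = fields.get("HugePages_Total", 0)
    free = fields.get("HugePages_Free", 0)
    # Assumption: every huge page that is not free counts as used.
    return total - free, free
```

If both values are zero, no huge pages are configured and no hp_used or hp_free entries would be needed.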

The sampler also reads the system-wide non-idle CPU metric, and system-wide metrics reflecting the total number of pages used by processes. It checks whether the numbers aggregated from all the processes add up to these system-wide totals, and if not, logs an additional entry under the name nonproc for the difference. nonproc is short for "non-process-attributed". This is intended to catch processing associated with short-lived processes (which may be born and die between samples, and thus not be visible at the point of sampling).
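The nonproc calculation amounts to a subtraction, sketched below. The function name and the shape of per_user (a mapping from userid to a CPU and memory pair) are assumptions, as is the choice to clamp negative differences to zero.

```python
def nonproc_entry(system_cpu, system_mem, per_user):
    """Return the (cpu, mem) not attributed to any process, or None if there is none."""
    attributed_cpu = sum(cpu for cpu, _ in per_user.values())
    attributed_mem = sum(mem for _, mem in per_user.values())
    # Assumption: if the per-process totals exceed the system-wide figure,
    # report zero rather than a negative value.
    cpu_gap = max(0.0, system_cpu - attributed_cpu)
    mem_gap = max(0.0, system_mem - attributed_mem)
    if cpu_gap == 0.0 and mem_gap == 0.0:
        return None
    return cpu_gap, mem_gap
```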

The output is designed to be compatible with TrendServer :-

2014 11 10 20 39 45
	root 1.5 13.3
	haldaemon 0.0 0.2
	avahi 0.0 0.1
	smmsp 0.0 0.1
	rtkit 0.0 0.0
	ak 3.5 10.9
	webuser 0.0 0.8
	gdm 0.0 0.2
	rpcuser 0.0 0.0
	dbus 0.0 0.1
	rpc 0.0 0.0
	nonproc 0.2 2.6

The sampler logs to stdout by default. It can be directed to a file of your choice. When the number of lines in the file exceeds a maximum (by some small number of lines), earlier lines are chopped off the top of the file, leaving a minimum number behind. In this way, no log size management tool is required.
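The trimming behaviour can be sketched as follows. This is an illustration under assumptions, not the sampler's own code: trim_log is a hypothetical name, and the defaults simply mirror the default values of the -s flag.

```python
def trim_log(path, min_lines=1000, max_lines=1100):
    """Cut the file back to its last min_lines lines once it exceeds max_lines.

    Returns True if the file was trimmed, False otherwise.
    """
    with open(path, "r") as f:
        lines = f.readlines()
    if len(lines) <= max_lines:
        return False
    # Keep only the most recent min_lines lines.
    with open(path, "w") as f:
        f.writelines(lines[len(lines) - min_lines:])
    return True
```

Because each sample spans several lines, a purely line-based cut like this one can leave a partial sample at the top of the file. Whether the sampler guards against that is not stated here.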

Usage :-

usage: linuxsampler {flags}
flags: -i interval  sample every interval seconds (default 60)
       -f output    output to a file
       -s min max   lines in the file (default 1000 1100)
       -m units     memory units (matching [KMGTPE]?i?B or %)

To produce TrendServer-friendly output, we might run :-

$ nohup ./linuxsampler -i 60 -f linuxsampler.log -s 200000 201000 &

Disclaimer: I am aware that the idea that process RSS sizes sum to approximate the physical memory used by processes is a bit flawed, as it doesn't take into account memory shared between processes. This could cause some over-reporting of memory use.


This documentation is written and maintained by Andy Key
andy.z.key@googlemail.com