Issue
How does Linux determine the next PID it will use for a process? The purpose of this question is to better understand the Linux kernel. Don't be afraid to post kernel source code. If PIDs are allocated sequentially how does Linux fill in the gaps? What happens when it hits the end?
For example, if I run a PHP script from Apache that does <?php print(getmypid()); ?>, the same PID will be printed for a few minutes while I hit refresh. This period of time is a function of how many requests Apache is receiving. Even if there is only one client, the PID will eventually change.
When the PID changes, it will be a close number, but how close? The number does not appear to be entirely sequential. If I run ps aux | grep apache, I get a fair number of processes.
How does Linux choose this next number? The previous few PIDs are still running, as well as the most recent PID that was printed. How does Apache choose to reuse these PIDs?
Solution
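The kernel allocates PIDs sequentially within each PID namespace (tasks in different namespaces can have the same IDs). Each new PID is the namespace's last allocated PID plus one, up to pid_max, which starts out as PID_MAX_DEFAULT. When the counter reaches pid_max, assignment wraps around to RESERVED_PIDS rather than to 1. PIDs that are still in use are skipped, so the gaps left by exited processes are filled on later passes.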
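The upper bound actually in force on a running system is exposed in /proc/sys/kernel/pid_max, with PID_MAX_DEFAULT as its initial value. A minimal user-space sketch for reading it follows. The helper names read_long_from_file() and read_pid_max() are made up for this example and are not kernel or libc functions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads a single integer from a text file.
 * Returns -1 if the file cannot be opened or parsed. */
long read_long_from_file(const char *path)
{
	FILE *f;
	long value = -1;

	assert(path != NULL);
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fscanf(f, "%ld", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}

/* Returns the current PID ceiling, or -1 where it is unavailable
 * (for example on a non-Linux system or in a restricted sandbox). */
long read_pid_max(void)
{
	return read_long_from_file("/proc/sys/kernel/pid_max");
}
```

Calling read_pid_max() and printing the result with printf("%ld\n", ...) shows the ceiling at which allocation wraps around on that machine.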
Some relevant code:
Inside alloc_pid(...):
	for (i = ns->level; i >= 0; i--) {
		nr = alloc_pidmap(tmp);
		if (nr < 0)
			goto out_free;
		pid->numbers[i].nr = nr;
		pid->numbers[i].ns = tmp;
		tmp = tmp->parent;
	}
alloc_pidmap():
	static int alloc_pidmap(struct pid_namespace *pid_ns)
	{
		int i, offset, max_scan, pid, last = pid_ns->last_pid;
		struct pidmap *map;

		pid = last + 1;
		if (pid >= pid_max)
			pid = RESERVED_PIDS;
		/* and later on... */
		pid_ns->last_pid = pid;
		return pid;
	}
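The elided middle of alloc_pidmap() is where in-use PIDs get skipped. The user-space sketch below imitates that behaviour under simplifying assumptions: a plain bool array stands in for the kernel's bitmap pages, and the names and limits (toy_pid_ns, TOY_PID_MAX, TOY_RESERVED) are invented for illustration. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PID_MAX   16  /* hypothetical small ceiling, standing in for pid_max */
#define TOY_RESERVED   4  /* hypothetical stand-in for RESERVED_PIDS */

struct toy_pid_ns {
	int  last_pid;             /* most recently handed-out PID */
	bool in_use[TOY_PID_MAX];  /* one flag per PID, standing in for the bitmap */
};

void toy_ns_init(struct toy_pid_ns *ns)
{
	assert(ns != NULL);
	memset(ns, 0, sizeof(*ns));
}

/* Returns the first free PID after last_pid, wrapping to TOY_RESERVED
 * at the ceiling, or -1 if the scan finds no free PID. */
int toy_alloc_pid(struct toy_pid_ns *ns)
{
	int pid, scanned;

	assert(ns != NULL);
	pid = ns->last_pid + 1;
	if (pid >= TOY_PID_MAX)
		pid = TOY_RESERVED;

	for (scanned = 0; scanned < TOY_PID_MAX; scanned++) {
		if (!ns->in_use[pid]) {
			ns->in_use[pid] = true;
			ns->last_pid = pid;
			return pid;
		}
		/* This PID is still taken: try the next one, wrapping if needed. */
		if (++pid >= TOY_PID_MAX)
			pid = TOY_RESERVED;
	}
	return -1;
}

/* Marks a PID as free again, as happens when a process is reaped. */
void toy_free_pid(struct toy_pid_ns *ns, int pid)
{
	assert(ns != NULL && pid > 0 && pid < TOY_PID_MAX);
	ns->in_use[pid] = false;
}
```

With these toy limits, fifteen allocations on a fresh namespace return 1 through 15. If PIDs 6 and 9 are then freed, the next two allocations wrap around and return 6 and 9, and a further allocation fails with -1 because nothing is free.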
Do note that PIDs in the context of the kernel are more than just int identifiers; the relevant structure can be found in /include/linux/pid.h. Besides the id, it contains a list of tasks with that id, a reference counter and a hashed list node for fast access.
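To make the per-namespace numbering concrete, here is a simplified user-space stand-in for that structure together with the loop from alloc_pid() shown earlier. All toy_* names are hypothetical, the task list and hash node are left out, and a bare counter replaces alloc_pidmap(), so treat it as a sketch of the idea and not the kernel's definition.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 4  /* hypothetical nesting limit for this sketch */

struct toy_ns {
	int level;              /* 0 for the root namespace */
	struct toy_ns *parent;  /* NULL for the root namespace */
	int last_pid;           /* last number handed out in this namespace */
};

struct toy_upid {
	int nr;                 /* the number as seen inside ns */
	struct toy_ns *ns;
};

struct toy_pid {
	int count;              /* reference counter */
	int level;              /* deepest namespace level the pid lives in */
	struct toy_upid numbers[TOY_MAX_LEVEL + 1];  /* one entry per level */
};

/* Assigns one number per level, from the task's own namespace up to
 * the root. Simplified: a plain counter, no bitmap and no wrap-around. */
void toy_alloc_pid_struct(struct toy_pid *pid, struct toy_ns *ns)
{
	struct toy_ns *tmp = ns;
	int i;

	assert(pid != NULL && ns != NULL && ns->level <= TOY_MAX_LEVEL);
	pid->count = 1;
	pid->level = ns->level;
	for (i = ns->level; i >= 0; i--) {
		pid->numbers[i].nr = ++tmp->last_pid;
		pid->numbers[i].ns = tmp;
		tmp = tmp->parent;
	}
}

/* Returns the number by which pid is known inside ns, or 0 if the
 * pid is not visible from that namespace. */
int toy_pid_nr_ns(const struct toy_pid *pid, const struct toy_ns *ns)
{
	if (ns->level <= pid->level && pid->numbers[ns->level].ns == ns)
		return pid->numbers[ns->level].nr;
	return 0;
}
```

A task created in a child namespace therefore carries two numbers, for instance 1 inside the child and 42 in the root, while a task created in the root has no number at all from the child's point of view.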
PIDs do not appear sequential from user space because other processes may be scheduled and call fork() between your process's fork() calls, and each of those takes the next number from the same counter. It's very common, in fact.
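One way to observe this from user space is to fork a few short-lived children and compare the PIDs that come back. The helper below is a minimal sketch; fork_and_record() is a name invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks up to n children that exit immediately and stores their PIDs
 * in out[]. Returns how many children were actually created. */
int fork_and_record(int n, pid_t *out)
{
	int created = 0;
	int i;

	assert(n >= 0 && out != NULL);
	for (i = 0; i < n; i++) {
		pid_t child = fork();

		if (child < 0)
			break;          /* fork failed: stop creating children */
		if (child == 0)
			_exit(0);       /* child: exit at once without flushing stdio */
		out[created++] = child;
	}
	/* Reap the children only after all forks, so that none of their
	 * PIDs can be handed out again while we are still recording. */
	for (i = 0; i < created; i++)
		waitpid(out[i], NULL, 0);
	return created;
}
```

Printing the recorded values on an otherwise idle machine will often show consecutive numbers, while on a busy one the differences tend to be larger, since every fork() elsewhere on the system advances the same counter.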
Answered By - Michael Foukarakis