Assign-Unique-Id Problem
BobCook at gasou.edu 5/6/2003
Operating systems have a requirement to generate unique ids in a number of contexts. The id may need to be locally unique or unique in time and space. Microsoft's C++ product, for example, includes a "uuidgen" utility that generates 128-bit ids that are unique in space and time.
For this example, we consider the problem of generating uids from a fixed range of positive integers. The requirements are 1) that a given uid come from within the range, 2) that it not be allocated more than once without being freed, 3) that all integers in the range be allocated before returning an error, 4) that a freed integer must be in the allocated state, 5) that an integer cannot be freed more than once, 6) return 0 if no pid is currently available and 7) that any solution be SMP safe.
The following solution is from the Linux operating system kernel/fork.c © Linus Torvalds et al and is used to allocate unique process ids. The solution has the advantage that no auxiliary data structure is needed. The algorithm allocates a process id only if it is not already in use by other processes.
Much of the software that is written is "correct" in terms of responding correctly within the environment of the typical uses of that software. For example, a sqrt() function might be "correct" if it never tested for negative arguments and if it were used in an environment in which negative arguments never occurred. However, software can have a lifetime in some cases longer than its originators. What if requirements or the environment change? The dangerous assumption may be that "Joe's code never failed before". We aspire to a standard of absolute correctness.
First, a global spin lock is acquired then a read lock is acquired on the task list. The legal range for pids is 300 to 32767 inclusive. Every time that an attempted allocation matches an allocated value, the trial value is incremented and the search of the task list begins anew. An error is detected if the search cycles through all possible pids and returns to the starting id (beginpid). Don't worry about the unlikely() macro. It is just a hint for architectures that support compile-time branch prediction.
Under what conditions is the following algorithm correct?
83 /* Protects next_safe and last_pid. */
84 spinlock_t lastpid_lock = SPIN_LOCK_UNLOCKED;
85
86 static int get_pid(unsigned long flags)
87 {
89 struct task_struct *p;
90 int pid, beginpid;
91
92 if (flags & CLONE_PID)
93 return current->pid;
94
95 spin_lock(&lastpid_lock);
96 beginpid = last_pid;
97 if((++last_pid) & 0xffff8000) {
98 last_pid = 300; /* Skip daemons etc. */
100 }
104 read_lock(&tasklist_lock);
105 repeat:
106 for_each_task(p) {
107 if(p->pid == last_pid ||
108 p->pgrp == last_pid ||
109 p->tgid == last_pid ||
110 p->session == last_pid) {
112 if((++last_pid) & 0xffff8000)
113 last_pid = 300;
116 if(unlikely(last_pid == beginpid))
117 goto nomorepids;
118 goto repeat;
119 }
128 }
129 read_unlock(&tasklist_lock);
130 }
131 pid = last_pid;
132 spin_unlock(&lastpid_lock);
133
134 return pid;
135
136 nomorepids:
137 read_unlock(&tasklist_lock);
138 spin_unlock(&lastpid_lock);
139 return 0;
140 }The average-case performance of the algorithm can be improved (?) as a side-effect of each search by calculating a "hint" that identifies a next-safe-region that can be used for allocations without searching. Thus, the high cost of a search is amortized over a number of future allocations.
This algorithm presents quite a challenge to users who depend on testing as a confidence builder since it doesn't "kick in" until 32,437 processes have been created!
Under what conditions is the following algorithm correct? A separate, but also interesting question, is the impact on total performance? This algorithm generates a lot of useless memory writes to the static variable next_safe for failed ids. It would also be desirable to have pid, pgrp, tgid and session in the same cache line.
83 /* Protects next_safe and last_pid. */
84 spinlock_t lastpid_lock = SPIN_LOCK_UNLOCKED;
85
86 static int get_pid(unsigned long flags)
87 {
88 static int next_safe = PID_MAX;
89 struct task_struct *p;
90 int pid, beginpid;
91
92 if (flags & CLONE_PID)
93 return current->pid;
94
95 spin_lock(&lastpid_lock);
96 beginpid = last_pid;
97 if((++last_pid) & 0xffff8000) {
98 last_pid = 300; /* Skip daemons etc. */
99 goto inside;
100 }
101 if(last_pid >= next_safe) {
102 inside:
103 next_safe = PID_MAX;
104 read_lock(&tasklist_lock);
105 repeat:
106 for_each_task(p) {
107 if(p->pid == last_pid ||
108 p->pgrp == last_pid ||
109 p->tgid == last_pid ||
110 p->session == last_pid) {
111 if(++last_pid >= next_safe) {
112 if(last_pid & 0xffff8000)
113 last_pid = 300;
114 next_safe = PID_MAX;
115 }
116 if(unlikely(last_pid == beginpid))
117 goto nomorepids;
118 goto repeat;
119 }
120 if(p->pid > last_pid && next_safe > p->pid)
121 next_safe = p->pid;
122 if(p->pgrp > last_pid && next_safe > p->pgrp)
123 next_safe = p->pgrp;
124 if(p->tgid > last_pid && next_safe > p->tgid)
125 next_safe = p->tgid;
126 if(p->session > last_pid && next_safe > p->session)
127 next_safe = p->session;
128 }
129 read_unlock(&tasklist_lock);
130 }
131 pid = last_pid;
132 spin_unlock(&lastpid_lock);
133
134 return pid;
135
136 nomorepids:
137 read_unlock(&tasklist_lock);
138 spin_unlock(&lastpid_lock);
139 return 0;
140 }The following SPIN version of the get_pid function was developed using C preprocessor macros to try to keep the Promela code as close to the original C as possible. There are obvious difficulties if verification requires a language translation to discover errors and then another translation to insert the fixes into the original code. SPIN was never intended for the purpose of verifying C code so the following criticism is somewhat unfair. The task list had to be coded as an array, for example.
The test driver creates two threads that repetitively allocate pids. Unfortunately, the test "blows up" the SPIN runtime, which generates a spurious "too many processes" error.
#define get_pid_f(v,x) {int z; z=(run get_pid_p(x)); get_pid_q??eval(z),v;}
chan get_pid_q=[20] of {int,int};
#define return1(a,b) a##_q!_pid,(b)
#define for_each_task(p) {p=TS_N-1; do ::(p<=0)->break; ::else->
#define end_for_each_task(p) p--; od;}
static int next_safe = PID_MAX;
static function1(int, get_pid, unsigned long flags)
{ bool b;
ref_task_struct p;
int upid, beginpid;
if
::(flags & CLONE_PID)->return1(get_pid, ts[current].upid);
::else->skip;
fi;
spin_lock(adr(lastpid_lock));
beginpid = last_pid;
last_pid++;
if
::(last_pid & PID_MAX_TEST)->{
last_pid = 300; /* Skip daemons etc. */
b=true;
}
::else->b=false;
fi;
if
::b||(last_pid >= next_safe)->{
next_safe = PID_MAX;
read_lock(adr(tasklist_lock));
repeat:
for_each_task(p)
if
::(ts[p].upid == last_pid)||(ts[p].pgrp == last_pid)||(ts[p].tgid == last_pid)||(ts[p].session == last_pid)->{
last_pid++;
if
::(last_pid >= next_safe)->{
if
::(last_pid & PID_MAX_TEST)->last_pid = 300;
::else->skip;
fi;
next_safe = PID_MAX;
};
::else->skip;
fi;
if
::(unlikely(last_pid == beginpid))->
read_unlock(adr(tasklist_lock)); spin_unlock(adr(lastpid_lock));
return1(get_pid, 0);
::else->goto repeat;
fi;
};
else->skip;
fi;
if
::(ts[p].upid > last_pid && next_safe > ts[p].upid)->next_safe = ts[p].upid;
::else->skip;
fi;
if
::(ts[p].pgrp > last_pid && next_safe > ts[p].pgrp)->next_safe = ts[p].pgrp;
::else->skip;
fi;
if
::(ts[p].tgid > last_pid && next_safe > ts[p].tgid)->next_safe = ts[p].tgid;
::else->skip;
fi;
if
::(ts[p].session > last_pid && next_safe > ts[p].session)->next_safe = ts[p].session;
::else->skip;
fi;
end_for_each_task(p);
read_unlock(adr(tasklist_lock));
}
::else->skip;
fi;
upid = last_pid;
spin_unlock(adr(lastpid_lock));
return1(get_pid, upid);
}
/*********************************************************************/
proctype tester()
{ int i=0;
do ::(i<305)->get_pid_f(i,SIGCHLD); printf("fork return=%d\n",i); ::else->break; od;
}
init {
int i;
module_init1(fork_init, 800);
i=TS_N-1;
do ::(i>1)->ts[i].state=STATE_UNUSED; i--; ::else->break; od;
run tester(); run tester();
(false);
}
The final step was to recode the algorithm using the StarLite prototyping environment (http://bcook.cs.gasou.edu). The environment supports a "safe" VM runtime and a compiler for a subset of C++. The StarLite library has a concurrent programming and simulation package. The method was to augment the code with assertions and then to use the simulation clock to vary the time spent in critical sections. Multiple threads were also created to "stress" critical sections. The VM can "force" a context switch every few instructions, which is a level of concurrency that would be impossible to achieve on a physical machine. The resulting higher-confidence algorithm is listed below.
\#include <Help.h>
\#include <PROCESS.h>
\#include <iostream.h>
\#include <Simulation.h>
#define static
#define int long
#define unsigned
#define volatile
#define adr(a) a
#define unlikely(a) a
#define SPIN_LOCK_UNLOCKED 0L
#define SPIN_LOCK_LOCKED 1L
#define SPIN_LOCK_READ_LOCKED -1L
#define RLIM_NLIMITS 2
#define CLONE_PID 1L
#define TS_N 500
#define mm_struct int
#define uid_t long
#define pid_t long
#define ref_user_struct short
#define ref_task_struct short
#define linux_binfmt int
#define X Hold(1L)
#define _pid Self()
#define assert(x) Assert(x)
#define spin_init(a) a.b=SPIN_LOCK_UNLOCKED; a.id=Process(NULL);
#define spin_lock(x) assert(x.id!=_pid); for (;x.b!=SPIN_LOCK_UNLOCKED;){cout<<"spin\n";X;}; x.b=SPIN_LOCK_LOCKED; x.id=_pid;
#define read_lock(x) assert(x.id!=_pid); while (x.b>SPIN_LOCK_UNLOCKED){X;}; x.b--; x.id=_pid;
#define spin_unlock(x) assert((x.id==_pid)&&(x.b>0L)); x.b=SPIN_LOCK_UNLOCKED; x.id=Process(NULL);X;
#define read_unlock(x) assert(x.b<0L); x.b++; if (x.b==SPIN_LOCK_UNLOCKED) x.id=Process(NULL);X;
#define for_each_task(p) for(p=TS_N-1;p>=0;p--)
/*------------------------------------*/
/*
* This controls the default maximum pid allocated to a process
*/
#define PID_POWER 9L
#define PID_MAX ((1L<<PID_POWER)-1L)
#define PID_MAX_TEST (-PID_MAX-1L)
#define PID_DAEMON_MAX 300L
typedef struct{unsigned long rlim_cur; unsigned long rlim_max; }rlimit;
typedef struct {
/* offsets of these are hardcoded elsewhere - touch with care */
volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
mm_struct mm;
int did_exec, swappable;
/* process credentials */
uid_t uid,euid,suid,fsuid;
ref_user_struct user;
/* limits */ rlimit rlim[RLIM_NLIMITS];
/* task state */
linux_binfmt binfmt;
pid_t pid,session,pgrp,tgid;
} task_struct;
typedef struct{
int b;
Process id;
}spinlock_t;
static task_struct ts[TS_N];
static ref_task_struct current;
static spinlock_t lastpid_lock;
static spinlock_t tasklist_lock;
static pid_t next_safe = PID_MAX, last_pid=PID_DAEMON_MAX;
static boolean get_pid(pid_t &target, unsigned long flags)
{ boolean b=false;
ref_task_struct p;
pid_t beginpid;
if ((flags & CLONE_PID)!=0L) {target=ts[current].pid; return true;};
spin_lock(adr(lastpid_lock));
beginpid = last_pid;
last_pid++;
if ((last_pid & PID_MAX_TEST)!=0L) {
last_pid = PID_DAEMON_MAX; /* Skip daemons etc. */
b=true;
};
if (!b&&(last_pid < next_safe)) {
target=last_pid; spin_unlock(adr(lastpid_lock)); return true;
};
next_safe = PID_MAX; b=true;
while(b) {
b=false;
read_lock(adr(tasklist_lock));
for_each_task(p) { X; cout<<long(Self())<<" "<<beginpid<<" "<<last_pid<<" "<<next_safe<<endl;
if ((ts[p].pid == last_pid)||(ts[p].pgrp == last_pid)||(ts[p].tgid == last_pid)||(ts[p].session == last_pid)) {
last_pid++;
if (last_pid >= next_safe) {
assert(beginpid>PID_DAEMON_MAX); /*otherwise infinite loop*/
if ((last_pid & PID_MAX_TEST)!=0L) last_pid = PID_DAEMON_MAX;
next_safe = PID_MAX;
};
if (unlikely(last_pid == beginpid)) {
next_safe=-1L;
read_unlock(adr(tasklist_lock)); spin_unlock(adr(lastpid_lock));
return false;
}
b=true;
break;
};
if ((ts[p].pid > last_pid) && (next_safe > ts[p].pid)) next_safe = ts[p].pid;
if ((ts[p].pgrp > last_pid) && (next_safe > ts[p].pgrp)) next_safe = ts[p].pgrp;
if ((ts[p].tgid > last_pid) && (next_safe > ts[p].tgid)) next_safe = ts[p].tgid;
if ((ts[p].session > last_pid) && (next_safe > ts[p].session)) next_safe = ts[p].session;
}; //for
read_unlock(adr(tasklist_lock));
}; //while
target = last_pid;
spin_unlock(adr(lastpid_lock));
return true;
};
short j=0;
void test()
{int i; short k,m;
for (;;) {m=j; j++; if (!get_pid(ts[m].pid, 0L)) break;
i=ts[m].pid;
cout<<long(Self())<<" "<<i<<endl; X;
for (k=TS_N-1; k>=0; k--) if ((ts[k].pid==i)&&(k!=m)) assert(false);
};
HaltProcess(Self());
}
void main()
{ int i;
assert(PID_POWER>1L);
assert((PID_DAEMON_MAX>=0L)&&(PID_MAX>PID_DAEMON_MAX));
assert((PID_MAX|PID_MAX_TEST)==-1L);
spin_init(adr(lastpid_lock));
spin_init(adr(tasklist_lock));
RunProcess(CreateProcess(test,NULL,0,NULL,0));
RunProcess(CreateProcess(test,NULL,0,NULL,0));
HaltProcess(Self());
};