From: John Kelly on 20 Jun 2010 16:17

I found some introductory material on the reader writer problem and
semaphores at:

  http://greenteapress.com/semaphores/

and I want to code a reader writer solution in C, using POSIX semaphores
on Linux (for multiple processes, not threads).

In his book, Downey mentions weak and strong semaphores.  Anyone know if
POSIX semaphores are weak or strong on Linux?

--
Web mail, POP3, and SMTP
http://www.beewyz.com/freeaccounts.php
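For reference, a minimal sketch of the kind of setup the question is about: a named POSIX semaphore opened by unrelated processes (the name "/rw_demo" and the mutex-style initial value of 1 are invented for illustration; link with -pthread). Note that POSIX leaves the wake-up order of blocked waiters unspecified except under real-time scheduling policies, which is essentially the weak/strong distinction Downey describes.

#include <fcntl.h>      /* O_CREAT */
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Create or open a named semaphore visible to unrelated processes.
     * An initial value of 1 makes it behave like a mutex. */
    sem_t *sem = sem_open("/rw_demo", O_CREAT, 0644, 1);
    if (sem == SEM_FAILED) {
        perror("sem_open");
        exit(EXIT_FAILURE);
    }

    if (sem_wait(sem) == 0) {          /* enter critical section */
        /* ... read or write the shared resource ... */
        sem_post(sem);                 /* leave critical section */
    }

    sem_close(sem);
    /* sem_unlink("/rw_demo"); run once at cleanup, not in every process */
    return 0;
}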
From: John Kelly on 21 Jun 2010 14:37

On Sun, 20 Jun 2010 20:17:17 +0000, John Kelly <jak(a)isp2dial.com> wrote:

> Anyone know if POSIX semaphores are weak or strong on Linux?

I found some Apache docs

  http://httpd.apache.org/docs/2.0/misc/perf-tuning.html

that say Apache implements a serialization mutex with flock(), sysvsem,
or posixsem -- but if using POSIX semaphores:

> The semaphore ownership is not recovered if a thread in the process
> holding the mutex segfaults, resulting in a hang of the web server.

Ugh.  So I guess semaphores are better suited to multiple threads in a
single process than to multi-process apps.  If a process is killed or
dies in a critical section, a semaphore can remain in a locked state.

I hear SYSV semaphores have a rollback capability in case of abnormal
termination, but I also hear they have some undesirable qualities, such
as performance and system limits.

Since Apache prefers flock(), I wondered why not use it in place of
POSIX semaphores.  It seems that flock() provides an easy solution to
the reader writer problem:

  READER            WRITER

  EX lock gate      EX lock gate
  SH lock data      EX lock data
  unlock gate       unlock gate
  read data         write data
  unlock data       unlock data

  EX == flock() EXCLUSIVE
  SH == flock() SHARED

  "gate" and "data" are lockfiles, say in /tmp

This prevents writer starvation; while a writer waits for its EX data
lock, the gate is still locked, and no other readers or writers can get
thru the gate until the writer gets its EX lock and unlocks the gate.
If a process dies or is killed, the kernel cleans up its file locks, and
I don't have to worry about them.

It seems the only other question is performance, semaphores vs. flock().
A quick test of lock/unlock with flock() ran 770,000 loops per second; a
loop of post/wait with semaphores ran 2,700,000 loops per second, so the
semaphore was about 4 times faster.  But considering that semaphores can
leave a multi-process app hung, I think flock() performance will be fine
for my purposes.

I also compared the two on a faster server box that runs a different
Linux kernel version; flock() ran at about the same speed, but the
semaphore loop ran an order of magnitude faster.  Strange.

Nevertheless, I think I'll use flock(), and let the kernel worry about
lock cleanup.

--
Web mail, POP3, and SMTP
http://www.beewyz.com/freeaccounts.php
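A minimal C sketch of the gate/data protocol above, assuming two lockfiles /tmp/gate.lock and /tmp/data.lock (the names and the reduced error handling are only for illustration). Each process opens its own descriptors; if a process dies while holding a lock, the kernel releases its flock() locks when the descriptors are closed.

#include <sys/file.h>   /* flock() */
#include <fcntl.h>      /* open()  */
#include <stdio.h>
#include <stdlib.h>

static int open_lockfile(const char *path)
{
    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0) { perror(path); exit(EXIT_FAILURE); }
    return fd;
}

static void reader(int gate, int data)
{
    flock(gate, LOCK_EX);   /* EX lock gate */
    flock(data, LOCK_SH);   /* SH lock data */
    flock(gate, LOCK_UN);   /* unlock gate  */
    /* ... read data ... */
    flock(data, LOCK_UN);   /* unlock data  */
}

static void writer(int gate, int data)
{
    flock(gate, LOCK_EX);   /* EX lock gate */
    flock(data, LOCK_EX);   /* EX lock data */
    flock(gate, LOCK_UN);   /* unlock gate  */
    /* ... write data ... */
    flock(data, LOCK_UN);   /* unlock data  */
}

int main(void)
{
    int gate = open_lockfile("/tmp/gate.lock");
    int data = open_lockfile("/tmp/data.lock");

    reader(gate, data);     /* a reading process would call only this  */
    writer(gate, data);     /* a writing process would call only this  */
    return 0;
}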
From: William Ahern on 21 Jun 2010 16:07

John Kelly <jak(a)isp2dial.com> wrote:
<snip>
> It seems the only other question is performance, semaphores vs. flock().

flock() will be slower because it's a pure kernel interface, whereas the
fast-path for semaphore and mutex implementations can be done all in
userspace.

Now, perhaps if someone reimplemented Linux's flock() with a vsyscall....

<snip>
> Nevertheless, I think I'll use flock(), and let the kernel worry about
> lock cleanup.

If it becomes a problem you can always optimize later.

You can compose something like a read/write semaphore using process-shared
POSIX mutex or barrier interfaces.  However, even here you're not
necessarily guaranteed 100% reliability; though some implementations are
capable of releasing a process' locks, it often depends on the process'
memory not having a corrupt heap.
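One hedged illustration of the recovery caveat mentioned here, not the Apache or glibc implementation: a process-shared mutex made robust and kept in POSIX shared memory, so a surviving process that inherits a dead owner's lock sees EOWNERDEAD instead of hanging. The shm name is invented, and the "initialize exactly once" coordination is waved away in a comment. Compile with -pthread (and -lrt on older glibc).

#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = shm_open("/rw_demo_shm", O_CREAT | O_RDWR, 0644);
    if (fd < 0) { perror("shm_open"); exit(EXIT_FAILURE); }
    if (ftruncate(fd, sizeof(pthread_mutex_t)) != 0) {
        perror("ftruncate"); exit(EXIT_FAILURE);
    }

    pthread_mutex_t *mtx = mmap(NULL, sizeof *mtx, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
    if (mtx == MAP_FAILED) { perror("mmap"); exit(EXIT_FAILURE); }

    /* Initialization must happen exactly once (e.g. in the first process
     * to create the segment); that coordination is omitted here. */
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(mtx, &attr);

    int rc = pthread_mutex_lock(mtx);
    if (rc == EOWNERDEAD) {
        /* Previous owner died while holding the lock; repair the protected
         * data if needed, then mark the mutex consistent again. */
        pthread_mutex_consistent(mtx);
    }
    /* ... critical section ... */
    pthread_mutex_unlock(mtx);
    return 0;
}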
From: John Kelly on 22 Jun 2010 10:45

On Mon, 21 Jun 2010 13:07:46 -0700, William Ahern
<william(a)wilbur.25thandClement.com> wrote:

> flock() will be slower because it's a pure kernel interface, whereas the
> fast-path for semaphore and mutex implementations can be done all in
> userspace.
>
> Now, perhaps if someone reimplemented Linux's flock() with a vsyscall....

I thought a vsyscall was just a faster interface to a syscall.  Is that
what you mean?

>> Nevertheless, I think I'll use flock(), and let the kernel worry about
>> lock cleanup.
>
> If it becomes a problem you can always optimize later.

It should not be a problem in my app.  I have Apache processes reading a
MySQL database, but they don't remember what they learned; they ask the
same question again and again.  Instead, I want them to look for the
answer in a BerkeleyDB first.  If the answer is not there, they will ask
a special server, via sockets, to find the answer in the MySQL database.
The special server then writes the MySQL answer to the BerkeleyDB and
notifies the Apache process to look again.

So as answers accumulate in the BerkeleyDB "directory," the number of
MySQL lookups will approach zero.  The BerkeleyDB will provide a quicker
"process local" lookup without a context switch.  The reader locks will
still occur on each http request, but the flock() calls will be
insignificant relative to my overall server load.  I just wanted a rough
idea of the relative performance of flock() vs. semaphores, to avoid
potential performance pitfalls.

> You can compose something like a read/write semaphore using
> process-shared POSIX mutex or barrier interfaces.  However, even here
> you're not necessarily guaranteed 100% reliability

Making semaphores 100% reliable for a multi-process app may not be
impossible, but it looks like more work than I want to do.

--
Web mail, POP3, and SMTP
http://www.beewyz.com/freeaccounts.php
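A rough sketch of that cache-first lookup using the Berkeley DB C API (link with -ldb); the database path and key are placeholders, and the socket round-trip to the "special server" is only marked with a comment, since that protocol is specific to the poster's application. In the real app the get() would sit inside the SH-locked "data" section of the flock() scheme upthread.

#include <db.h>         /* Berkeley DB C API */
#include <stdio.h>
#include <string.h>

int main(void)
{
    DB *dbp;
    DBT key, val;
    int ret;

    if (db_create(&dbp, NULL, 0) != 0)
        return 1;
    /* Open (or create) the local answer cache; the path is a placeholder. */
    if (dbp->open(dbp, NULL, "/var/tmp/answers.db", NULL,
                  DB_HASH, DB_CREATE, 0644) != 0)
        return 1;

    memset(&key, 0, sizeof key);
    memset(&val, 0, sizeof val);
    key.data = "some-question";
    key.size = (u_int32_t)strlen("some-question");

    /* Cache-first lookup; this read would run under the SH data lock. */
    ret = dbp->get(dbp, NULL, &key, &val, 0);
    if (ret == 0) {
        printf("cache hit: %.*s\n", (int)val.size, (char *)val.data);
    } else if (ret == DB_NOTFOUND) {
        /* Cache miss: ask the special server over a socket to resolve the
         * question against MySQL and write the answer into the BerkeleyDB,
         * then look again.  That request/notify protocol is the poster's
         * own design and is not shown here. */
    }

    dbp->close(dbp, 0);
    return 0;
}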
From: Ben Bacarisse on 27 Jun 2010 22:20

John Kelly <jak(a)isp2dial.com> writes:
> [...] It seems that flock() provides an easy solution to
> the reader writer problem:
>
>   READER            WRITER
>
>   EX lock gate      EX lock gate
>   SH lock data      EX lock data
>   unlock gate       unlock gate
>   read data         write data
>   unlock data       unlock data
>
>   EX == flock() EXCLUSIVE
>   SH == flock() SHARED
>
>   "gate" and "data" are lockfiles, say in /tmp
>
> This prevents writer starvation; while a writer waits for its EX data
> lock, the gate is still locked, and no other readers or writers can get
> thru the gate until the writer gets its EX lock and unlocks the gate.

From a cursory reading of what flock guarantees, I don't think that
writer starvation is prevented.  Readers piling up on the exclusive gate
lock can prevent a writer getting to the point where it waits for its
data lock.  I don't see any guarantee that a process that has been
waiting a long time will be granted a lock ahead of processes that have
waited for less time, but I did not read the kernel documentation --
only the man page.

<snip>
--
Ben.