From: Luca Tettamanti on 21 Jun 2010 16:10

On Fri, Jun 18, 2010 at 09:49:44AM -0500, Christoph Lameter wrote:
> Can produce it with make-kpkg building a kernel.
[...]
> linux-2.6$ strace -p21561
> Process 21561 attached - interrupt to quit
> semop(32768, {{0, -1, SEM_UNDO}}, 1
>
> linux-2.6$ strace -p21751
> Process 21751 attached - interrupt to quit
> semop(32768, {{0, -1, SEM_UNDO}}, 1
>
> linux-2.6$ strace -p21792
> Process 21792 attached - interrupt to quit
> semop(32768, {{0, -1, SEM_UNDO}}, 1
>
> linux-2.6$ strace -p21793
> Process 21793 attached - interrupt to quit
> semop(32768, {{0, -1, SEM_UNDO}}, 1

Ah! I was trying to understand what was going on with apache...
I see the same symptoms with apache and prefork module: each child serves one request and then just hangs until it's recycled. Strace shows the same semop syscall.

# ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 65536      www-data   600        1

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

# cat /proc/sysvipc/sem
key  semid  perms  nsems  uid  gid  cuid  cgid  otime       ctime
0    65536  600    1      33   33   0     0     1277149940  1277149903

# /tmp/getall 65536 -v
getall <id> [-v]
found 1 semaphores.
0: 0 (cnt 4 zcnt 0)

Luca
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
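The source of /tmp/getall is not part of this thread. As a rough guess at what such a helper might do, the sketch below queries a semaphore set with the standard semctl() commands IPC_STAT, GETVAL, GETNCNT and GETZCNT. The function name dump_sem_set and the union name semun_sketch are made up for illustration, and mapping "cnt" to GETNCNT (tasks waiting for the value to increase) is an assumption based on the output format above.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller has to declare the semctl() argument union itself.
 * The name is hypothetical, chosen to avoid clashing with systems that
 * already define union semun. */
union semun_sketch {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Print value, ncnt and zcnt for every semaphore in the set.
 * Returns the number of semaphores, or -1 if any semctl() call fails. */
static int dump_sem_set(int semid)
{
    struct semid_ds ds;
    union semun_sketch arg;

    arg.buf = &ds;
    if (semctl(semid, 0, IPC_STAT, arg) == -1)
        return -1;

    int nsems = (int)ds.sem_nsems;
    printf("found %d semaphores.\n", nsems);

    for (int i = 0; i < nsems; i++) {
        int val  = semctl(semid, i, GETVAL);   /* current value */
        int ncnt = semctl(semid, i, GETNCNT);  /* waiters for an increase */
        int zcnt = semctl(semid, i, GETZCNT);  /* waiters for zero */
        if (val == -1 || ncnt == -1 || zcnt == -1)
            return -1;
        printf("%d: %d (cnt %d zcnt %d)\n", i, val, ncnt, zcnt);
    }
    return nsems;
}
```

Read under that assumption, "0: 0 (cnt 4 zcnt 0)" would mean the single semaphore has value 0 with four tasks blocked waiting to decrement it, which fits the semop(..., {{0, -1, SEM_UNDO}}, 1 calls seen in strace.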
From: Manfred Spraul on 23 Jun 2010 12:30

Hi,

I think I found it:
Previously, queue.status was never IN_WAKEUP when the semaphore spinlock was held.

The last patch changes that:
Now the change from IN_WAKEUP to the final result code happens after the semaphore spinlock is dropped.
Thus a task can observe IN_WAKEUP even when it acquired the semaphore spinlock.

As a result, semop() sometimes returned 1 (IN_WAKEUP) for a successful operation.

Attached is a patch that should fix the bug.

--
Manfred
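The attached patch is not reproduced in this archive. To make the described race concrete, here is a minimal user-space sketch, assuming a two-step wakeup in which the waker first stores the transient IN_WAKEUP marker and only later stores the final result. The struct and function names are hypothetical and are not taken from ipc/sem.c; the value 1 for IN_WAKEUP comes from the message above. It shows one plausible way to avoid the bogus return, namely never treating IN_WAKEUP as a result, and it is not claimed to be what the patch does.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Transient marker; 1 matches the bogus semop() return described above. */
#define IN_WAKEUP 1

/* Hypothetical stand-in for the per-waiter queue entry. */
struct sem_queue_sketch {
    atomic_int status;
};

/* Waker, step 1: flag the sleeper as "wakeup in progress". */
static void waker_begin(struct sem_queue_sketch *q)
{
    atomic_store_explicit(&q->status, IN_WAKEUP, memory_order_release);
}

/* Waker, step 2: publish the final result code. In the scenario from the
 * thread this happens after the semaphore spinlock has been dropped. */
static void waker_finish(struct sem_queue_sketch *q, int result)
{
    assert(result != IN_WAKEUP); /* the marker must never be a final result */
    atomic_store_explicit(&q->status, result, memory_order_release);
}

/* Naive reader: returns whatever is stored, including the transient marker. */
static int result_naive(struct sem_queue_sketch *q)
{
    return atomic_load_explicit(&q->status, memory_order_acquire);
}

/* Careful reader: spins until the waker has stored the final result. */
static int result_wait_final(struct sem_queue_sketch *q)
{
    int s;

    while ((s = atomic_load_explicit(&q->status, memory_order_acquire)) == IN_WAKEUP)
        sched_yield();
    return s;
}
```

If result_naive() runs between waker_begin() and waker_finish(), it returns 1 even though the operation succeeded, which is the symptom described above. result_wait_final() only ever returns the value passed to waker_finish().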
From: Luca Tettamanti on 23 Jun 2010 15:20

On Wed, Jun 23, 2010 at 6:29 PM, Manfred Spraul <manfred(a)colorfullife.com> wrote:
> Hi,
>
> I think I found it:
> Previously, queue.status was never IN_WAKEUP when the semaphore spinlock was
> held.
>
> The last patch changes that:
> Now the change from IN_WAKEUP to the final result code happens after the
> semaphore spinlock is dropped.
> Thus a task can observe IN_WAKEUP even when it acquired the semaphore
> spinlock.
>
> As a result, semop() sometimes returned 1 (IN_WAKEUP) for a successful
> operation.
>
> Attached is a patch that should fix the bug.

Apache seems fine.

Tested-by: Luca Tettamanti <kronos.it(a)gmail.com>

thanks,
Luca
From: Christoph Lameter on 23 Jun 2010 16:30

On Wed, 23 Jun 2010, Manfred Spraul wrote:
> Attached is a patch that should fix the bug.

I have not seen the bug since I applied the fix.
From: Luca Tettamanti on 24 Jun 2010 15:30

On Wed, Jun 23, 2010 at 9:14 PM, Luca Tettamanti <kronos.it(a)gmail.com> wrote:
> On Wed, Jun 23, 2010 at 6:29 PM, Manfred Spraul
> <manfred(a)colorfullife.com> wrote:
>> Hi,
>>
>> I think I found it:
>> Previously, queue.status was never IN_WAKEUP when the semaphore spinlock was
>> held.
>>
>> The last patch changes that:
>> Now the change from IN_WAKEUP to the final result code happens after the
>> semaphore spinlock is dropped.
>> Thus a task can observe IN_WAKEUP even when it acquired the semaphore
>> spinlock.
>>
>> As a result, semop() sometimes returned 1 (IN_WAKEUP) for a successful
>> operation.
>>
>> Attached is a patch that should fix the bug.
>
> Apache seems fine.

Argh, "seems" was indeed appropriate.
Manfred, your patch does alleviate the problem, but something is still wrong. I noticed (I'm developing an ajax-heavy web app) that sometimes an apache worker hangs; I can reproduce the problem with ab (apache benchmark) and a high concurrency level (I'm testing with 100 and 10k requests, and I get only 2-5 dropped requests).
This does not happen with 2.4.34.

Any idea on how I can debug this further?

Luca