From: Andrew Poelstra on 12 Jan 2010 17:16

On 2010-01-12, cerr <ron.eggler(a)gmail.com> wrote:
> On Jan 12, 2:07 pm, cerr <ron.egg...(a)gmail.com> wrote:
>> On Jan 12, 2:02 pm, John Gordon <gor...(a)panix.com> wrote:
>> > In <e304a543-a511-4c8a-9179-be00acaaf...(a)k17g2000yqh.googlegroups.com> cerr <ron.egg...(a)gmail.com> writes:
>> >
>> > > I just saw that I got a SIGABRT - but that twice only... :o may this
>> > > be a clue?
>> > > How would my process be getting a SIGABRT? Any clues? :o
>> >
>> > SIGABRT is raised by calling the abort() system call, or when an assert()
>> > evaluates to false.
>> >
>> > Does your program contain any abort() or assert() calls?
>>
>> Yes, there's abort() calls in a file called memwatch.c - I guess I
>> should have a look at this one...
>
> Oh look there, it's a huge file and in the header it says:
>
> ** MEMWATCH.C
> ** Nonintrusive ANSI C memory leak / overwrite detection
> ** Copyright (C) 1992-2001 Johan Lindh
> ** All rights reserved.
> ** Version 2.67
>
> So after all I probably sent myself these signals...Does anyone know
> anything about this memwatch.c file - gotta make myself smart 1st :)?

I don't know anything about memwatch.c but I bet if you added a printf
before each abort() call, you'd be able to learn something about what's
going wrong.
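The "printf before each abort()" idea above could be sketched roughly as follows. This is only an illustration, not code from memwatch.c: the names format_abort_site and ABORT_WITH_LOG are hypothetical, and the sketch assumes stderr is visible on the target (syslog() could be substituted where it is not).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats "abort at FILE:LINE in FUNC" into buf.
 * Returns what snprintf returns (the length the full text needs). */
static int format_abort_site(char *buf, size_t size,
                             const char *file, int line, const char *func)
{
    return snprintf(buf, size, "abort at %s:%d in %s", file, line, func);
}

/* Hypothetical stand-in for a bare abort() call: report the call site
 * on stderr, flush so the line is not lost, then abort as before. */
#define ABORT_WITH_LOG()                                         \
    do {                                                         \
        char site_[256];                                         \
        format_abort_site(site_, sizeof site_,                   \
                          __FILE__, __LINE__, __func__);         \
        fprintf(stderr, "%s\n", site_);                          \
        fflush(stderr);                                          \
        abort();                                                 \
    } while (0)
```

Replacing each abort() in the suspect file with ABORT_WITH_LOG() would then show which call site fired, if any of them did.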
From: cerr on 12 Jan 2010 17:37

On Jan 12, 2:16 pm, Andrew Poelstra <apoels...(a)localhost.localdomain> wrote:
> On 2010-01-12, cerr <ron.egg...(a)gmail.com> wrote:
>
> > On Jan 12, 2:07 pm, cerr <ron.egg...(a)gmail.com> wrote:
> >> On Jan 12, 2:02 pm, John Gordon <gor...(a)panix.com> wrote:
> >> > In <e304a543-a511-4c8a-9179-be00acaaf...(a)k17g2000yqh.googlegroups.com> cerr <ron.egg....(a)gmail.com> writes:
> >> >
> >> > > I just saw that I got a SIGABRT - but that twice only... :o may this
> >> > > be a clue?
> >> > > How would my process be getting a SIGABRT? Any clues? :o
> >> >
> >> > SIGABRT is raised by calling the abort() system call, or when an assert()
> >> > evaluates to false.
> >> >
> >> > Does your program contain any abort() or assert() calls?
> >>
> >> Yes, there's abort() calls in a file called memwatch.c - I guess I
> >> should have a look at this one...
> >
> > Oh look there, it's a huge file and in the header it says:
> >
> > ** MEMWATCH.C
> > ** Nonintrusive ANSI C memory leak / overwrite detection
> > ** Copyright (C) 1992-2001 Johan Lindh
> > ** All rights reserved.
> > ** Version 2.67
> >
> > So after all I probably sent myself these signals...Does anyone know
> > anything about this memwatch.c file - gotta make myself smart 1st :)?
>
> I don't know anything about memwatch.c but I bet if you added a printf
> before each abort() call, you'd be able to learn something about what's
> going wrong.

Yup, I put a couple of syslog calls in there, but there's probably still
something else going on, because I can't find any kill statements or
anything... :(
From: Ersek, Laszlo on 12 Jan 2010 20:18

In article <cb714bed-32c7-4ed6-a3c7-b6f8a02c9c8f(a)j24g2000yqa.googlegroups.com>, cerr <ron.eggler(a)gmail.com> writes:

> This GDB was configured as "i586-linux-uclibc".
> [root(a)DEVNEMS logrecord]# ldd prs
>         libpthread.so.0 => /lib/libpthread.so.0 (0xb7f58000)
>         libssl.so.0.9.7 => /usr/lib/libssl.so.0.9.7 (0xb7f31000)
>         librt.so.0 => /lib/librt.so.0 (0xb7f2f000)
>         libstdc++.so.6 => /lib/libstdc++.so.6 (0xb7ebc000)
>         libm.so.0 => /lib/libm.so.0 (0xb7eae000)
>         libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7ea6000)
>         libc.so.0 => /lib/libc.so.0 (0xb7e5a000)
>         libcrypto.so.0.9.7 => /usr/lib/libcrypto.so.0.9.7 (0xb7d8b000)
>         libdl.so.0 => /lib/libdl.so.0 (0xb7d88000)
>         ld-uClibc.so.0 => /lib/ld-uClibc.so.0 (0xb7f6d000)
>
> I'm not quite certain what this would tell us tho... :(

At least it allows for some wild speculation :)

First, memwatch:

http://www.linkdata.se/memwatch

I'd risk after a very superficial look at memwatch that it does some
nifty signals hacking. If your program is multi-threaded, that's not
very easy. The memwatch USING itself says:

    Is this stuff thread-safe?

    I doubt it. As of version 2.66, there is rudimentary support
    for threads, if you happen to be using Win32 or if you have
    pthreads. Define WIN32 or MW_PTHREADS to signify this fact.

    This will cause a global mutex to be created, and memwatch
    will lock it when accessing the global memory chain, but it's
    still far from certified threadsafe.

Second, you use uclibc. I have no idea whether memwatch was developed
for / tested with uclibc. I'd say try building the app with memwatch
disabled. I don't know if uclibc ships its own pthreads implementation,
but if so, it may use some signals internally. (At least before NPTL,
glibc used LinuxThreads which utilized some realtime (queued) signals.
Or so I remember.)

Third, before you start the application in gdb, set breakpoints at
pthread_kill(), kill(), and raise(). (You may need system library debug
symbols for this.) Whenever you stop in one of them, get a backtrace.
Some system library (eg. the pthreads implementation) might detect such
a mess that it has no choice but to kill the process.

Fourth, are you sure your kernel and syslog are configured for maximum
verbosity? Did you check all syslog files, dmesg etc?

Did the app always behave like this? Didn't you change platforms
recently or so? Did you go multi-threaded recently?

Good luck,
lacos
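Alongside the gdb breakpoints suggested above, the process can also record who sent it a signal. The sketch below is an assumption-laden illustration, not something from the thread: record_sender and watch_signal are hypothetical names, and it relies on the POSIX sigaction() interface with SA_SIGINFO being available on the target libc.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Written by the handler. volatile sig_atomic_t is the type that may be
 * safely stored to from a signal handler. Assumption: pid_t fits in
 * sig_atomic_t, which holds on common Linux ABIs. */
static volatile sig_atomic_t last_signo = 0;
static volatile sig_atomic_t last_code = 0;
static volatile sig_atomic_t last_sender = 0;

static void record_sender(int signo, siginfo_t *info, void *context)
{
    (void)context;
    last_signo = signo;
    last_code = info->si_code;    /* SI_USER: sent with kill() by a process */
    last_sender = info->si_pid;   /* sender's pid, meaningful for SI_USER   */
}

/* Hypothetical helper: install record_sender for one catchable signal.
 * Returns 0 on success, -1 on error. */
static int watch_signal(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_sender;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

Calling watch_signal(SIGABRT) or watch_signal(SIGTERM) early in main() and logging the three recorded values afterwards would show whether the signal came from another process. SIGKILL cannot be caught, so for that signal only external tools such as the debugger or kernel logs can help.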
From: guenther on 12 Jan 2010 21:43

On Jan 12, 10:25 am, sc...(a)slp53.sl.home (Scott Lurndal) wrote:
> "guent...(a)gmail.com" <guent...(a)gmail.com> writes:
...
> >There are situations under which the kernel will send SIGKILL to a
> >process. Others have mentioned the Linux OOM killer; a more rarely
> >seen one is if you have a CPU-time resource hard limit set (such as
> >via the ulimit shell-builtin) then the kernel will send the process a
> >SIGKILL when the limit is reached.
>
> I think the cpu hard limit sends SIGXCPU, not SIGKILL.

SIGXCPU is sent when you reach the *soft* limit; SIGKILL when you reach
the hard limit. At least that's what the setrlimit() manpage and kernel
sources say on the RHEL5 system I'm looking at.

Philip Guenther
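The soft/hard distinction described above can be made visible from inside a program. The following is a minimal sketch under stated assumptions: set_soft_cpu_limit and on_sigxcpu are hypothetical names, and only the standard getrlimit(), setrlimit() and sigaction() calls are used.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/resource.h>

static volatile sig_atomic_t got_sigxcpu = 0;

static void on_sigxcpu(int signo)
{
    (void)signo;
    got_sigxcpu = 1;   /* soft limit reached; the hard limit is still ahead */
}

/* Hypothetical helper: catch SIGXCPU and set the soft CPU-time limit to
 * `seconds`, leaving the hard limit untouched. The request is clamped to
 * the hard limit, because a soft limit may not exceed it.
 * Returns 0 on success, -1 on error. */
static int set_soft_cpu_limit(rlim_t seconds)
{
    struct sigaction sa;
    struct rlimit rl;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigxcpu;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGXCPU, &sa, NULL) != 0)
        return -1;

    if (getrlimit(RLIMIT_CPU, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && seconds > rl.rlim_max)
        seconds = rl.rlim_max;
    rl.rlim_cur = seconds;
    return setrlimit(RLIMIT_CPU, &rl);
}
```

With a soft limit in place, a process that checks got_sigxcpu gets a chance to log and exit cleanly. If it instead dies without that flag ever being set, a CPU-time limit is a less likely cause of the SIGKILL.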
From: cerr on 13 Jan 2010 13:28

On Jan 12, 5:18 pm, la...(a)ludens.elte.hu (Ersek, Laszlo) wrote:
> In article <cb714bed-32c7-4ed6-a3c7-b6f8a02c9...(a)j24g2000yqa.googlegroups..com>, cerr <ron.egg...(a)gmail.com> writes:
>
> > This GDB was configured as "i586-linux-uclibc".
> > [root(a)DEVNEMS logrecord]# ldd prs
> >         libpthread.so.0 => /lib/libpthread.so.0 (0xb7f58000)
> >         libssl.so.0.9.7 => /usr/lib/libssl.so.0.9.7 (0xb7f31000)
> >         librt.so.0 => /lib/librt.so.0 (0xb7f2f000)
> >         libstdc++.so.6 => /lib/libstdc++.so.6 (0xb7ebc000)
> >         libm.so.0 => /lib/libm.so.0 (0xb7eae000)
> >         libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7ea6000)
> >         libc.so.0 => /lib/libc.so.0 (0xb7e5a000)
> >         libcrypto.so.0.9.7 => /usr/lib/libcrypto.so.0.9.7 (0xb7d8b000)
> >         libdl.so.0 => /lib/libdl.so.0 (0xb7d88000)
> >         ld-uClibc.so.0 => /lib/ld-uClibc.so.0 (0xb7f6d000)
> >
> > I'm not quite certain what this would tell us tho... :(
>
> At least it allows for some wild speculation :)

aha, hehe :)

> First, memwatch:
>
> http://www.linkdata.se/memwatch
>
> I'd risk after a very superficial look at memwatch that it does some
> nifty signals hacking. If your program is multi-threaded, that's not
> very easy. The memwatch USING itself says:
>
>     Is this stuff thread-safe?
>
>     I doubt it. As of version 2.66, there is rudimentary support
>     for threads, if you happen to be using Win32 or if you have
>     pthreads. Define WIN32 or MW_PTHREADS to signify this fact.
>
>     This will cause a global mutex to be created, and memwatch
>     will lock it when accessing the global memory chain, but it's
>     still far from certified threadsafe.

I'm using multiple threads, yes, and I am using pthreads on Linux. So
the global mutex should be able to lock these things, but then again, it
certainly is suspicious that the process started receiving SIGKILLs when
I added another thread that does lots of dynamic memory allocation..

> Second, you use uclibc. I have no idea whether memwatch was developed
> for / tested with uclibc. I'd say try building the app with memwatch
> disabled. I don't know if uclibc ships its own pthreads implementation,
> but if so, it may use some signals internally. (At least before NPTL,
> glibc used LinuxThreads which utilized some realtime (queued) signals.
> Or so I remember.)

That would be the next step - to build it without memwatch, yes! As I
wrote these replies from the bottom up, I would like to see if my
breakpoints kick in successfully first - because if I receive SIGKILLs
without the breakpoints kicking in, we can safely say it comes from
something else, right?

> Third, before you start the application in gdb, set breakpoints at
> pthread_kill(), kill(), and raise(). (You may need system library debug
> symbols for this.) Whenever you stop in one of them, get a backtrace.
> Some system library (eg. the pthreads implementation) might detect such
> a mess that it has no choice but to kill the process.

Well, I connected to the remote target, hit continue and ctrl-c-ed back
to the (gdb) prompt in order to add breakpoints like:

(gdb) break pthread_kill()
Function "pthread_kill()" not defined.
Make breakpoint pending on future shared library load? (y or [n])
(gdb) break kill()
Function "kill()" not defined.
Make breakpoint pending on future shared library load? (y or [n])
(gdb) break raise()
Function "raise()" not defined.
Make breakpoint pending on future shared library load? (y or [n])
(gdb) continue
Continuing.

I have very little experience with gdb and hope I did this correctly...?

> Fourth, are you sure your kernel and syslog are configured for maximum
> verbosity? Did you check all syslog files, dmesg etc?

I don't know about the kernel, but syslog-ng.conf only includes one filter:

filter f_notice {not level (info); };

> Did the app always behave like this? Didn't you change platforms
> recently or so? Did you go multi-threaded recently?

No, but I added another thread to the app that's responsible for
sending off the log lines, rather than just using syslog-ng with a
defined remote target, because we want to verify that the logserver is
sending a layer 7 acknowledgement back..