From: Yousuf Khan on 9 Mar 2010 18:43

Vlad_Inhaler wrote:
> I would have no hesitation in creating a special partition for panic
> dumps, hell - if standard Linux filesystems are that sensitive I'd
> even make it VFAT or whatever else is necessary.
> I have reproducible kernel hangs under a certain kind of load, they
> are *not* temperature related and I have no way of working out what
> the hell is going on. Oh, the machine is dual-boot and I don't have
> these problems under XP.
>
> Going further into that here would be hijacking this thread, and I
> have tried that before now anyway without success.
>
> Having some sensible way of taking dumps for further analysis would be
> a really *good thing* - hell, I'd even put an additional old IDE drive
> in there as a destination device if that was what it took. Sorry, but
> that is a 'safety feature' I am not that happy with. Windows can do
> it, mainframe OSs can do it . . .

I'm not as familiar with Linux systems, at least in this case, but I have a background with Solaris systems, where kernel dumps are written to the swap partition prior to system restart. After the restart, a process runs that detects the presence of a memory dump in the swap partition and writes it out as a file into the filesystem. The presumption is that whatever caused the kernel dump during the last session will not immediately affect the new session after the reboot.

The idea behind writing to the swap partition rather than to the filesystem directly is that a bug is more likely to have affected the filesystem driver than the raw disk driver.

Yousuf Khan
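The two-phase scheme described above (write the dump to a raw area at crash time, pick it up on the next boot) can be sketched roughly as follows. This is only an illustration of the idea, not how Solaris actually lays out a dump: the magic value, the header layout and both function names are invented for the example, and the "device" is any seekable binary file object.

```python
import struct

# Hypothetical on-disk layout, invented for illustration only:
# an 8-byte magic value, an 8-byte little-endian payload length, then the payload.
MAGIC = b"KDUMP\x00\x00\x01"
HEADER = struct.Struct("<8sQ")


def write_dump(device, memory_image):
    """Crash path: write a header plus the memory image at the start of the raw device."""
    device.seek(0)
    device.write(HEADER.pack(MAGIC, len(memory_image)))
    device.write(memory_image)


def save_dump_if_present(device, out):
    """Boot path: if a dump header is found, copy the image to `out`.

    The header is then overwritten with zeros so the same dump is not
    saved again on the following boot. Returns True if a dump was saved.
    """
    device.seek(0)
    raw = device.read(HEADER.size)
    if len(raw) < HEADER.size:
        return False
    magic, length = HEADER.unpack(raw)
    if magic != MAGIC:
        return False
    out.write(device.read(length))
    device.seek(0)
    device.write(b"\x00" * HEADER.size)
    return True
```

The crash path only needs seek and write on a raw device, which mirrors the point made above: it does not depend on a working filesystem driver.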
From: Arno on 10 Mar 2010 08:20

In comp.sys.ibm.pc.hardware.storage David Brown <david.brown(a)hesbynett.removethisbit.no> wrote:
> Rod Speed wrote:
>> Ant wrote
>>> Arno wrote
>>
>>>> On the other hand, the serial interface is simple, so console
>>>> output, including error messages, will still be written to it.
>>>> If you need that output, connect a different computer to
>>>> the serial port, activate the serial console and capture
>>>> its output. I have done this a number of times, mostly to
>>>> try out experimental kernels on a cluster, but also to debug
>>>> kernel panics.
>>
>>> Can I use my old serial external dial-up modem for this?
>>
> It should be possible if the connection was up and running in advance -
> I doubt if you'd be able to get a new connection after a disaster.

That is a modem function, and you will be able to. The PC will just not be sending any data after the crash, and the modem will not store it.

>> Nope, you need a serial cable between the PCs.
>>
> That's the best idea.

It is.

>> It would be a lot better if Linux allowed a dump to a USB stick if
>> you are happy to risk the contents of the USB stick on a kernel panic.
>>
> It's the price you pay for flexibility - most of Linux doesn't know that
> you have a USB stick attached. It's all just files.

And there is the problem that after a kernel panic the device IDs may be wrong, so the wrong device gets written to. Writing to a serial line typically does not destroy anything and can be accomplished with a very simple and small piece of assembler code. Also, the mapping between ttyS<x> and the actual interface hardware address is static, unlike with disk drives, and can only describe a hardware UART. In addition, the kernel command line gets patched into the kernel and hence is immune to corruption of the data memory area.

And there is one additional problem: the serial console cannot cause a kernel panic, but the filesystem (needed to write to that USB stick) can. So the USB stick is an incomplete solution anyway, as it cannot reliably log filesystem panics.

Arno
-- 
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno(a)wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
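For the capture side of the serial-cable setup discussed above, a minimal sketch of a logger for the second machine might look like this. It assumes the crashing machine was booted with something like console=ttyS0,115200 on its kernel command line and that the cable arrives on /dev/ttyS0 of the logging machine; the device path, the baud rate and both function names are assumptions for the example, and the port setup is Unix-only.

```python
import os
import time


def open_serial(path="/dev/ttyS0", baud_name="B115200"):
    """Open a serial port read-only in raw 8-bit mode (Unix only).

    The path and baud rate are assumptions; adjust them to the real setup.
    """
    import termios
    import tty

    # Open non-blocking so the call does not hang waiting for a carrier signal.
    fd = os.open(path, os.O_RDONLY | os.O_NOCTTY | os.O_NONBLOCK)
    tty.setraw(fd)
    attrs = termios.tcgetattr(fd)
    attrs[2] |= termios.CLOCAL | termios.CREAD   # ignore modem lines, enable receiver
    attrs[4] = attrs[5] = getattr(termios, baud_name)  # input and output speed
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    os.set_blocking(fd, True)  # from here on, reads wait for data
    return os.fdopen(fd, "rb", buffering=0)


def log_console(source, sink, clock=time.time):
    """Copy lines from `source` to `sink`, prefixing each with a local timestamp.

    The sink is flushed after every line so that as much as possible is
    on disk when the other machine stops sending. Returns the line count.
    """
    count = 0
    for raw in iter(source.readline, b""):
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(clock()))
        sink.write(f"{stamp} {text}\n")
        sink.flush()
        count += 1
    return count
```

A typical use would be `log_console(open_serial(), open("panic.log", "a"))`, left running until the machine under test has crashed.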
From: Arno on 10 Mar 2010 08:29

In comp.sys.ibm.pc.hardware.storage Darren Salt <news(a)youmustbejoking.demon.cu.invalid> wrote:
> I demand that Arno may or may not have written...
>> In comp.sys.ibm.pc.hardware.storage Ant <ant(a)zimage.comant> wrote:
>>> On 3/7/2010 8:56 AM PT, Yousuf Khan typed:
>>>>>> HOWTO enable core-dumps - LinuxReviews
>>>>>> http://en.linuxreviews.org/HOWTO_enable_core-dumps
>>>>> Thanks. Isn't this for program crashes, not kernel panics? I wonder
>>>>> why it was removed because I used to see those core files from crashes.
>>>> You may want to ask in a Linux newsgroup for more details.
>>> I already am. ;)
>> You don't need to,
> What ? ask in a Linux newsgroup? ;-)
> (No, I'm not going to not post this to c.o.l.h.) ;-)
>> no disk access is possible after a kernel panic, hence no logging. The only
>> thing you can do, is to look at the screen or to enable the serial console
>> output and log that on another machine.
> I normally use netconsole for that.
> http://www.mjmwired.net/kernel/Documentation/networking/netconsole.txt

Nice, I was not aware of this. I should read the documentation I post more carefully; the hint was actually in the excerpt from kernel-parameters.txt I quoted.

I guess this will not work if the Ethernet chip driver causes the panic, though. But at least that would also identify the problematic component.

Arno
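netconsole sends kernel log messages as UDP datagrams to another host, so the receiving end can be very small. The following is a rough sketch of such a receiver, not a replacement for a proper syslog daemon. Port 6666 is assumed as the target port (check the netconsole documentation linked above against the actual configuration), and the function names are made up for the example.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket for incoming log datagrams.

    Port 6666 is an assumption; it must match the target port
    configured for netconsole on the sending machine.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive(sock, sink, max_packets=None):
    """Write each received datagram to `sink`, prefixed with the sender's address.

    Runs until `max_packets` datagrams have been handled, or forever
    if `max_packets` is None. The sink is flushed after every message.
    """
    seen = 0
    while max_packets is None or seen < max_packets:
        data, (addr, _port) = sock.recvfrom(65535)
        text = data.decode("utf-8", errors="replace").rstrip("\r\n")
        sink.write(f"{addr} {text}\n")
        sink.flush()
        seen += 1
    return seen
```

On the logging machine this could be run as `receive(open_listener(), open("netconsole.log", "a"))`. As noted above, nothing will arrive if the panic takes the network driver down with it.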
From: Ant on 10 Mar 2010 10:10

On 3/9/2010 9:07 AM PT, Rod Speed typed:
>>> On the other hand, the serial interface is simple, so console
>>> output, including error messages, will still be written to it.
>>> If you need that output, connect a different computer to
>>> the serial port, activate the serial console and capture
>>> its output. I have done this a number of times, mostly to
>>> try out experimental kernels on a cluster, but also to debug
>>> kernel panics.
>
>> Can I use my old serial external dial-up modem for this?
>
> Nope, you need a serial cable between the PCs.
>
> It would be a lot better if Linux allowed a dump to a USB stick if
> you are happy to risk the contents of the USB stick on a kernel panic.

Yes, I have no problems with a USB flash drive/stick. I can reformat. ;)
-- 
"I love ants. Do they have uncles? Ha Ha!" --Elmo from Sesame Street (unknown episode)
  /\___/\
 / /\ /\ \    Phil./Ant @ http://antfarm.ma.cx (Personal Web Site)
| |o   o| |   Ant's Quality Foraged Links: http://aqfl.net
   \ _ /      Nuke ANT from e-mail address: philpi(a)earthlink.netANT
    ( )       or ANTant(a)zimage.com
Ant is currently not listening to any songs on his home computer.
From: Vlad_Inhaler on 10 Mar 2010 13:03

On Mar 9, 7:14 pm, Arno <m...(a)privacy.net> wrote:
> In comp.sys.ibm.pc.hardware.storage Vlad_Inhaler <andrew.willi...(a)t-online.de> wrote:
>
> And Linux can do it. It just dumps to console instead of disk and
> this choice is reasonable because of data safety, albeit sometimes
> inconvenient in cheap setups. (Nothing against cheap setups, but
> they are a bit limited on the hardware side and that sometimes is
> inconvenient.)
>
> You are supposed to have more than one of these boxes in one place
> and then there is no issue. You can also use a number of
> serial-over-internet devices to record logs. Or a laptop with
> serial interface placed next to the offending machine. Or a modem.
> Or a serial data recorder, for example the Logomatic v2 Serial
> SD Datalogger (-> Google), which costs about 50 EUR.
>
> The cheapest solution is usually just a serial crossover cable to
> the next box in the rack that is under your control. Remember
> that this is a server OS we are talking about here, not an
> MS single-user-no-network OS that has over the course of time
> been heavily extended.
>
> Side note: With server PC hardware you get an IPMI console that
> also gives you the output, so the comparison with big iron is not
> fair. The serial console is the low-low-cost solution.
>
> I should also add that a "soft panic" (which is closest to a blue
> screen) typically dumps to /var/log/messages. It is only a hard panic
> that is limited to the console. A hard panic corresponds to a lockup
> without blue screen on Windows.
>
> Arno

Nah, the NT family was designed to be on a network from the very start. When you say 'single-user-no-network' you are talking about 3.1. Even the Win95/98/ME line was expecting to be hooked up, although the network support was just an add-on.

I will have to take the time next week to study this area (dumping over serial interfaces). Of course, then I need to be able to understand the dump :-(

Yousuf Khan's comment about how Solaris does it was very interesting. My day job is on mainframes (not IBM), and when you boot one of them, it always asks if you want a dump of the previous session. That would be rather annoying here, but it is a good starting point. Dumping after a previous crash landing would be useful, at least as an option that could be turned on in some way.