From: Richard B. Gilbert on 30 Dec 2009 16:28

maps wrote:
>> Do I assume that 0.0178 is a line number rather than a part of the fstat
>> call? Where did the value 0x00028C28 come from? I assume it's a
>> pointer to something, but what?
>
> That's the timestamp from the truss output; it wasn't part of the fstat
> call.
>
>> I really don't want to try to go back to the beginning of this thread in
>> order to make sense of your post. Try posting a "reproducer"; e.g.
>> reproduce the error with fewer than, say, fifteen lines of code.
>
> Well, this error didn't come up while executing any code. It started
> appearing all of a sudden on one of our production servers whenever we
> used a pipe ( | ). In the specific case I quoted above, it occurred in
> the following manner:
>
> zcat foo.txt.Z | diff foo.txt -
> diff: stdin: value too large for defined data type.
>
> -maps.

What has changed since it last worked? O/S upgrades? Patches installed? Different hardware platform?

If you don't use "change control", problems like this are the reason why you should! I thought change control was a PITA when my employers first introduced it, but I've seen the advantages. Briefly: before making any change to the hardware, firmware, software, or operating procedures, you document exactly what you are going to do and how you plan to back out the change if it causes problems.
From: maps on 30 Dec 2009 16:37

> What has changed since it last worked? O/S upgrades? Patches
> installed? Different hardware platform?

None to my knowledge.

> If you don't use "change control", problems like this are the reason why
> you should! I thought change control was a PITA when my employers first
> introduced it but I've seen the advantages. Briefly: before making any
> change to the hardware, firmware, software, or operating procedures, you
> document exactly what you are going to do and how you plan to back out
> the change if it causes problems.

Oh, we are quite sound in this respect. Trust me, we have so many processes that they do become a PITA (good abbreviation, by the way, lol), and I really mean it.

Coming back to the problem: I was wondering where fstat gets invoked from. Is it present in libc.so? One of our admins suggested that this might be due to a 64-bit library getting replaced by a 32-bit one. But the last-modification timestamps on all of the files under suspicion look far too old to suggest that possibility.

-maps.
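
One way to test the admin's 32-bit versus 64-bit theory without relying on timestamps is to read the ELF header of each suspect library or binary: byte 4 of an ELF file (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit ones. The following is a minimal Python sketch under that assumption. The function name `elf_class` is illustrative and does not come from the thread, and the sketch only inspects the header; it does not show which library a running program actually loaded.

```python
ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: 32, 2: 64}  # EI_CLASS values: ELFCLASS32, ELFCLASS64


def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if the file is not ELF."""
    with open(path, "rb") as f:
        ident = f.read(5)  # 4 magic bytes followed by the EI_CLASS byte
    if len(ident) < 5 or ident[:4] != ELF_MAGIC:
        return None
    # Indexing bytes yields an int; unknown class values map to None.
    return ELF_CLASSES.get(ident[4])
```

Calling `elf_class()` on the diff binary and on the libc it links against (paths vary by system) would show whether a 32-bit program is involved, which is the case where a "value too large" error from fstat is plausible.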
From: Richard B. Gilbert on 30 Dec 2009 16:50

maps wrote:
>> What has changed since it last worked? O/S upgrades? Patches
>> installed? Different hardware platform?
>
> None to my knowledge.
>
>> If you don't use "change control", problems like this are the reason why
>> you should! I thought change control was a PITA when my employers first
>> introduced it but I've seen the advantages. Briefly: before making any
>> change to the hardware, firmware, software, or operating procedures, you
>> document exactly what you are going to do and how you plan to back out
>> the change if it causes problems.
>
> Oh we are quite sound in this aspect. Trust me, we have so many
> processes that they do become PITA (good abbrn by the way, lol) and I
> really mean it.
>
> Coming back to the problem; I was wondering where does fstat get

Can't help you there! The program that was executing at the time of the failure is the guilty program. If it's part of the O/S, you need to talk to Sun about it. If it's something you bought from a third party, you need to talk to the vendor. If it's home-made, you need to get a "loader map" for the program to be able to pin down exactly what's going on.
From: Chris Ridd on 31 Dec 2009 02:28

On 2009-12-30 19:18:52 +0000, maps said:
>>
>> Get another system. Try the same commands there. If it works,
>> something on your current system is screwed up. Consider
>> reinstalling.
>
> Works on other systems. Interestingly, on one of the other systems fstat
> with the same parameters works. I tried this on another server running
> Solaris 10 and the command runs just fine.

Are you using exactly the same input files on each system? Since it looks like one file is now larger than a 32-bit size can represent on the problem system, you need to keep all the input the same when you're testing.

-- 
Chris
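
To compare input files between the working and failing systems, it may help to print each file's size next to the largest value a signed 32-bit file offset can hold, which is 2^31 - 1 bytes. This is a rough Python sketch; the helper names `fits_32bit_off_t` and `report_size` are made up for illustration, and it assumes the failing program was built without large-file support, which the thread does not confirm.

```python
import os

MAX_OFF32 = 2**31 - 1  # largest size a signed 32-bit off_t can hold


def fits_32bit_off_t(size):
    """True if a byte count is representable in a signed 32-bit off_t."""
    return 0 <= size <= MAX_OFF32


def report_size(path):
    """Return (size_in_bytes, fits) for a file, based on its stat() size."""
    size = os.stat(path).st_size
    return size, fits_32bit_off_t(size)
```

Running `report_size()` on the same file names on both systems shows quickly whether the inputs really match and whether any of them has crossed the 2 GiB boundary.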
From: Casper H.S. Dik on 31 Dec 2009 11:17

maps <mapsiddiqui(a)gmail.com> writes:

>> Do I assume that 0.0178 is a line number rather than a part of the fstat
>> call? Where did the value 0x00028C28 come from? I assume it's a
>> pointer to something, but what?

>thats the timestamp from truss output; it wasnt a part of the fstat
>call.

>> I really don't want to try to go back to the beginning of this thread in
>> order to make sense of your post. Try posting a "reproducer"; e.g.
>> reproduce the error with fewer than, say, fifteen lines of code.

>Well this error didnt come up while executing a code. It started
>appearing all of a sudden on one of our production servers whenever we
>used a pipe ( | ). In the specific case I quoted above, it occurred in
>the following manner :

>zcat foo.txt.Z | diff foo.txt -
>diff: stdin: value too large for defined data type.

Since a pipe is a file with a small number of bytes, the only possible issue is the dev number of the pipe.

Is this a 64-bit system, and has it been up for quite some time?

Casper
-- 
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
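
Casper's reasoning can be checked roughly by calling fstat on a pipe and looking at the fields it returns: the size is tiny, so the device number is the remaining candidate for a value that does not fit. The Python sketch below only displays those fields. The function name `pipe_stat_summary` is hypothetical, and the plain unsigned 32-bit range check is a simplifying assumption; a real 32-bit dev_t may pack major and minor numbers in a way this does not model, and the interpreter's own fstat call will not fail the way a 32-bit diff might.

```python
import os
import stat

UINT32_MAX = 2**32 - 1


def pipe_stat_summary():
    """fstat() the read end of a fresh pipe and summarize the fields of interest."""
    read_fd, write_fd = os.pipe()
    try:
        st = os.fstat(read_fd)
    finally:
        # Close both ends whether or not fstat succeeded.
        os.close(read_fd)
        os.close(write_fd)
    return {
        "is_fifo": stat.S_ISFIFO(st.st_mode),
        "st_size": st.st_size,
        "st_dev": st.st_dev,
        # Simplified check (assumption): does the raw value fit in 32 bits?
        "st_dev_fits_32": 0 <= st.st_dev <= UINT32_MAX,
    }
```

Printing the result of `pipe_stat_summary()` on the problem server and on a healthy one would show whether the pipe's device number differs between them.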