From: Chris Ridd on
On 2009-12-29 19:41:42 +0000, maps said:

>>
>> Exactly what were you trussing here?  The zcat process,
>> the diff process, or both?
>
> trussed the entire command line (i.e. zcat and diff)

In that case you only trussed the zcat :-) You need to do something
like this instead to truss both ends of the pipe:

truss -o /tmp/lhs zcat foo.Z | truss -o /tmp/rhs whatever command

>
>
>> I'd like to see a truss of the diff, but I don't
>> think that was it.
>
> diff, per se, works; it's only in this particular usage that it fails.
> So I am not sure if trussing diff itself would help.

The left hand side of the pipe is failing to write data, and the main
documented way that can happen is if there's something odd happening to
the process on the right hand side of the pipe. So you need to look
closely at that.
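
As a rough illustration of that failure mode (a sketch only, not your actual zcat/diff pipeline): if the reader on the right exits early, the writer on the left gets SIGPIPE, or EPIPE from write() if the signal is ignored. Here `yes` and `head` are stand-ins for the two sides, and `status_file` is just a scratch file I made up to carry the writer's exit status out of the pipeline.

```shell
#!/bin/sh
# Sketch: the left side of a pipe fails to write once the right side exits.
# status_file is a hypothetical scratch file, used only to report the
# writer's exit status from inside the pipeline.
status_file=$(mktemp)

# `yes` writes forever; `head -n 1` reads one line and exits,
# which closes the read end of the pipe.
{ yes; echo "$?" > "$status_file"; } | head -n 1 > /dev/null

writer_status=$(cat "$status_file")
rm -f "$status_file"

# Non-zero either way: usually 128 + SIGPIPE if the signal killed the
# writer, or an ordinary error exit if SIGPIPE was ignored and write()
# returned EPIPE instead.
echo "writer exit status: $writer_status"
```

So a write failure in the zcat truss would be consistent with whatever is on the right hand side exiting before it has read everything.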

You'll probably want to use truss's -d option (or a better timestamping
option, if there is one) on both invocations so you can correlate what's
happening on each side at any given point.
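
Something like the following could do the correlating. It is only a sketch: `merge_truss` is a helper name I made up, and it assumes each file was written by `truss -d -o FILE ...`, so that the first line is "Base time stamp: <seconds> [ ... ]" and each call line starts with an offset in seconds. It needs an awk that understands -v (on Solaris that probably means nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk).

```shell
#!/bin/sh
# merge_truss FILE... : print the trace lines from all files with absolute
# timestamps, tagged with the file name and sorted by time.
# Hypothetical helper; assumes `truss -d -o FILE` output as described above.
merge_truss() {
    for f in "$@"; do
        awk -v tag="$f" '
            # First line of each file carries the absolute start time.
            /^Base time stamp:/ { base = $4; next }
            # Keep only lines that begin with a numeric offset; wrapped
            # continuation lines are skipped. Clearing $1 rebuilds the
            # line, so runs of spaces collapse to one.
            $1 ~ /^[0-9]+\.[0-9]+$/ {
                ts = base + $1
                $1 = ""
                printf "%.4f %s:%s\n", ts, tag, $0
            }
        ' "$f"
    done | sort -n
}

# Usage, after running the two-sided truss with -d added:
#   merge_truss /tmp/lhs /tmp/rhs
```

The merged listing should make it easier to see which side did what first.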

--
Chris

From: maps on
OK, here's the truss output after running truss on both sides with the
-d option:

Base time stamp: 1262120381.9711 [ Tue Dec 29 14:59:41 CST 2009 ]
0.0000 execve("/usr/bin/diff", 0xFFBFFABC, 0xFFBFFACC) argc = 3
0.0041 resolvepath("/usr/lib/ld.so.1", "/usr/lib/ld.so.1", 1023) = 16
0.0044 resolvepath("/usr/bin/diff", "/usr/bin/diff", 1023) = 13
0.0048 stat("/usr/bin/diff", 0xFFBFF880) = 0
0.0048 open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
0.0052 stat("/opt/app/xxxxxx/ncr/tbuild/12.00.00.00/lib/libc.so.1", 0xFFBFF388) Err#2 ENOENT
0.0056 stat("/usr/lib/libc.so.1", 0xFFBFF388) = 0
0.0057 resolvepath("/usr/lib/libc.so.1", "/usr/lib/libc.so.1", 1023) = 18
0.0064 open("/usr/lib/libc.so.1", O_RDONLY) = 3
0.0067 mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFF3B0000
0.0072 mmap(0x00010000, 802816, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF280000
0.0076 mmap(0xFF280000, 703464, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF280000
0.0078 mmap(0xFF33C000, 24496, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 704512) = 0xFF33C000
0.0080 mmap(0xFF342000, 6720, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFF342000
0.0081 munmap(0xFF32C000, 65536) = 0
0.0086 memcntl(0xFF280000, 117696, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
0.0087 close(3) = 0
0.0089 stat("/opt/app/xxxxxx/ncr/tbuild/12.00.00.00/lib/libdl.so.1", 0xFFBFF388) Err#2 ENOENT
0.0094 stat("/usr/lib/libdl.so.1", 0xFFBFF388) = 0
0.0096 resolvepath("/usr/lib/libdl.so.1", "/usr/lib/libdl.so.1", 1023) = 19
0.0098 open("/usr/lib/libdl.so.1", O_RDONLY) = 3
0.0103 mmap(0xFF3B0000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3B0000
0.0107 mmap(0x00010000, 8192, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF3A0000
0.0111 mmap(0xFF3A0000, 2210, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3A0000
0.0115 close(3) = 0
0.0117 stat("/usr/platform/FJSV,GPUZC-L/lib/libc_psr.so.1", 0xFFBFF088) = 0
0.0122 resolvepath("/usr/platform/FJSV,GPUZC-L/lib/libc_psr.so.1", "/usr/platform/FJSV,GPUZC-M/lib/libc_psr.so.1", 1023) = 44
0.0128 open("/usr/platform/FJSV,GPUZC-L/lib/libc_psr.so.1", O_RDONLY) = 3
0.0133 mmap(0xFF3B0000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFF3B0000
0.0136 munmap(0xFF3B2000, 24576) = 0
0.0138 close(3) = 0
0.0140 mmap(0x00000000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFF390000
0.0148 getustack(0xFFBFF6C4)
0.0152 getrlimit(RLIMIT_STACK, 0xFFBFF6BC) = 0
0.0155 getcontext(0xFFBFF4F8)
0.0158 setustack(0xFF3439B4)
0.0165 issetugid() = 0
0.0167 brk(0x00028CD0) = 0
0.0168 brk(0x0002ACD0) = 0
0.0172 stat("/xxxxx/temp/20091221.src_file.csv.new", 0x00028BA0) = 0
0.0178 fstat(0, 0x00028C28) Err#79 EOVERFLOW
0.0188 fstat64(2, 0xFFBFE308) = 0
0.0192 write(2, " d i f f : ", 6) = 6
0.0196 open("/opt/app/xxxxx/ncr/tbuild/12.00.00.00/msg/SUNW_OST_OSLIB", O_RDONLY) Err#2 ENOENT
0.0199 open("/usr/lib/locale/C/LC_MESSAGES/SUNW_OST_OSLIB.mo", O_RDONLY) Err#2 ENOENT
0.0206 write(2, " s t d i n", 5) = 5
0.0208 write(2, " : ", 2) = 2
0.0212 write(2, " V a l u e t o o l a".., 37) = 37
0.0216 write(2, "\n", 1) = 1
0.0220 _exit(2)
= 0
0.0231 brk(0x000E8FA8) = 0
0.0237 fstat64(3, 0xFFBFE9E8) = 0
0.0240 ioctl(3, TCGETA, 0xFFBFEACC) Err#25 ENOTTY
0.0246 read(3, "1F9D90 CDEB48113C6 M1E16".., 8192) = 2206
0.0248 ioctl(1, TCGETA, 0xFFBFEA04) Err#22 EINVAL
0.0250 fstat64(1, 0xFFBFEA78) = 0
0.0252 brk(0x000E8FA8) = 0
0.0253 brk(0x000EAFA8) = 0
0.0254 fstat64(1, 0xFFBFE920) = 0
0.0258 read(3, 0x000E61CC, 8192) = 0
0.0259 write(1, " C o m p a n y , S t o r".., 3748) = 3748
0.0261 llseek(3, 0, SEEK_CUR) = 2206
0.0262 _exit(0)
From: maps on
0.0178 fstat(0, 0x00028C28) Err#79 EOVERFLOW

This is probably the root cause of the issue. But we had already guessed
that earlier, so how can the problem be resolved now?
From: Darren Dunham on
On Dec 29, 1:10 pm, maps <mapsiddi...(a)gmail.com> wrote:
> 0.0178 fstat(0, 0x00028C28)                            Err#79
> EOVERFLOW
>
> This is probably the root cause of the issue. But then we already had
> guessed it earlier; so how can this problem be resolved now ?

EOVERFLOW
The file size in bytes or the number of blocks allocated
to the file or the file serial number cannot be
represented correctly in the structure pointed to by
buf.


I wouldn't expect this failure on a pipe which doesn't have a size or
a serial number. I would expect it on a "large" file, but not how
you're using it.
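
For reference, "large" here means a size that does not fit in a signed 32-bit off_t, i.e. more than 2^31 - 1 = 2147483647 bytes. A quick way to check a given file might look like the sketch below. The function names are made up, and it assumes that `wc -c` on your system copes with large files and that the shell's `test` compares integers of this size correctly.

```shell
#!/bin/sh
# Largest size a signed 32-bit off_t can represent: 2^31 - 1 bytes.
OFF32_MAX=2147483647

# exceeds_off32 BYTES : succeed if BYTES does not fit in a 32-bit off_t.
# Hypothetical helper name.
exceeds_off32() {
    [ "$1" -gt "$OFF32_MAX" ]
}

# file_exceeds_off32 FILE : the same check against a file's size in bytes.
# Returns 2 if the size could not be read. Hypothetical helper name.
file_exceeds_off32() {
    size=$(wc -c < "$1" | tr -d ' ')
    [ -n "$size" ] || return 2
    exceeds_off32 "$size"
}

# Usage:
#   file_exceeds_off32 /path/to/file && echo "needs largefile support"
```

If the file involved turns out to be under that limit, the size is not what is overflowing and the pipe itself is the thing to look at.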

Given that this used to work and now doesn't in more than one case,
and that the error doesn't make sense to me, I wonder if something got
screwed up on the system. Seems very odd to me.

--
Darren
From: Chris Ridd on
On 2009-12-29 21:59:49 +0000, Darren Dunham said:

> On Dec 29, 1:10 pm, maps <mapsiddi...(a)gmail.com> wrote:
>> 0.0178 fstat(0, 0x00028C28)                           Err#79
>> EOVERFLOW
>>
>> This is probably the root cause of the issue. But then we already had
>> guessed it earlier; so how can this problem be resolved now ?
>
> EOVERFLOW
> The file size in bytes or the number of blocks allocated
> to the file or the file serial number cannot be
> represented correctly in the structure pointed to by
> buf.
>
>
> I wouldn't expect this failure on a pipe which doesn't have a size or
> a serial number. I would expect it on a "large" file, but not how
> you're using it.

It would appear in this truss that diff is opening the largefile
"/xxxxx/temp/20091221.src_file.csv.new".

> Given that this used to work and now doesn't in more than one case,
> and that the error doesn't make sense to me, I wonder if something got
> screwed up on the system. Seems very odd to me.

Me too. I'd repeat the trusses on the original pipe sequence which
didn't involve diff (IIRC).

--
Chris