From: Hongyi Zhao on
On Fri, 02 Apr 2010 08:33:51 -0500, Ed Morton <mortonspam(a)gmail.com>
wrote:

>You got your answer, so hopefully it's clear now that you don't ever want to
>direct your output to the same file you're reading. You can do this instead:
>
> cmd file > tmp && mv tmp file
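
A minimal sketch of why this matters, assuming a POSIX-style shell with awk and mktemp available; the scratch directory and the file name f are made up for illustration. The shell truncates the output file before the command starts reading it, so the in-place redirect leaves the file empty, while the temp-file form keeps the content.

```shell
# Work in a scratch directory (hypothetical names throughout).
dir=$(mktemp -d) && cd "$dir" || exit 1

printf 'a\nb\n' > f
# Unsafe: the shell opens f for writing (truncating it) before awk reads it.
awk '1' f > f
clobbered_size=$(wc -c < f | tr -d ' ')   # byte count after the bad redirect

printf 'a\nb\n' > f
# Safe: write to a temp file and replace the original only if awk succeeded.
awk '1' f > tmp && mv tmp f
safe_content=$(cat f)

printf 'in-place: %s bytes left\nvia tmp:\n%s\n' "$clobbered_size" "$safe_content"
```

The && matters here: if cmd fails, tmp is not moved over the original.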

Ed Morton and others here, thanks a lot for discussing this issue.
I've largely got it now.

>
>wrt your script, though, it'd more commonly be written using "next" than
>comparing line numbers twice, e.g.:
>
>awk 'NR==FNR{a[$0]++;next} !a[$0]' file1 file2 > tmp && mv tmp file2

If file1 is empty, the above code fails to work. How can I
overcome this issue?

BR.
--
..: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
From: Thomas 'PointedEars' Lahn on
Hongyi Zhao wrote:

> Ed Morton wrote:
>> wrt your script, though, it'd more commonly be written using "next" than
>> comparing line numbers twice, e.g.:
>>
>> awk 'NR==FNR{a[$0]++;next} !a[$0]' file1 file2 > tmp && mv tmp file2
>
> If file1 is empty, the above code fails to work. How can I
> overcome this issue?

[ -s file1 ] && ...
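
Spelled out, that guard might look like the sketch below. It assumes the goal is still to rewrite file2 in place via a temp file; when file1 is empty there is nothing to remove, so file2 is simply left untouched. The scratch directory and the sample contents are illustrative only, and mktemp is assumed to be available.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
: > file1                       # empty file1, the problem case
printf 'c\na\nb\n' > file2

# Filter only when file1 has content; otherwise file2 stays as it is.
if [ -s file1 ]; then
    awk 'NR==FNR{a[$0]++;next} !a[$0]' file1 file2 > tmp && mv tmp file2
fi

guarded=$(cat file2)
echo "$guarded"
```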

I would use diff | grep anyway. RTFM.


PointedEars
From: Ed Morton on
On 4/2/2010 9:08 AM, Hongyi Zhao wrote:
> On Fri, 02 Apr 2010 08:33:51 -0500, Ed Morton <mortonspam(a)gmail.com>
> wrote:
>
>> You got your answer, so hopefully it's clear now that you don't ever want to
>> direct your output to the same file you're reading. You can do this instead:
>>
>> cmd file > tmp && mv tmp file
>
> Ed Morton and others here, thanks a lot for discussing this issue.
> I've largely got it now.
>
>>
>> wrt your script, though, it'd more commonly be written using "next" than
>> comparing line numbers twice, e.g.:
>>
>> awk 'NR==FNR{a[$0]++;next} !a[$0]' file1 file2 > tmp && mv tmp file2
>
> If file1 is empty, the above code fails to work. How can I
> overcome this issue?

The most obvious way is probably:

awk 'FILENAME==ARGV[1]{a[$0]++;next} !a[$0]' file1 file2

Another way is:

awk 'NR==FNR{a[$0];next} {delete a[$0]} END{for (i in a) print i}' file2 file1

There are several other options...
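
To see why the NR==FNR form breaks and the FILENAME form does not, here is a small sketch with an empty file1. The scratch directory and sample data are assumptions for illustration, and mktemp is assumed to be available.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
: > file1                       # empty
printf 'c\na\nb\n' > file2

# file1 yields no records, so NR==FNR stays true while file2 is read:
# every line is stored in a[] and nothing is printed.
by_nr=$(awk 'NR==FNR{a[$0]++;next} !a[$0]' file1 file2)

# FILENAME==ARGV[1] is true only while file1 itself is being read, so
# the lines of file2 reach the !a[$0] test and are printed.
by_name=$(awk 'FILENAME==ARGV[1]{a[$0]++;next} !a[$0]' file1 file2)

printf 'NR==FNR:  [%s]\nFILENAME: [%s]\n' "$by_nr" "$by_name"
```

One caveat about the second alternative (the one ending in END{for (i in a) print i}): it prints each distinct line once, in whatever order awk walks the array, and it has the mirror-image problem if file2 is the empty one.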

Ed.
From: Ed Morton on
On 4/2/2010 9:25 AM, Thomas 'PointedEars' Lahn wrote:
> Hongyi Zhao wrote:
> I use the following code to obtain the lines existing in file2 but not in file1,
>
>> Ed Morton wrote:
>>>
>>> awk 'NR==FNR{a[$0]++;next} !a[$0]' file1 file2 > tmp && mv tmp file2
<snip>
>
> I would use diff | grep anyway. RTFM.
>

I'm curious - what would that solution look like given the input files below?

$ cat file1
a
c
$ cat file2
c
a
b
$ awk 'NR==FNR{a[$0]++;next} !a[$0]' file1 file2
b

Regards,

Ed.
From: Eric on
On 2010-04-03, Ed Morton <mortonspam(a)gmail.com> wrote:
> On 4/2/2010 9:25 AM, Thomas 'PointedEars' Lahn wrote:
>> Hongyi Zhao wrote:
>> I use the following code to obtain the lines existing in file2 but not in file1,
>>
>>> Ed Morton wrote:
>>>>
>>>> awk 'NR==FNR{a[$0]++;next} !a[$0]' file1 file2 > tmp && mv tmp file2
><snip>
>>
>> I would use diff | grep anyway. RTFM.
>>
>
> I'm curious - what would that solution look like given the input files below?
>
> $ cat file1
> a
> c
> $ cat file2
> c
> a
> b
> $ awk 'NR==FNR{a[$0]++;next} !a[$0]' file1 file2
> b
>
> Regards,
>
> Ed.

I've just reread the original question, and I guess it depends on why
you are doing it, but I would definitely consider

sort file1 > file1s
sort file2 > file2s
comm -13 file1s file2s

comm needs the files to be sorted, but maybe they are anyway.
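
Run against the sample files quoted above, the comm approach can be sketched as follows. The scratch directory is an assumption for illustration, and mktemp is assumed to be available.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'a\nc\n' > file1
printf 'c\na\nb\n' > file2

sort file1 > file1s
sort file2 > file2s
# -1 drops lines found only in file1s, -3 drops lines common to both,
# leaving just the lines that appear only in file2s.
only_in_file2=$(comm -13 file1s file2s)
echo "$only_in_file2"
```

Unlike the awk versions, this gives the result in sorted order rather than in file2's original order, and it works unchanged when file1 is empty.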

Eric