From: Ilan Benamara on 11 Mar 2010 17:25

Hello to all,

This is my first time posting.

My dataset has multiple lines per observation, and once I have deleted duplicate lines (duplicates on ALL variables), I would like to add a column to each observation saying how many variables were non-equal and hence kept the line from being excluded by the NODUP option. Here's an example of this scenario:

id  var1  var2  var3  var4  NumberDiff
1   same  Diff  same        1    --->Only var2 had same value for observation 1 on two lines
1___________________________
2               Diff  Diff  2    --->Only var3 and 4 had same value for observation 2 on two lines
2

I hope it is clear; let me know.

Thanks to all in advance
From: Sierra Information Services on 11 Mar 2010 18:04

Off the top of my (rapidly greying) head, I don't think you'll get what you want using PROC SORT.

If you use PROC FREQ with an OUT= option, you can create a data set that has one line per unique value of a variable and a count of how many observations have that value.

Here's an example:

/* start */
data mydata;
  do obs = 1 to 20;
    if 1 <= obs <= 3 then category = 'A';
    else if 4 <= obs <= 4 then category = 'B';
    else if 5 <= obs <= 12 then category = 'C';
    else if 13 <= obs <= 15 then category = 'D';
    else category = 'E';
    output;
  end;
run;

proc freq data=mydata;
  tables category / noprint out=new(drop=percent);
  title 'count number of times each value of category occurs in data set mydata';
run;

options nodate nonumber nocenter;
proc print data=new;
  title 'count number of times each value of category occurs in data set mydata';
run;
/* end */

Temporary data set NEW may have what you want. The variable COUNT, automatically created by PROC FREQ, shows how many times a particular value of CATEGORY is found in data set MYDATA.

I hope this helps.

Andrew Karp
Sierra Information Services
http://www.sierrainformation.com
From: Sierra Information Services on 11 Mar 2010 18:06

Well, I just re-read your post and am not sure if what I am offering is what you really need. Have you looked at PROC COMPARE, perhaps?

Sorry!

Andrew
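
For the question as originally posed (how many variables differ between the lines that remain for each id), a DATA step with arrays is one possible alternative to PROC COMPARE. The following is only a minimal sketch under stated assumptions: the data set names HAVE and WANT and the sample values are hypothetical, var1-var4 are assumed to be numeric, exact duplicate lines are assumed to have been removed already, and each id is assumed to have two remaining lines (with more than two, only the first and last line of the id are compared).

```sas
/* Hypothetical sample data: two non-identical lines per id */
data have;
  input id var1 var2 var3 var4;
  datalines;
1 10 20 30 40
1 10 25 30 40
2 5 6 7 8
2 5 6 9 1
;
run;

proc sort data=have;
  by id;
run;

data want(keep=id NumberDiff);
  set have;
  by id;
  array cur{4} var1-var4;
  /* _TEMPORARY_ array elements are retained across observations */
  array first_vals{4} _temporary_;

  /* Remember the values from the first line of each id */
  if first.id then do i = 1 to 4;
    first_vals{i} = cur{i};
  end;

  /* On the last line of the id, count the variables that differ */
  if last.id and not first.id then do;
    NumberDiff = 0;
    do i = 1 to 4;
      if cur{i} ne first_vals{i} then NumberDiff = NumberDiff + 1;
    end;
    output;
  end;
run;

proc print data=want noobs;
run;
```

With the sample values above, WANT holds one line per id, with NumberDiff = 1 for id 1 (only var2 differs) and NumberDiff = 2 for id 2 (var3 and var4 differ). To attach the count to every original line rather than one line per id, WANT could be merged back onto HAVE by id. Character variables would need a separate character array.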