From: greg6363 on 26 Apr 2010 14:03

I know how to remove duplicates from a file by using the following code:

proc sort data=dataset1 out=dataset2 nodupkey;
  by AccountNumber;
run;

But now I have a situation where I have duplicate records, and I want to keep the last record of each set of duplicates according to a particular variable. I can't seem to figure out how to code it. Has anyone run across this situation before? Any assistance would be greatly appreciated. Thanks.
From: Reeza on 26 Apr 2010 14:22

On Apr 26, 11:03 am, greg6363 <gregtlaugh...(a)gmail.com> wrote:
> I know how to remove duplicates from a file by using the following
> code:
>
> proc sort data=dataset1 out=dataset2 nodupkey;
> by AccountNumber;
> run;
>
> But now, I have a situation where I have duplicate records but I want
> to keep the last record of the duplicate according to a particular
> variable. I can't seem to figure out how to code it. Anyone run
> across this situation before? Any assistance would be greatly
> appreciated. Thanks.

Can you sort so that the record you want to keep comes first? Then do the sort with no duprec, i.e. by the accountnumber field (descending)? I don't recall at the moment whether the descending goes before or after the variable name.
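For reference, in a SAS BY statement the DESCENDING keyword goes before the variable it applies to. A minimal sketch of the two-sort idea suggested here, assuming a hypothetical variable DateVar decides which record counts as "last" and a hypothetical intermediate dataset named sorted:

```sas
/* Hypothetical names: DateVar (the variable that defines "last") and   */
/* sorted (an intermediate dataset). Neither appears in the question.   */

/* Step 1: within each account, put the record to keep first. */
proc sort data=dataset1 out=sorted;
  by AccountNumber descending DateVar;
run;

/* Step 2: NODUPKEY keeps the first record of each BY group.            */
/* EQUALS (the default) preserves the relative order from step 1.       */
proc sort data=sorted out=dataset2 nodupkey equals;
  by AccountNumber;
run;
```

This relies on the second sort keeping the order established by the first, which is why EQUALS is spelled out rather than left implicit.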
From: Jim Groeneveld on 29 Apr 2010 06:16

Hi Greg,

proc sort data=dataset1 out=dataset2;
  by AccountNumber;
run;

DATA Dataset2;
  SET Dataset2;
  BY AccountNumber;
  IF (LAST.AccountNumber);
RUN;

On the other hand: what is the "last" record? How would PROC SORT sort? If you have some date or time variable as well, you should also use it when sorting:

BY AccountNumber DateVar TimeVar;

This forces the chronologically last record to be kept.

Regards - Jim.
--
Jim Groeneveld, Netherlands
Statistician, SAS consultant
http://jim.groeneveld.eu.tf

greg6363 <gregtlaughlin(a)gmail.com> wrote:
>I know how to remove duplicates from a file by using the following
>code:
>
>proc sort data=dataset1 out=dataset2 nodupkey;
>by AccountNumber;
>run;
>
>But now, I have a situation where I have duplicate records but I want
>to keep the last record of the duplicate according to a particular
>variable. I can't seem to figure out how to code it. Anyone run
>across this situation before? Any assistance would be greatly
>appreciated. Thanks.
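Putting the two parts of this reply together might look like the sketch below. DateVar and TimeVar are the placeholder names used in the reply; whether the real data has such variables is an assumption.

```sas
/* Sort by account, then chronologically within each account.          */
/* DateVar and TimeVar are placeholders for whatever defines "last".   */
proc sort data=dataset1 out=dataset2;
  by AccountNumber DateVar TimeVar;
run;

/* Keep only the final record of each account's BY group. */
data dataset2;
  set dataset2;
  by AccountNumber;
  if last.AccountNumber;
run;
```

The DATA step only needs BY AccountNumber, because data sorted by AccountNumber DateVar TimeVar is also in AccountNumber order.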
From: Dav Vandenbroucke on 29 Apr 2010 17:17

On Mon, 26 Apr 2010 11:03:35 -0700 (PDT), greg6363 <gregtlaughlin(a)gmail.com> wrote:
>But now, I have a situation where I have duplicate records but I want
>to keep the last record of the duplicate according to a particular
>variable.

Sort the dataset in an order that will put the duplicate records you want to keep last. Then do something like:

DATA want;
  SET have;
  BY sortVar;
  IF LAST.sortVar AND NOT FIRST.sortVar;
RUN;

That will keep the records only if they are duplicates.

Dav Vandenbroucke
davanden at cox dot net
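To make the difference between the two subsetting conditions in this thread concrete, here is a small sketch on made-up data. The dataset names last_of_all and last_of_dups and the sample values are invented for illustration.

```sas
/* Made-up sample data, already in sortVar order. */
data have;
  input sortVar $ value;
  datalines;
A 1
A 2
B 3
C 4
C 5
;
run;

/* Last record of every BY group: keeps A 2, B 3, C 5. */
data last_of_all;
  set have;
  by sortVar;
  if last.sortVar;
run;

/* Last record only where the group has more than one row: keeps A 2, C 5. */
data last_of_dups;
  set have;
  by sortVar;
  if last.sortVar and not first.sortVar;
run;
```

A group with a single row has FIRST. and LAST. both set on that row, so the second condition drops it. Which behavior is wanted depends on whether unique records should survive.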