duplicate records data into single row [SAS]

Prev: Partial Nesting in Proc Mixed - Follow Up
Next: Proc SQL 'Or' statement use

From: "Data _null_;" on 26 Jan 2010 22:07

What makes MAX the preferred value? Why is that relevant? You solve
a different problem that needs more specs.

On 1/26/10, Sigurd Hermansen <HERMANS1(a)westat.com> wrote:
> Clever. My only qualm would be uncertainty about results in slightly more complex situations. The SQL GROUP BY alternative removes some of that uncertainty when some columns have multiple non-missing values:
> data have;
> input (ID VI)(:$1.) Var1-Var3;
> cards;
> 1 1 . 23 .
> 1 1 21 . .
> 1 1 20 . 24
> ;;;;
> run;
> data flat;
> update have(obs=0) have;
> by id vi;
> run;
> proc print;
> run;
> proc sql;
> select distinct ID,VI,max(Var1) as Var1,max(Var2) as Var2,max(Var3) as Var3
> from have
> group by ID,VI
> ;
> quit;
>
> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of data _null_;
> Sent: Tuesday, January 26, 2010 12:08 PM
> To: SAS-L(a)LISTSERV.UGA.EDU
> Subject: Re: duplicate records data into single row
>
> On Jan 26, 11:03 am, Sravan Varma <clinicalprogrammer....(a)gmail.com>
> wrote:
> > Hi ,
> > I have a dataset with records like
> >
> > ID VI Var1 Var2 Var3
> > 1 1 . 23 .
> > 1 1 21 . .
> > 1 1 . .
> > 24
> >
> > I want this to be in one row something like
> >
> > ID VI Var1 Var2 Var3
> > 1 1 21 23 24
> >
> > Any suggestion on how to make this possible?
> >
> > Appreciate your time and help.
> >
> > Regards
> > Matt
> UPDATE statement....
>
> data have;
> input (ID VI)(:$1.) Var1-Var3;
> cards;
> 1 1 . 23 .
> 1 1 21 . .
> 1 1 . . 24
> ;;;;
> run;
> data flat;
> update have(obs=0) have;
> by id vi;
> run;
> proc print;
> run;
>

From: Ian Whitlock on 26 Jan 2010 22:45

Sig,

You are right to have qualms. The two solutions produce different
results in general. The SQL/max solution produces maximums. The
UPDATE produces the last non-missing values. VAR1 is 21 in the first
case and 20 in the latter. There is a third possibility - the OP
could have asked for 21 because it occurs first in the given order.
In this case both solutions would be wrong due to inadequate
specification. The there are other possibilities - perhaps the
ceiling of the average is wanted.

Examples, help to clarify specs, but rarely can they replace them
without an understanding reader. This is a good multiple choice
question for competency of the test maker.

Ian Whitlock
===============

Date: Tue, 26 Jan 2010 21:59:35 -0500
From: Sigurd Hermansen <HERMANS1(a)WESTAT.COM>
Subject: Re: duplicate records data into single row
Comments: To: "data _null_;" <datanull(a)GMAIL.COM>

Clever. My only qualm would be uncertainty about results in slightly
more
complex situations. The SQL GROUP BY alternative removes some of that
uncertainty when some columns have multiple non-missing values:

data have;
input (ID VI)(:$1.) Var1-Var3;
cards;
1 1 . 23 .
1 1 21 . .
1 1 20 . 24
;;;;
run;
data flat;
update have(obs=0) have;
by id vi;
run;
proc print;
run;
proc sql;
select distinct ID,VI,max(Var1) as Var1,max(Var2) as Var2,
max(Var3) as Var3
from have
group by ID,VI
;
quit;

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L] On Behalf Of data _null_;
Sent: Tuesday, January 26, 2010 12:08 PM
Subject: Re: duplicate records data into single row

On Jan 26, 11:03 am, Sravan Varma <clinicalprogrammer....(a)gmail.com>
wrote:
> Hi ,
> I have a dataset with records like
>
> ID VI Var1 Var2 Var3
> 1 1 . 23 .
> 1 1 21 . .
> 1 1 . .
> 24
>
> I want this to be in one row something like
>
> ID VI Var1 Var2 Var3
> 1 1 21 23 24
>
> Any suggestion on how to make this possible?
>
> Appreciate your time and help.
>
> Regards
> Matt
UPDATE statement....

data have;
input (ID VI)(:$1.) Var1-Var3;
cards;
1 1 . 23 .
1 1 21 . .
1 1 . . 24
;;;;
run;
data flat;
update have(obs=0) have;
by id vi;
run;
proc print;
run;

From: Sigurd Hermansen on 26 Jan 2010 23:16

The max() solution simply makes the solution predictable. It illustrates how a programmer might develop a solution and inquire Whether to specify a maximum, minimum, mean, or other preference. I suspect the person who posted the question isn't really asking how to collapse a specific three row table into a one row table (why not use Excel), but instead is looking for a more general SAS solution.

Ian rightly insists on a better statement of requirements. In many situations programmers have to encourage clients to look beyond data in hand.
S

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Data _null_;
Sent: Tuesday, January 26, 2010 10:07 PM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Re: duplicate records data into single row

What makes MAX the preferred value? Why is that relevant? You solve
a different problem that needs more specs.

On 1/26/10, Sigurd Hermansen <HERMANS1(a)westat.com> wrote:
> Clever. My only qualm would be uncertainty about results in slightly more complex situations. The SQL GROUP BY alternative removes some of that uncertainty when some columns have multiple non-missing values:
> data have;
> input (ID VI)(:$1.) Var1-Var3;
> cards;
> 1 1 . 23 .
> 1 1 21 . .
> 1 1 20 . 24
> ;;;;
> run;
> data flat;
> update have(obs=0) have;
> by id vi;
> run;
> proc print;
> run;
> proc sql;
> select distinct ID,VI,max(Var1) as Var1,max(Var2) as Var2,max(Var3) as Var3
> from have
> group by ID,VI
> ;
> quit;
>
> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of data _null_;
> Sent: Tuesday, January 26, 2010 12:08 PM
> To: SAS-L(a)LISTSERV.UGA.EDU
> Subject: Re: duplicate records data into single row
>
> On Jan 26, 11:03 am, Sravan Varma <clinicalprogrammer....(a)gmail.com>
> wrote:
> > Hi ,
> > I have a dataset with records like
> >
> > ID VI Var1 Var2 Var3
> > 1 1 . 23 .
> > 1 1 21 . .
> > 1 1 . .
> > 24
> >
> > I want this to be in one row something like
> >
> > ID VI Var1 Var2 Var3
> > 1 1 21 23 24
> >
> > Any suggestion on how to make this possible?
> >
> > Appreciate your time and help.
> >
> > Regards
> > Matt
> UPDATE statement....
>
> data have;
> input (ID VI)(:$1.) Var1-Var3;
> cards;
> 1 1 . 23 .
> 1 1 21 . .
> 1 1 . . 24
> ;;;;
> run;
> data flat;
> update have(obs=0) have;
> by id vi;
> run;
> proc print;
> run;
>

First | Prev |
Pages: 1 2
Prev: Partial Nesting in Proc Mixed - Follow Up
Next: Proc SQL 'Or' statement use