From: Ya Huang on 8 Mar 2010 14:10 So you want to set the flag to missing is SEROTYPE is missing? If so, a little change would do it: data ALLFILLED; input ID SAM SEROTYPE_1 SEROTYPE_9V; cards; 1 0.2 8.22 36.1 1 0.3 1.81 22.1 1 1.0 .41 4.8 1 2.0 . 3.2 1 3.0 .30 2.1 1 4.0 .23 2.0 1 5.0 .10 1.9 1 6.0 .03 0.9 2 0.2 38.1 100.3 2 0.3 25.81 95.2 2 1.0 22.3 66.2 2 2.0 21.5 54.2 2 3.0 18.3 49.5 2 4.0 6.2 31.1 2 5.0 6.1 23.3 2 6.0 3.1 22 ; RUN; proc sql; select *, max(case when SAM=3 AND SEROTYPE_1>.35 THEN 1 when SAM=3 AND SEROTYPE_1<.35 THEN 0 else . end) + serotype_1 - serotype_1 as CORR3_SER1 from ALLFILLED group by id order by id,sam ; ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER1 --------------------------------------------------- 1 0.2 8.22 36.1 0 1 0.3 1.81 22.1 0 1 1 0.41 4.8 0 1 2 . 3.2 . 1 3 0.3 2.1 0 1 4 0.23 2 0 1 5 0.1 1.9 0 1 6 0.03 0.9 0 2 0.2 38.1 100.3 1 2 0.3 25.81 95.2 1 2 1 22.3 66.2 1 2 2 21.5 54.2 1 2 3 18.3 49.5 1 2 4 6.2 31.1 1 2 5 6.1 23.3 1 2 6 3.1 22 1 Note, I change obs=4 SEROTYPE_=. for testing. On Mon, 8 Mar 2010 13:58:22 -0500, Katy Seib <uralva(a)GMAIL.COM> wrote: >Very close, when I try to add a line to keep missing data =missing, I get >the same results (CORR3_SER1=0 INSTEAD OF .). Lot to learn about sql. I >added a third person below to simulate missing data (which there is plenty >of). I'm sure it's a syntax problem on my end..... > >data ALLFILLED; >input ID SAM SEROTYPE_1 SEROTYPE_9V; >cards; >1 0.2 8.22 36.1 >1 0.3 1.81 22.1 >1 1.0 .41 4.8 >1 2.0 .33 3.2 >1 3.0 .30 2.1 >1 4.0 .23 2.0 >1 5.0 .10 1.9 >1 6.0 .03 0.9 >2 0.2 38.1 100.3 >2 0.3 25.81 95.2 >2 1.0 22.3 66.2 >2 2.0 21.5 54.2 >2 3.0 18.3 49.5 >2 4.0 6.2 31.1 >2 5.0 6.1 23.3 >2 6.0 3.1 22 >3 0.2 38.1 100.3 >3 0.3 25.81 95.2 >3 1.0 22.3 66.2 >3 2.0 21.5 54.2 >3 3.0 . . >3 4.0 6.2 31.1 >3 5.0 6.1 23.3 >3 6.0 3.1 22 >; >RUN; >proc sql; >select *, max(case when SAM=3 AND SEROTYPE_1>.35 THEN 1 > when SAM=3 AND SEROTYPE_1<.35 THEN 0 > when sam=3 and serotype_1=. THEN . > else . end) as CORR3_SER1 >from ALLFILLED >group by id >order by id,sam >; > > > >On Mon, Mar 8, 2010 at 12:14 PM, Katy Seib <uralva(a)gmail.com> wrote: > >> **Intro >> First, thanks To Ron for posting the SAS-L wiki this morning- very helpful! >> I too am relatively new to posting at the listserv but have enjoyed >> reading >> for a while. I'm a guest researcher at CDC and a grad student finishing up >> my thesis which is a longitudinal study - much of which is beyond the scope >> of my formal SAS education - I've had fun finding resources like this, UCLA >> and various SAS tips across the internet. I think my question today is >> relatively simple but a problem for me nonetheless. >> **Problem >> I have been calculating variables and have a LONG but effective algorithm >> for "filling in" variables that are constant across an observation. It's >> tedious and I still worry about errors. I know there must be a simpler way >> to fill in this information AS it's calculated. Below I have included an >> example and sample data. The SAM variable is a crude time variable and in >> reality there are 23 serotypes. The data step "correlate" gives me the >> yes/no answer for sam=3 only but I need it for all sam (the question being >> answered by calculating this variable is: "did this infant have a antibody >> titer greater than .35 at 10 weeks (sam=3)?". Here I've done it for >> Serotype_1 but will have to do it for all serotypes. I'm not pasting my >> fill >> in code because it's long and clumsy requiring 6 data steps. This is not a >> first occurrence var, so I'm having trouble applying previous similar >> solutions. Thanks all. >> >> data ALLFILLED; >> input ID SAM SEROTYPE_1 SEROTYPE_9V; >> cards; >> 1 0.2 8.22 36.1 >> 1 0.3 1.81 22.1 >> 1 1.0 .41 4.8 >> 1 2.0 .33 3.2 >> 1 3.0 .30 2.1 >> 1 4.0 .23 2.0 >> 1 5.0 .10 1.9 >> 1 6.0 .03 0.9 >> 2 0.2 38.1 100.3 >> 2 0.3 25.81 95.2 >> 2 1.0 22.3 66.2 >> 2 2.0 21.5 54.2 >> 2 3.0 18.3 49.5 >> 2 4.0 6.2 31.1 >> 2 5.0 6.1 23.3 >> 2 6.0 3.1 22 >> ; >> RUN; >> >> PROC PRINT DATA=ALLFILLED; RUN; >> >> DATA CORRELATE; >> SET ALLFILLED; >> IF SAM=3 AND SEROTYPE_1=. THEN CORR3_SER1=.; >> ELSE IF SAM=3 AND SEROTYPE_1>.35 THEN CORR3_SER1=1; >> ELSE IF SAM=3 AND SEROTYPE_1<.35 THEN CORR3_SER1=0; >> BY ID; >> RUN; >> >> PROC PRINT DATA=CORRELATE; RUN; >>
From: Katy Seib on 8 Mar 2010 14:27 Oops, meant to include sas-l.... > Ok this code gives me this output for id=3D3 (others are correct) (i'd tw= eak > it myself if I had the vaguest clue about sql): > > > > ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER1 > =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92= =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6= =92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92= =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6= =92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92 > > 3 0.2 38.1 100.3 0 > 3 0.3 25.81 95.2 0 > 3 1 22.3 66.2 0 > 3 2 21.5 54.2 0 > 3 3 . . . > 3 4 6.2 31.1 0 > 3 5 6.1 23.3 0 > 3 6 3.1 22 0 > > when it should be > > > ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER1 > =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92= =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6= =92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92= =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6= =92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92 > > 3 0.2 38.1 100.3 . > 3 0.3 25.81 95.2 . > 3 1 22.3 66.2 . > 3 2 21.5 54.2 . > 3 3 . > . . > 3 4 6.2 31.1 . > 3 5 6.1 23.3 . > 3 6 3.1 22 . > > > > > On Mon, Mar 8, 2010 at 2:10 PM, Ya Huang <ya.huang(a)amylin.com> wrote: > >> So you want to set the flag to missing is SEROTYPE is missing? >> If so, a little change would do it: >> >> data ALLFILLED; >> input ID SAM SEROTYPE_1 SEROTYPE_9V; >> cards; >> 1 0.2 8.22 36.1 >> 1 0.3 1.81 22.1 >> 1 1.0 .41 4.8 >> 1 2.0 . 3.2 >> 1 3.0 .30 2.1 >> 1 4.0 .23 2.0 >> 1 5.0 .10 1.9 >> 1 6.0 .03 0.9 >> 2 0.2 38.1 100.3 >> 2 0.3 25.81 95.2 >> 2 1.0 22.3 66.2 >> 2 2.0 21.5 54.2 >> 2 3.0 18.3 49.5 >> 2 4.0 6.2 31.1 >> 2 5.0 6.1 23.3 >> 2 6.0 3.1 22 >> ; >> RUN; >> >> >> proc sql; >> select *, max(case when SAM=3D3 AND SEROTYPE_1>.35 THEN 1 >> when SAM=3D3 AND SEROTYPE_1<.35 THEN 0 >> else . end) + serotype_1 - serotype_1 as >> CORR3_SER1 >> from ALLFILLED >> group by id >> order by id,sam >> ; >> >> ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER1 >> --------------------------------------------------- >> 1 0.2 8.22 36.1 0 >> 1 0.3 1.81 22.1 0 >> 1 1 0.41 4.8 0 >> 1 2 . 3.2 . >> 1 3 0.3 2.1 0 >> 1 4 0.23 2 0 >> 1 5 0.1 1.9 0 >> 1 6 0.03 0.9 0 >> 2 0.2 38.1 100.3 1 >> 2 0.3 25.81 95.2 1 >> 2 1 22.3 66.2 1 >> 2 2 21.5 54.2 1 >> 2 3 18.3 49.5 1 >> 2 4 6.2 31.1 1 >> 2 5 6.1 23.3 1 >> 2 6 3.1 22 1 >> >> Note, I change obs=3D4 SEROTYPE_=3D. for testing. >> >> >> >> On Mon, 8 Mar 2010 13:58:22 -0500, Katy Seib <uralva(a)GMAIL.COM> wrote: >> >> >Very close, when I try to add a line to keep missing data =3Dmissing, I= get >> >the same results (CORR3_SER1=3D0 INSTEAD OF .). Lot to learn about sql= .. I >> >added a third person below to simulate missing data (which there is >> plenty >> >of). I'm sure it's a syntax problem on my end..... >> > >> >data ALLFILLED; >> >input ID SAM SEROTYPE_1 SEROTYPE_9V; >> >cards; >> >1 0.2 8.22 36.1 >> >1 0.3 1.81 22.1 >> >1 1.0 .41 4.8 >> >1 2.0 .33 3.2 >> >1 3.0 .30 2.1 >> >1 4.0 .23 2.0 >> >1 5.0 .10 1.9 >> >1 6.0 .03 0.9 >> >2 0.2 38.1 100.3 >> >2 0.3 25.81 95.2 >> >2 1.0 22.3 66.2 >> >2 2.0 21.5 54.2 >> >2 3.0 18.3 49.5 >> >2 4.0 6.2 31.1 >> >2 5.0 6.1 23.3 >> >2 6.0 3.1 22 >> >3 0.2 38.1 100.3 >> >3 0.3 25.81 95.2 >> >3 1.0 22.3 66.2 >> >3 2.0 21.5 54.2 >> >3 3.0 . . >> >3 4.0 6.2 31.1 >> >3 5.0 6.1 23.3 >> >3 6.0 3.1 22 >> >; >> >RUN; >> >proc sql; >> >select *, max(case when SAM=3D3 AND SEROTYPE_1>.35 THEN 1 >> > when SAM=3D3 AND SEROTYPE_1<.35 THEN 0 >> > when sam=3D3 and serotype_1=3D. THEN . >> > else . end) as CORR3_SER1 >> >from ALLFILLED >> >group by id >> >order by id,sam >> >; >> > >> > >> > >> >On Mon, Mar 8, 2010 at 12:14 PM, Katy Seib <uralva(a)gmail.com> wrote: >> > >> >> **Intro >> >> First, thanks To Ron for posting the SAS-L wiki this morning- very >> helpful! >> >> I too am relatively new to posting at the listserv but have enjoyed >> >> reading >> >> for a while. I'm a guest researcher at CDC and a grad student >> finishing >> up >> >> my thesis which is a longitudinal study - much of which is beyond the >> scope >> >> of my formal SAS education - I've had fun finding resources like this= , >> UCLA >> >> and various SAS tips across the internet. I think my question today = is >> >> relatively simple but a problem for me nonetheless. >> >> **Problem >> >> I have been calculating variables and have a LONG but effective >> algorithm >> >> for "filling in" variables that are constant across an observation. >> It's >> >> tedious and I still worry about errors. I know there must be a simpl= er >> way >> >> to fill in this information AS it's calculated. Below I have included >> an >> >> example and sample data. The SAM variable is a crude time variable a= nd >> in >> >> reality there are 23 serotypes. The data step "correlate" gives me th= e >> >> yes/no answer for sam=3D3 only but I need it for all sam (the questio= n >> being >> >> answered by calculating this variable is: "did this infant have a >> antibody >> >> titer greater than .35 at 10 weeks (sam=3D3)?". Here I've done it for >> >> Serotype_1 but will have to do it for all serotypes. I'm not pasting = my >> >> fill >> >> in code because it's long and clumsy requiring 6 data steps. This is >> not >> a >> >> first occurrence var, so I'm having trouble applying previous similar >> >> solutions. Thanks all. >> >> >> >> data ALLFILLED; >> >> input ID SAM SEROTYPE_1 SEROTYPE_9V; >> >> cards; >> >> 1 0.2 8.22 36.1 >> >> 1 0.3 1.81 22.1 >> >> 1 1.0 .41 4.8 >> >> 1 2.0 .33 3.2 >> >> 1 3.0 .30 2.1 >> >> 1 4.0 .23 2.0 >> >> 1 5.0 .10 1.9 >> >> 1 6.0 .03 0.9 >> >> 2 0.2 38.1 100.3 >> >> 2 0.3 25.81 95.2 >> >> 2 1.0 22.3 66.2 >> >> 2 2.0 21.5 54.2 >> >> 2 3.0 18.3 49.5 >> >> 2 4.0 6.2 31.1 >> >> 2 5.0 6.1 23.3 >> >> 2 6.0 3.1 22 >> >> ; >> >> RUN; >> >> >> >> PROC PRINT DATA=3DALLFILLED; RUN; >> >> >> >> DATA CORRELATE; >> >> SET ALLFILLED; >> >> IF SAM=3D3 AND SEROTYPE_1=3D. THEN CORR3_SER1=3D.; >> >> ELSE IF SAM=3D3 AND SEROTYPE_1>.35 THEN CORR3_SER1=3D1; >> >> ELSE IF SAM=3D3 AND SEROTYPE_1<.35 THEN CORR3_SER1=3D0; >> >> BY ID; >> >> RUN; >> >> >> >> PROC PRINT DATA=3DCORRELATE; RUN; >> >> >> > >
From: Ya Huang on 8 Mar 2010 14:57 This one seems to work: proc sql; select *, max(case when SAM=3 AND SEROTYPE_1>.35 THEN 1 when SAM=3 AND SEROTYPE_1<.35 THEN 0 else . end) + max(case when SAM=3 AND ^missing(SEROTYPE_1) THEN 0 else . end) - max(case when SAM=3 AND ^missing(SEROTYPE_1) THEN 0 else . end) as CORR3_SER1 from ALLFILLED group by id order by id,sam ; ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER1 ---------------------------------------------------- 1 0.2 8.22 36.1 . 1 0.3 1.81 22.1 . 1 1 0.41 4.8 . 1 2 0.35 3.2 . 1 3 . 2.1 . 1 4 0.23 2 . 1 5 0.1 1.9 . 1 6 0.03 0.9 . 2 0.2 38.1 100.3 1 2 0.3 25.81 95.2 1 2 1 22.3 66.2 1 2 2 21.5 54.2 1 2 3 18.3 49.5 1 2 4 6.2 31.1 1 2 5 6.1 23.3 1 2 6 3.1 22 1 3 0.2 8.22 36.1 0 3 0.3 1.81 22.1 0 3 1 0.41 4.8 0 3 2 0.35 3.2 0 3 3 0.2 2.1 0 3 4 0.23 2 0 3 5 0.1 1.9 0 3 6 0.03 0.9 0 On Mon, 8 Mar 2010 14:27:42 -0500, Katy Seib <uralva(a)GMAIL.COM> wrote: >Oops, meant to include sas-l.... > >> Ok this code gives me this output for id=3 (others are correct) (i'd tweak >> it myself if I had the vaguest clue about sql): >> >> >> >> ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER1 >> ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ >> >> 3 0.2 38.1 100.3 0 >> 3 0.3 25.81 95.2 0 >> 3 1 22.3 66.2 0 >> 3 2 21.5 54.2 0 >> 3 3 . . . >> 3 4 6.2 31.1 0 >> 3 5 6.1 23.3 0 >> 3 6 3.1 22 0 >> >> when it should be >> >> >> ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER1 >> ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ >> >> 3 0.2 38.1 100.3 . >> 3 0.3 25.81 95.2 . >> 3 1 22.3 66.2 . >> 3 2 21.5 54.2 . >> 3 3 . >> . . >> 3 4 6.2 31.1 . >> 3 5 6.1 23.3 . >> 3 6 3.1 22 . >> >> >> >> >> On Mon, Mar 8, 2010 at 2:10 PM, Ya Huang <ya.huang(a)amylin.com> wrote: >> >>> So you want to set the flag to missing is SEROTYPE is missing? >>> If so, a little change would do it: >>> >>> data ALLFILLED; >>> input ID SAM SEROTYPE_1 SEROTYPE_9V; >>> cards; >>> 1 0.2 8.22 36.1 >>> 1 0.3 1.81 22.1 >>> 1 1.0 .41 4.8 >>> 1 2.0 . 3.2 >>> 1 3.0 .30 2.1 >>> 1 4.0 .23 2.0 >>> 1 5.0 .10 1.9 >>> 1 6.0 .03 0.9 >>> 2 0.2 38.1 100.3 >>> 2 0.3 25.81 95.2 >>> 2 1.0 22.3 66.2 >>> 2 2.0 21.5 54.2 >>> 2 3.0 18.3 49.5 >>> 2 4.0 6.2 31.1 >>> 2 5.0 6.1 23.3 >>> 2 6.0 3.1 22 >>> ; >>> RUN; >>> >>> >>> proc sql; >>> select *, max(case when SAM=3 AND SEROTYPE_1>.35 THEN 1 >>> when SAM=3 AND SEROTYPE_1<.35 THEN 0 >>> else . end) + serotype_1 - serotype_1 as >>> CORR3_SER1 >>> from ALLFILLED >>> group by id >>> order by id,sam >>> ; >>> >>> ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER1 >>> --------------------------------------------------- >>> 1 0.2 8.22 36.1 0 >>> 1 0.3 1.81 22.1 0 >>> 1 1 0.41 4.8 0 >>> 1 2 . 3.2 . >>> 1 3 0.3 2.1 0 >>> 1 4 0.23 2 0 >>> 1 5 0.1 1.9 0 >>> 1 6 0.03 0.9 0 >>> 2 0.2 38.1 100.3 1 >>> 2 0.3 25.81 95.2 1 >>> 2 1 22.3 66.2 1 >>> 2 2 21.5 54.2 1 >>> 2 3 18.3 49.5 1 >>> 2 4 6.2 31.1 1 >>> 2 5 6.1 23.3 1 >>> 2 6 3.1 22 1 >>> >>> Note, I change obs=4 SEROTYPE_=. for testing. >>> >>> >>> >>> On Mon, 8 Mar 2010 13:58:22 -0500, Katy Seib <uralva(a)GMAIL.COM> wrote: >>> >>> >Very close, when I try to add a line to keep missing data =missing, I get >>> >the same results (CORR3_SER1=0 INSTEAD OF .). Lot to learn about sql. I >>> >added a third person below to simulate missing data (which there is >>> plenty >>> >of). I'm sure it's a syntax problem on my end..... >>> > >>> >data ALLFILLED; >>> >input ID SAM SEROTYPE_1 SEROTYPE_9V; >>> >cards; >>> >1 0.2 8.22 36.1 >>> >1 0.3 1.81 22.1 >>> >1 1.0 .41 4.8 >>> >1 2.0 .33 3.2 >>> >1 3.0 .30 2.1 >>> >1 4.0 .23 2.0 >>> >1 5.0 .10 1.9 >>> >1 6.0 .03 0.9 >>> >2 0.2 38.1 100.3 >>> >2 0.3 25.81 95.2 >>> >2 1.0 22.3 66.2 >>> >2 2.0 21.5 54.2 >>> >2 3.0 18.3 49.5 >>> >2 4.0 6.2 31.1 >>> >2 5.0 6.1 23.3 >>> >2 6.0 3.1 22 >>> >3 0.2 38.1 100.3 >>> >3 0.3 25.81 95.2 >>> >3 1.0 22.3 66.2 >>> >3 2.0 21.5 54.2 >>> >3 3.0 . . >>> >3 4.0 6.2 31.1 >>> >3 5.0 6.1 23.3 >>> >3 6.0 3.1 22 >>> >; >>> >RUN; >>> >proc sql; >>> >select *, max(case when SAM=3 AND SEROTYPE_1>.35 THEN 1 >>> > when SAM=3 AND SEROTYPE_1<.35 THEN 0 >>> > when sam=3 and serotype_1=. THEN . >>> > else . end) as CORR3_SER1 >>> >from ALLFILLED >>> >group by id >>> >order by id,sam >>> >; >>> > >>> > >>> > >>> >On Mon, Mar 8, 2010 at 12:14 PM, Katy Seib <uralva(a)gmail.com> wrote: >>> > >>> >> **Intro >>> >> First, thanks To Ron for posting the SAS-L wiki this morning- very >>> helpful! >>> >> I too am relatively new to posting at the listserv but have enjoyed >>> >> reading >>> >> for a while. I'm a guest researcher at CDC and a grad student >>> finishing >>> up >>> >> my thesis which is a longitudinal study - much of which is beyond the >>> scope >>> >> of my formal SAS education - I've had fun finding resources like this, >>> UCLA >>> >> and various SAS tips across the internet. I think my question today is >>> >> relatively simple but a problem for me nonetheless. >>> >> **Problem >>> >> I have been calculating variables and have a LONG but effective >>> algorithm >>> >> for "filling in" variables that are constant across an observation. >>> It's >>> >> tedious and I still worry about errors. I know there must be a simpler >>> way >>> >> to fill in this information AS it's calculated. Below I have included >>> an >>> >> example and sample data. The SAM variable is a crude time variable and >>> in >>> >> reality there are 23 serotypes. The data step "correlate" gives me the >>> >> yes/no answer for sam=3 only but I need it for all sam (the question >>> being >>> >> answered by calculating this variable is: "did this infant have a >>> antibody >>> >> titer greater than .35 at 10 weeks (sam=3)?". Here I've done it for >>> >> Serotype_1 but will have to do it for all serotypes. I'm not pasting my >>> >> fill >>> >> in code because it's long and clumsy requiring 6 data steps. This is >>> not >>> a >>> >> first occurrence var, so I'm having trouble applying previous similar >>> >> solutions. Thanks all. >>> >> >>> >> data ALLFILLED; >>> >> input ID SAM SEROTYPE_1 SEROTYPE_9V; >>> >> cards; >>> >> 1 0.2 8.22 36.1 >>> >> 1 0.3 1.81 22.1 >>> >> 1 1.0 .41 4.8 >>> >> 1 2.0 .33 3.2 >>> >> 1 3.0 .30 2.1 >>> >> 1 4.0 .23 2.0 >>> >> 1 5.0 .10 1.9 >>> >> 1 6.0 .03 0.9 >>> >> 2 0.2 38.1 100.3 >>> >> 2 0.3 25.81 95.2 >>> >> 2 1.0 22.3 66.2 >>> >> 2 2.0 21.5 54.2 >>> >> 2 3.0 18.3 49.5 >>> >> 2 4.0 6.2 31.1 >>> >> 2 5.0 6.1 23.3 >>> >> 2 6.0 3.1 22 >>> >> ; >>> >> RUN; >>> >> >>> >> PROC PRINT DATA=ALLFILLED; RUN; >>> >> >>> >> DATA CORRELATE; >>> >> SET ALLFILLED; >>> >> IF SAM=3 AND SEROTYPE_1=. THEN CORR3_SER1=.; >>> >> ELSE IF SAM=3 AND SEROTYPE_1>.35 THEN CORR3_SER1=1; >>> >> ELSE IF SAM=3 AND SEROTYPE_1<.35 THEN CORR3_SER1=0; >>> >> BY ID; >>> >> RUN; >>> >> >>> >> PROC PRINT DATA=CORRELATE; RUN; >>> >> >>> >> >>
From: Muthia Kachirayan on 8 Mar 2010 15:56 Based on the output of Ya, I guess you can as well do by a data step using double Dow-loop. As you are new user, read a paper on DOW. Search for Paul Dorfman. data need; do until(last.id); set allfilled; by id ; if sam =3D 3 and serotype_1 =3D . then corr3_ser1 =3D .; else if sam =3D 3 and serotype_1 > 0.35 then corr3_ser1 =3D 1; else if sam =3D 3 and serotype_1 <=3D 0.35 then corr3_ser1 =3D 0; end; do until(last.id); set allfilled; by id; output; end; run; On Mon, Mar 8, 2010 at 3:27 PM, Katy Seib <uralva(a)gmail.com> wrote: > Oops, meant to include sas-l.... > > > Ok this code gives me this output for id=3D3 (others are correct) (i'd > tweak > > it myself if I had the vaguest clue about sql): > > > > > > > > ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER= 1 > > =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92= =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6= =92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92= =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6= =92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92 > > > > 3 0.2 38.1 100.3 = 0 > > 3 0.3 25.81 95.2 = 0 > > 3 1 22.3 66.2 = 0 > > 3 2 21.5 54.2 = 0 > > 3 3 . . = .. > > 3 4 6.2 31.1 = 0 > > 3 5 6.1 23.3 = 0 > > 3 6 3.1 22 = 0 > > > > when it should be > > > > > > ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER= 1 > > =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92= =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6= =92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92= =C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6= =92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92=C6=92 > > > > 3 0.2 38.1 100.3 = .. > > 3 0.3 25.81 95.2 = .. > > 3 1 22.3 66.2 = .. > > 3 2 21.5 54.2 = .. > > 3 3 . > > . . > > 3 4 6.2 31.1 = .. > > 3 5 6.1 23.3 = .. > > 3 6 3.1 22 = .. > > > > > > > > > > On Mon, Mar 8, 2010 at 2:10 PM, Ya Huang <ya.huang(a)amylin.com> wrote: > > > >> So you want to set the flag to missing is SEROTYPE is missing? > >> If so, a little change would do it: > >> > >> data ALLFILLED; > >> input ID SAM SEROTYPE_1 SEROTYPE_9V; > >> cards; > >> 1 0.2 8.22 36.1 > >> 1 0.3 1.81 22.1 > >> 1 1.0 .41 4.8 > >> 1 2.0 . 3.2 > >> 1 3.0 .30 2.1 > >> 1 4.0 .23 2.0 > >> 1 5.0 .10 1.9 > >> 1 6.0 .03 0.9 > >> 2 0.2 38.1 100.3 > >> 2 0.3 25.81 95.2 > >> 2 1.0 22.3 66.2 > >> 2 2.0 21.5 54.2 > >> 2 3.0 18.3 49.5 > >> 2 4.0 6.2 31.1 > >> 2 5.0 6.1 23.3 > >> 2 6.0 3.1 22 > >> ; > >> RUN; > >> > >> > >> proc sql; > >> select *, max(case when SAM=3D3 AND SEROTYPE_1>.35 THEN 1 > >> when SAM=3D3 AND SEROTYPE_1<.35 THEN 0 > >> else . end) + serotype_1 - serotype_1 as > >> CORR3_SER1 > >> from ALLFILLED > >> group by id > >> order by id,sam > >> ; > >> > >> ID SAM SEROTYPE_1 SEROTYPE_9V CORR3_SER1 > >> --------------------------------------------------- > >> 1 0.2 8.22 36.1 0 > >> 1 0.3 1.81 22.1 0 > >> 1 1 0.41 4.8 0 > >> 1 2 . 3.2 . > >> 1 3 0.3 2.1 0 > >> 1 4 0.23 2 0 > >> 1 5 0.1 1.9 0 > >> 1 6 0.03 0.9 0 > >> 2 0.2 38.1 100.3 1 > >> 2 0.3 25.81 95.2 1 > >> 2 1 22.3 66.2 1 > >> 2 2 21.5 54.2 1 > >> 2 3 18.3 49.5 1 > >> 2 4 6.2 31.1 1 > >> 2 5 6.1 23.3 1 > >> 2 6 3.1 22 1 > >> > >> Note, I change obs=3D4 SEROTYPE_=3D. for testing. > >> > >> > >> > >> On Mon, 8 Mar 2010 13:58:22 -0500, Katy Seib <uralva(a)GMAIL.COM> wrote: > >> > >> >Very close, when I try to add a line to keep missing data =3Dmissing,= I > get > >> >the same results (CORR3_SER1=3D0 INSTEAD OF .). Lot to learn about s= ql. > I > >> >added a third person below to simulate missing data (which there is > >> plenty > >> >of). I'm sure it's a syntax problem on my end..... > >> > > >> >data ALLFILLED; > >> >input ID SAM SEROTYPE_1 SEROTYPE_9V; > >> >cards; > >> >1 0.2 8.22 36.1 > >> >1 0.3 1.81 22.1 > >> >1 1.0 .41 4.8 > >> >1 2.0 .33 3.2 > >> >1 3.0 .30 2.1 > >> >1 4.0 .23 2.0 > >> >1 5.0 .10 1.9 > >> >1 6.0 .03 0.9 > >> >2 0.2 38.1 100.3 > >> >2 0.3 25.81 95.2 > >> >2 1.0 22.3 66.2 > >> >2 2.0 21.5 54.2 > >> >2 3.0 18.3 49.5 > >> >2 4.0 6.2 31.1 > >> >2 5.0 6.1 23.3 > >> >2 6.0 3.1 22 > >> >3 0.2 38.1 100.3 > >> >3 0.3 25.81 95.2 > >> >3 1.0 22.3 66.2 > >> >3 2.0 21.5 54.2 > >> >3 3.0 . . > >> >3 4.0 6.2 31.1 > >> >3 5.0 6.1 23.3 > >> >3 6.0 3.1 22 > >> >; > >> >RUN; > >> >proc sql; > >> >select *, max(case when SAM=3D3 AND SEROTYPE_1>.35 THEN 1 > >> > when SAM=3D3 AND SEROTYPE_1<.35 THEN 0 > >> > when sam=3D3 and serotype_1=3D. THEN . > >> > else . end) as CORR3_SER1 > >> >from ALLFILLED > >> >group by id > >> >order by id,sam > >> >; > >> > > >> > > >> > > >> >On Mon, Mar 8, 2010 at 12:14 PM, Katy Seib <uralva(a)gmail.com> wrote: > >> > > >> >> **Intro > >> >> First, thanks To Ron for posting the SAS-L wiki this morning- very > >> helpful! > >> >> I too am relatively new to posting at the listserv but have enjoye= d > >> >> reading > >> >> for a while. I'm a guest researcher at CDC and a grad student > >> finishing > >> up > >> >> my thesis which is a longitudinal study - much of which is beyond t= he > >> scope > >> >> of my formal SAS education - I've had fun finding resources like > this, > >> UCLA > >> >> and various SAS tips across the internet. I think my question toda= y > is > >> >> relatively simple but a problem for me nonetheless. > >> >> **Problem > >> >> I have been calculating variables and have a LONG but effective > >> algorithm > >> >> for "filling in" variables that are constant across an observation. > >> It's > >> >> tedious and I still worry about errors. I know there must be a > simpler > >> way > >> >> to fill in this information AS it's calculated. Below I have includ= ed > >> an > >> >> example and sample data. The SAM variable is a crude time variable > and > >> in > >> >> reality there are 23 serotypes. The data step "correlate" gives me > the > >> >> yes/no answer for sam=3D3 only but I need it for all sam (the quest= ion > >> being > >> >> answered by calculating this variable is: "did this infant have a > >> antibody > >> >> titer greater than .35 at 10 weeks (sam=3D3)?". Here I've done it f= or > >> >> Serotype_1 but will have to do it for all serotypes. I'm not pastin= g > my > >> >> fill > >> >> in code because it's long and clumsy requiring 6 data steps. This i= s > >> not > >> a > >> >> first occurrence var, so I'm having trouble applying previous simil= ar > >> >> solutions. Thanks all. > >> >> > >> >> data ALLFILLED; > >> >> input ID SAM SEROTYPE_1 SEROTYPE_9V; > >> >> cards; > >> >> 1 0.2 8.22 36.1 > >> >> 1 0.3 1.81 22.1 > >> >> 1 1.0 .41 4.8 > >> >> 1 2.0 .33 3.2 > >> >> 1 3.0 .30 2.1 > >> >> 1 4.0 .23 2.0 > >> >> 1 5.0 .10 1.9 > >> >> 1 6.0 .03 0.9 > >> >> 2 0.2 38.1 100.3 > >> >> 2 0.3 25.81 95.2 > >> >> 2 1.0 22.3 66.2 > >> >> 2 2.0 21.5 54.2 > >> >> 2 3.0 18.3 49.5 > >> >> 2 4.0 6.2 31.1 > >> >> 2 5.0 6.1 23.3 > >> >> 2 6.0 3.1 22 > >> >> ; > >> >> RUN; > >> >> > >> >> PROC PRINT DATA=3DALLFILLED; RUN; > >> >> > >> >> DATA CORRELATE; > >> >> SET ALLFILLED; > >> >> IF SAM=3D3 AND SEROTYPE_1=3D. THEN CORR3_SER1=3D.; > >> >> ELSE IF SAM=3D3 AND SEROTYPE_1>.35 THEN CORR3_SER1=3D1; > >> >> ELSE IF SAM=3D3 AND SEROTYPE_1<.35 THEN CORR3_SER1=3D0; > >> >> BY ID; > >> >> RUN; > >> >> > >> >> PROC PRINT DATA=3DCORRELATE; RUN; > >> >> > >> > > > > >
First
|
Prev
|
Pages: 1 2 Prev: calculating variables across repeated ids Next: Check if a Variable is Null or not |