From: Akshaya on 22 Oct 2009 15:53

Here's a data step solution; it might need changes for your real data:

/* _g flags rows with y below 0.5 */
Data have;
  input id y;
  _dif = y - 0.5;
  _g = (y < 0.5);
cards;
1 1
1 1
1 1
1 0.8
1 0.6
1 0.6
1 0.4
1 0.2
2 1
2 1
2 0.4
2 0
3 1
3 1
3 0.8
3 0.8
;

/* _gg = 1 when an id has no y below 0.5 */
Proc sql;
  create table have1 as
  select *, (max(_g) = 0) as _gg
  from have(where=(y ^= 0))
  group by id
  order by id, _g, y;
Quit;

/* _d counts the distinct y values at or above 0.5, smallest first */
Data want(drop=_:);
  set have1;
  by id _g y;
  if ^_g then _d + first.y - _d*first._g;
  else _d = 0;
  if (^_g and _d = 1) or (_g and last._g) or (_gg and _d in (1,2)) then output;
Run;

AkshayA!

On Thu, Oct 22, 2009 at 2:15 PM, olivesecret <olivesecret(a)gmail.com> wrote:
> I have a large data set consisting of subject id, response y and other
> interesting variables. A subset of the data looks like this:
>
> ID Y ...
> 1 1
> 1 1
> 1 1
> 1 0.8
> 1 0.6
> 1 0.6
> 1 0.4
> 1 0.2
> 2 1
> 2 1
> 2 0.4
> 2 0
> 3 1
> 3 1
> 3 0.8
> 3 0.8
> 4 1
> ...
>
> What I need to do is, for each ID, find two observations: one
> having y immediately larger than 0.5 and the other having y
> immediately smaller than 0.5. For the example above, the observations
> needed for ID=1 are ID=1 y=0.6 and ID=1 y=0.4, and the observations
> needed for ID=2 are ID=2 y=1 and ID=2 y=0.4. For ID=3, since there are
> no observations where y is less than 0.5, I need the two obs
> having y immediately larger than 0.5, which are ID=3 y=1 and
> ID=3 y=0.8.
> Any hints?
> Thanks a lot!
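
The one-line sum statement in the final data step of the reply is compact but hard to read. As a sketch only, and not the poster's own code, the same step can be written with the counter logic spelled out. It assumes have1 already exists exactly as the PROC SQL step in the reply creates it (sorted by id, _g, y, with _g and _gg defined there); the comments are my reading of the intent.

```sas
/* Sketch: assumes have1 from the PROC SQL step in the reply, sorted by
   id, _g, y, with _g = 1 when y < 0.5 and _gg = 1 when the id has no
   y below 0.5. */
data want(drop=_:);
  set have1;
  by id _g y;

  if _g = 0 then do;
    /* Rows at or above 0.5: _d ranks the distinct y values, smallest first */
    if first._g then _d = 1;
    else if first.y then _d + 1;   /* sum statement, so _d is retained */
  end;
  else _d = 0;                     /* rows below 0.5 are not ranked */

  if (_g = 0 and _d = 1)           /* smallest y at or above 0.5 */
     or (_g = 1 and last._g)       /* largest y below 0.5 */
     or (_gg and _d in (1, 2))     /* no y below 0.5: two smallest values */
  then output;
run;
```

Because _d ranks distinct values rather than rows, tied values all pass the test: with the sample data, both y=0.6 rows for ID 1 would be written. If only one row per value is wanted, adding "and first.y" to the conditions that test _d is one possible adjustment.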