From: Yaw on 7 Apr 2010 15:12 Dear All: My data has: ageyrs , htcm, seqn and counter(for number of visits per subject). I want to select cases who have at least one data ( ageyr and htcm ) per year up to age 18yrs. So only cases with data at 1yr increments(more points within 1yr is a bonus for me) from 0 to 18 will make it into my data. I tried the code below but its not working mostly because I am always at a lost with arrays. I thought I am working it right.I am obviously struggling with it. Any help on this will be appreciated. An expedient route, reformulation of my plan will be very welcome. Here is my code. data today2; set today; array npoints(18); retain npoints(18); by seqn; do y=0 to 18 by 1; if y-1 lt ageyrs le y then npoints(y)=1; if last.seqn = 1 then do; output; do y=0 to 18; npoints(y)=.; end; end;end; run; Thanks
From: Reeza on 7 Apr 2010 16:37 On Apr 7, 12:12 pm, Yaw <link...(a)gmail.com> wrote: > Dear All: > > My data has: ageyrs , htcm, seqn and counter(for number of visits per > subject). > > I want to select cases who have at least one data ( ageyr and htcm ) > per year up to age 18yrs. > > So only cases with data at 1yr increments(more points within 1yr is a > bonus for me) from 0 to 18 will make it into my data. > > I tried the code below but its not working mostly because I am always > at a lost with arrays. I thought I am working it right.I am obviously > struggling with it. > > Any help on this will be appreciated. An expedient route, > reformulation of my plan will be very welcome. > > Here is my code. > > data today2; > set today; > array npoints(18); > retain npoints(18); by seqn; > > do y=0 to 18 by 1; > if y-1 lt ageyrs le y then npoints(y)=1; > if last.seqn = 1 then do; > output; > do y=0 to 18; > npoints(y)=.; > end; > end;end; > run; > > Thanks Arrays in SAS work across a row of data rather than a column of data. Can you clarify your current data structure and what type of output you're looking for? Examples are nice :) Cheers, Reeza
From: Yaw on 7 Apr 2010 17:31 On Apr 7, 3:37 pm, Reeza <fkhurs...(a)hotmail.com> wrote: > On Apr 7, 12:12 pm, Yaw <link...(a)gmail.com> wrote: > > > > > Dear All: > > > My data has: ageyrs , htcm, seqn and counter(for number of visits per > > subject). > > > I want to select cases who have at least one data ( ageyr and htcm ) > > per year up to age 18yrs. > > > So only cases with data at 1yr increments(more points within 1yr is a > > bonus for me) from 0 to 18 will make it into my data. > > > I tried the code below but its not working mostly because I am always > > at a lost with arrays. I thought I am working it right.I am obviously > > struggling with it. > > > Any help on this will be appreciated. An expedient route, > > reformulation of my plan will be very welcome. > > > Here is my code. > > > data today2; > > set today; > > array npoints(18); > > retain npoints(18); by seqn; > > > do y=0 to 18 by 1; > > if y-1 lt ageyrs le y then npoints(y)=1; > > if last.seqn = 1 then do; > > output; > > do y=0 to 18; > > npoints(y)=.; > > end; > > end;end; > > run; > > > Thanks > > Arrays in SAS work across a row of data rather than a column of data. > > Can you clarify your current data structure and what type of output > you're looking for? Examples are nice :) > > Cheers, > Reeza Thanks. The data points are by visits. So a subject can have 100 visits over a period based on their visits. The fact is I have too many data points with each year(clinical data, based on visits) so I want to select at most 4 data points within each year). Cases who do not have data points over the entire age ranges I am interested in do not make it into my subset. Data format Htcm agyrs ID 50 .00 0.000 2 70 ,00 1.50 2 96.50 3.50 2 147.30 7.7016 2 148.85 8.0684 2 150.20 8.7912 2 152.20 9.5359 2 153.10 9.7878 2 153.60 10.0370 2 154.80 10.8008 2 155.40 11.0554 2 158.30 11.8741 2 158.80 12.1068 2 up to age 18.00 yrs Want: Htcm agyrs ID Ht1 age1yr ID Ht 2 ageyr2 ID ....up to ageyr18 same id. Thanks,
From: Reeza on 7 Apr 2010 18:54 On Apr 7, 2:31 pm, Yaw <link...(a)gmail.com> wrote: > On Apr 7, 3:37 pm, Reeza <fkhurs...(a)hotmail.com> wrote: > > > > > > > On Apr 7, 12:12 pm, Yaw <link...(a)gmail.com> wrote: > > > > Dear All: > > > > My data has: ageyrs , htcm, seqn and counter(for number of visits per > > > subject). > > > > I want to select cases who have at least one data ( ageyr and htcm ) > > > per year up to age 18yrs. > > > > So only cases with data at 1yr increments(more points within 1yr is a > > > bonus for me) from 0 to 18 will make it into my data. > > > > I tried the code below but its not working mostly because I am always > > > at a lost with arrays. I thought I am working it right.I am obviously > > > struggling with it. > > > > Any help on this will be appreciated. An expedient route, > > > reformulation of my plan will be very welcome. > > > > Here is my code. > > > > data today2; > > > set today; > > > array npoints(18); > > > retain npoints(18); by seqn; > > > > do y=0 to 18 by 1; > > > if y-1 lt ageyrs le y then npoints(y)=1; > > > if last.seqn = 1 then do; > > > output; > > > do y=0 to 18; > > > npoints(y)=.; > > > end; > > > end;end; > > > run; > > > > Thanks > > > Arrays in SAS work across a row of data rather than a column of data. > > > Can you clarify your current data structure and what type of output > > you're looking for? Examples are nice :) > > > Cheers, > > Reeza > > Thanks. > > The data points are by visits. So a subject can have 100 visits over a > period based on their visits. > > The fact is I have too many data points with each year(clinical data, > based on visits) so I want to select at most 4 data points within each > year). Cases who do not have data points over the entire age ranges I > am interested in do not make it into my subset. > > Data format > > Htcm agyrs ID > > 50 .00 0.000 2 > 70 ,00 1.50 2 > 96.50 3.50 2 > 147.30 7.7016 2 > 148.85 8.0684 2 > 150.20 8.7912 2 > 152.20 9.5359 2 > 153.10 9.7878 2 > 153.60 10.0370 2 > 154.80 10.8008 2 > 155.40 11.0554 2 > 158.30 11.8741 2 > 158.80 12.1068 2 > up to age 18.00 yrs > > Want: > > Htcm agyrs ID > > Ht1 age1yr ID > > Ht 2 ageyr2 ID > ...up to ageyr18 same id. > > Thanks,- Hide quoted text - > > - Show quoted text - Alright...so its probably a multi-step procedure. And there may be an easier way to do this. Step 1. Get up to 4 records per age and identify if there is a record for every year. Step 2. Get list of ID's where there are record for every year Step 3. Get data with all records this isn't tested so you may need to play with the retain boundaries and the counting algorithms. proc sort data=data_initial; by id ageyrs htcm; run; data test1; set data; by ID; *retain variables to count through data; retain recordcount 0 agecount 0; *if new subject then start record count again; if first.id then recordcount=1; else recordcount=recordcount+1; if first.id then age=0; *check if age is new age and 1 older then add record; if floor(ageyrs)=age+1 then agecount=agecount+1; if recordcount<=4 then output; run; data withallages; set test1; *records where there was an age for every year; where agecount=18; run; data test2; merge test1 withallages(in=a); by id; where a; run; HTH, Reese
From: Patrick on 8 Apr 2010 09:30 The following SQL gives you a list of ID's which comply with your requirement (at least one data point from year 0 to year 18 within a ID): data have; id=1; do i=0 to 20; agyrs=i+ranuni(1); Htcm=floor(ranuni(1)*16000)/100; output; end; id=2; do i=0 to 12,14 to 20; agyrs=i+ranuni(1); Htcm=floor(ranuni(1)*16000)/100; output; end; run; proc sql; /* create table ID_For_Subset as*/ select id,count(*) as N_DistinctYears from (select distinct id, floor(agyrs) as FloorAgyrs from have where floor(agyrs)<=18) group by id having calculated N_DistinctYears=19 ; quit; You could now use this list of ID to select the data belonging to these ID's. You haven't told us how you want to select max 4 obs per year and ID: Would it be the first for obs, should it be random or some other selection pattern? HTH Patrick
|
Pages: 1 Prev: xlm file size creating tagsets.excelxp ods Next: Reading in BibTeX files using SAS |