From: Yaw on
Dear All:

My data has: ageyrs , htcm, seqn and counter(for number of visits per
subject).

I want to select cases who have at least one data ( ageyr and htcm )
per year up to age 18yrs.

So only cases with data at 1yr increments(more points within 1yr is a
bonus for me) from 0 to 18 will make it into my data.

I tried the code below but its not working mostly because I am always
at a lost with arrays. I thought I am working it right.I am obviously
struggling with it.


Any help on this will be appreciated. An expedient route,
reformulation of my plan will be very welcome.


Here is my code.

data today2;
set today;
array npoints(18);
retain npoints(18); by seqn;

do y=0 to 18 by 1;
if y-1 lt ageyrs le y then npoints(y)=1;
if last.seqn = 1 then do;
output;
do y=0 to 18;
npoints(y)=.;
end;
end;end;
run;

Thanks
From: Reeza on
On Apr 7, 12:12 pm, Yaw <link...(a)gmail.com> wrote:
> Dear All:
>
> My data has: ageyrs , htcm, seqn and counter(for number of visits per
> subject).
>
>  I want to select cases who have at least one data ( ageyr and htcm )
> per year up to age 18yrs.
>
> So only cases with data at 1yr increments(more points within 1yr is a
> bonus for me)  from 0 to 18 will make it into my data.
>
> I tried the code below but its not working mostly because I am always
> at a lost with arrays. I thought I am working it right.I am obviously
> struggling with it.
>
> Any help on this will be appreciated.  An expedient route,
> reformulation of my plan will be very welcome.
>
> Here is my code.
>
> data today2;
>         set today;
>         array  npoints(18);
>         retain npoints(18); by seqn;
>
>         do y=0 to 18 by 1;
>         if y-1 lt ageyrs le y then npoints(y)=1;
>         if last.seqn = 1 then do;
>                 output;
>                 do y=0 to 18;
>                 npoints(y)=.;
>                 end;
> end;end;
> run;
>
> Thanks

Arrays in SAS work across a row of data rather than a column of data.

Can you clarify your current data structure and what type of output
you're looking for? Examples are nice :)

Cheers,
Reeza
From: Yaw on
On Apr 7, 3:37 pm, Reeza <fkhurs...(a)hotmail.com> wrote:
> On Apr 7, 12:12 pm, Yaw <link...(a)gmail.com> wrote:
>
>
>
> > Dear All:
>
> > My data has: ageyrs , htcm, seqn and counter(for number of visits per
> > subject).
>
> >  I want to select cases who have at least one data ( ageyr and htcm )
> > per year up to age 18yrs.
>
> > So only cases with data at 1yr increments(more points within 1yr is a
> > bonus for me)  from 0 to 18 will make it into my data.
>
> > I tried the code below but its not working mostly because I am always
> > at a lost with arrays. I thought I am working it right.I am obviously
> > struggling with it.
>
> > Any help on this will be appreciated.  An expedient route,
> > reformulation of my plan will be very welcome.
>
> > Here is my code.
>
> > data today2;
> >         set today;
> >         array  npoints(18);
> >         retain npoints(18); by seqn;
>
> >         do y=0 to 18 by 1;
> >         if y-1 lt ageyrs le y then npoints(y)=1;
> >         if last.seqn = 1 then do;
> >                 output;
> >                 do y=0 to 18;
> >                 npoints(y)=.;
> >                 end;
> > end;end;
> > run;
>
> > Thanks
>
> Arrays in SAS work across a row of data rather than a column of data.
>
> Can you clarify your current data structure and what type of output
> you're looking for?  Examples are nice :)
>
> Cheers,
> Reeza

Thanks.

The data points are by visits. So a subject can have 100 visits over a
period based on their visits.

The fact is I have too many data points with each year(clinical data,
based on visits) so I want to select at most 4 data points within each
year). Cases who do not have data points over the entire age ranges I
am interested in do not make it into my subset.

Data format

Htcm agyrs ID

50 .00 0.000 2
70 ,00 1.50 2
96.50 3.50 2
147.30 7.7016 2
148.85 8.0684 2
150.20 8.7912 2
152.20 9.5359 2
153.10 9.7878 2
153.60 10.0370 2
154.80 10.8008 2
155.40 11.0554 2
158.30 11.8741 2
158.80 12.1068 2
up to age 18.00 yrs


Want:

Htcm agyrs ID

Ht1 age1yr ID

Ht 2 ageyr2 ID
....up to ageyr18 same id.

Thanks,







From: Reeza on
On Apr 7, 2:31 pm, Yaw <link...(a)gmail.com> wrote:
> On Apr 7, 3:37 pm, Reeza <fkhurs...(a)hotmail.com> wrote:
>
>
>
>
>
> > On Apr 7, 12:12 pm, Yaw <link...(a)gmail.com> wrote:
>
> > > Dear All:
>
> > > My data has: ageyrs , htcm, seqn and counter(for number of visits per
> > > subject).
>
> > >  I want to select cases who have at least one data ( ageyr and htcm )
> > > per year up to age 18yrs.
>
> > > So only cases with data at 1yr increments(more points within 1yr is a
> > > bonus for me)  from 0 to 18 will make it into my data.
>
> > > I tried the code below but its not working mostly because I am always
> > > at a lost with arrays. I thought I am working it right.I am obviously
> > > struggling with it.
>
> > > Any help on this will be appreciated.  An expedient route,
> > > reformulation of my plan will be very welcome.
>
> > > Here is my code.
>
> > > data today2;
> > >         set today;
> > >         array  npoints(18);
> > >         retain npoints(18); by seqn;
>
> > >         do y=0 to 18 by 1;
> > >         if y-1 lt ageyrs le y then npoints(y)=1;
> > >         if last.seqn = 1 then do;
> > >                 output;
> > >                 do y=0 to 18;
> > >                 npoints(y)=.;
> > >                 end;
> > > end;end;
> > > run;
>
> > > Thanks
>
> > Arrays in SAS work across a row of data rather than a column of data.
>
> > Can you clarify your current data structure and what type of output
> > you're looking for?  Examples are nice :)
>
> > Cheers,
> > Reeza
>
> Thanks.
>
> The data points are by visits. So a subject can have 100 visits over a
> period based on their visits.
>
> The fact is I have too many data points with each year(clinical data,
> based on visits) so I want to select at most 4 data points within each
> year). Cases who do not have data points over the entire age ranges I
> am interested in do not make it into my subset.
>
> Data format
>
> Htcm      agyrs      ID
>
> 50 .00      0.000     2
> 70 ,00       1.50      2
> 96.50        3.50      2
> 147.30     7.7016     2
> 148.85     8.0684     2
> 150.20     8.7912     2
> 152.20     9.5359     2
> 153.10     9.7878     2
> 153.60    10.0370     2
> 154.80    10.8008     2
> 155.40    11.0554     2
> 158.30    11.8741     2
> 158.80    12.1068     2
> up to age 18.00 yrs
>
> Want:
>
> Htcm      agyrs      ID
>
> Ht1       age1yr     ID
>
> Ht 2      ageyr2    ID
> ...up to ageyr18 same id.
>
> Thanks,- Hide quoted text -
>
> - Show quoted text -

Alright...so its probably a multi-step procedure.
And there may be an easier way to do this.

Step 1. Get up to 4 records per age and identify if there is a record
for every year.
Step 2. Get list of ID's where there are record for every year
Step 3. Get data with all records

this isn't tested so you may need to play with the retain boundaries
and the counting algorithms.
proc sort data=data_initial;
by id ageyrs htcm;
run;

data test1;
set data;
by ID;
*retain variables to count through data;
retain recordcount 0 agecount 0;

*if new subject then start record count again;
if first.id then recordcount=1; else recordcount=recordcount+1;
if first.id then age=0;

*check if age is new age and 1 older then add record;
if floor(ageyrs)=age+1 then agecount=agecount+1;

if recordcount<=4 then output;
run;

data withallages;
set test1;
*records where there was an age for every year;
where agecount=18;
run;

data test2;
merge test1 withallages(in=a);
by id;
where a;
run;

HTH,
Reese
From: Patrick on
The following SQL gives you a list of ID's which comply with your
requirement (at least one data point from year 0 to year 18 within a
ID):

data have;
id=1;
do i=0 to 20;
agyrs=i+ranuni(1);
Htcm=floor(ranuni(1)*16000)/100;
output;
end;
id=2;
do i=0 to 12,14 to 20;
agyrs=i+ranuni(1);
Htcm=floor(ranuni(1)*16000)/100;
output;
end;
run;

proc sql;
/* create table ID_For_Subset as*/
select id,count(*) as N_DistinctYears
from (select distinct id, floor(agyrs) as FloorAgyrs
from have where floor(agyrs)<=18)
group by id
having calculated N_DistinctYears=19
;
quit;


You could now use this list of ID to select the data belonging to
these ID's.

You haven't told us how you want to select max 4 obs per year and ID:
Would it be the first for obs, should it be random or some other
selection pattern?

HTH
Patrick