From: renxue on 2 Aug 2010 06:35 I have a dataset like this (more than one patient_id); patient_id start_date_time stop_date_time 11 1/1/2008 6/1/2008 11 3/1/2008 4/1/2008 11 4/1/2008 10/1/2008 11 6/1/2008 10/1/2008 11 12/1/2008 1/20/2008 11 15/1/2008 22/1/2008 11 24/1/2008 27/1/2008 I want to sum the days of all date ranges for every patient_id . That means :stop_date_time - start_date_time; howerver there are many date ranges overlaped,I only want to sum them once. for example you can see the above date cover dates below: 2008/01/01-2008/01/10 2008/01/12-2008/01/22 2008/01/24-2008/01/27 then the sum equation was 9+10+3=22 I want to get the result of 22 ,then how to write the program when the date ranges are overlaped ?
From: Arthur Tabachneck on 2 Aug 2010 10:09 Zijun, There are a number of ways to solve the problem. Take a look at: http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0106A&L=sas-l&D=1&H=0&O=D&T=1&P=38600 and http://www2.sas.com/proceedings/sugi29/260-29.pdf The following is based on code that Ian Whitlock suggested in the first of those links: data have; input patient_id (start_date_time stop_date_time) (ddmmyy10.); format start_date_time stop_date_time date9.; cards; 11 1/1/2008 6/1/2008 11 3/1/2008 4/1/2008 11 4/1/2008 10/1/2008 11 6/1/2008 10/1/2008 11 12/1/2008 20/1/2008 11 15/1/2008 22/1/2008 11 24/1/2008 27/1/2008 12 1/1/2008 6/1/2008 12 3/1/2008 4/1/2008 12 4/1/2008 10/1/2008 12 6/1/2008 10/1/2008 12 12/1/2008 20/1/2008 12 15/1/2008 22/1/2008 12 24/1/2008 28/1/2008 ; proc sort data = have; by patient_id start_date_time descending stop_date_time; run; data want ( keep = patient_id curstart curend) ; retain curstart curend; format curstart curend date9.; set have; by patient_id; if first.patient_id then do; days=0; curstart = start_date_time; curend = stop_date_time; end; if stop_date_time < stop_date_time then error ; if stop_date_time > curend and curstart <= start_date_time <= curend + 1 then curend = stop_date_time; if start_date_time > curend then do; output; curstart = start_date_time; curend = stop_date_time; end; if last.patient_id then output; run; data want; set want; by patient_id; retain days; if first.patient_id then days=curend-curstart; else days=sum(days,curend-curstart); run; HTH, Art --------------- On Aug 2, 6:35 am, renxue <zijun2...(a)gmail.com> wrote: > I have a dataset like this (more than one patient_id); > patient_id start_date_time stop_date_time > 11 1/1/2008 6/1/2008 > 11 3/1/2008 4/1/2008 > 11 4/1/2008 10/1/2008 > 11 6/1/2008 10/1/2008 > 11 12/1/2008 1/20/2008 > 11 15/1/2008 22/1/2008 > 11 24/1/2008 27/1/2008 > > I want to sum the days of all date ranges for every patient_id . > That means :stop_date_time - start_date_time; > howerver there are many date ranges overlaped,I only want to sum them > once. > for example you can see the above date cover dates below: > 2008/01/01-2008/01/10 > 2008/01/12-2008/01/22 > 2008/01/24-2008/01/27 > then the sum equation was 9+10+3=22 > I want to get the result of 22 ,then how to write the program when > the date ranges are overlaped ?
From: Lou on 2 Aug 2010 13:06 "renxue" <zijun2000(a)gmail.com> wrote in message news:8e9e2393-60b9-473a-a33b-1c2265357f0f(a)m35g2000prn.googlegroups.com... >I have a dataset like this (more than one patient_id); > patient_id start_date_time stop_date_time > 11 1/1/2008 6/1/2008 > 11 3/1/2008 4/1/2008 > 11 4/1/2008 10/1/2008 > 11 6/1/2008 10/1/2008 > 11 12/1/2008 1/20/2008 > 11 15/1/2008 22/1/2008 > 11 24/1/2008 27/1/2008 > > I want to sum the days of all date ranges for every patient_id . > That means :stop_date_time - start_date_time; > howerver there are many date ranges overlaped,I only want to sum them > once. > for example you can see the above date cover dates below: > 2008/01/01-2008/01/10 > 2008/01/12-2008/01/22 > 2008/01/24-2008/01/27 > then the sum equation was 9+10+3=22 > I want to get the result of 22 ,then how to write the program when > the date ranges are overlaped ? A question - why are you counting 9 days for 2008/01/01 - 2008/01/10 instead of 10, and similarly for the other intervals? Also, there's an anamalous line in your posted data: 12/1/2008 1/20/2008 All the other dates look like they're in dd/mm/yyyy format except this one end date. Assuming 1/20/2008 should be 20/1/2008, I'd probably attack it something like this. data fee; input @1 id 2. @4 start ddmmyy10. @14 end ddmmyy10.; do date = start to end; output; end; cards; 11 1/1/2008 6/1/2008 11 3/1/2008 4/1/2008 11 4/1/2008 10/1/2008 11 6/1/2008 10/1/2008 11 12/1/2008 20/1/2008 11 15/1/2008 22/1/2008 11 24/1/2008 27/1/2008 ; proc sort data = fee nodupkey; by id date; proc means data = fee noprint; by id; var date; output out = fie n = n; quit; Which incidentally, gives us a total of 25 days for id 11. If you really don't want to count the first day of the interval, use "do date = start + 1 to end;" instead.
|
Pages: 1 Prev: SAS for phrama professionals Next: THANKS GOD! I GOT $2000 FROM PAYPAL.... |