From: lisiqi77 on 20 Apr 2006 18:39 Hey, I have a dataset with 5 variables like the following: code year month bench_mo x; a 1994 1 12 0.21 a 1994 2 12 0.22 a 1994 3 12 0.26 ... a 1994 12 12 0.20 a 1995 1 12 0.11 ... a 1995 12 12 0.20 ... b 1994 1 8 0.20 b 1994 2 8 0.32 ... In other words, for each code, I have x data for every month of yr1994-2005. And bench_month differs for each code. My goal is to calculate the variance of x during a year with its yearend defined as 10 months after the bench_month. For example, for code a with bench_month equal 12, I'd like to know the variance of x from 1994 month 11 to 1995 month 10, etc. Please help if you have any good idea... Thanks, Tracy
From: Dale McLerran on 20 Apr 2006 19:43 --- lisiqi77(a)YAHOO.COM wrote: > Hey, > > I have a dataset with 5 variables like the following: > > code year month bench_mo x; > > a 1994 1 12 0.21 > a 1994 2 12 0.22 > a 1994 3 12 0.26 > ... > a 1994 12 12 0.20 > a 1995 1 12 0.11 > ... > a 1995 12 12 0.20 > ... > b 1994 1 8 0.20 > b 1994 2 8 0.32 > ... > > In other words, for each code, I have x data for every month of > yr1994-2005. And bench_month differs for each code. > > My goal is to calculate the variance of x during a year with its > yearend defined as 10 months after the bench_month. For example, for > code a with bench_month equal 12, I'd like to know the variance of x > from 1994 month 11 to 1995 month 10, etc. > > Please help if you have any good idea... > > Thanks, > Tracy > Tracy, I'm not going to ask how this problem arises. I'll take it that calculating the variance among the data points within the 1 year window starting one month before the benchmark month to 10 months after the benchmark month is acceptable for the problem that you must address. While you have not explicitly stated so, I will assume that the benchmark month is in month 2 or later of year 1994 but no later than month 2 of year 1995. If the benchmark month is outside that window, then you would not be able to obtain a year's worth of data commencing from the month prior to the benchmark month through 10 months after the benchmark month. Actually, this presents a bit of a problem in that the benchmark month=2 could belong to either 1994 or 1995. I will assume that a benchmark month=2 belongs to 1994 only. Given this assumptions and assuming that your data are sorted by code, there is an easy solution to your problem. You have all the information in the benchmark month to compose a WHERE condition which would restrict the data to the appropriate interval. For benchmark month=1, then we would require the month=12 observation from 1994 and all 1995 observations up through month 11. All other benchmark months would allow us to construct a condition where we accept all 1994 observations with month greather than or equal to one less than the benchmark month and any 1995 observations with month less than or equal to the benchmark month minus 2. Throw in by variable processing to get the variance separately for each code level, and you have your solution. proc means data=mydata; where ( benchmark=1 AND ( (year=1994 AND month=12) OR (year=1995 AND month<=11) ) ) OR ( (year=1994 AND month>=(benchmark-1)) OR (year=1995 AND month<=(benchmark-2) ) ); by code; var x; run; Of course, you may want to use an OUTPUT statement or ODS to capture the variance values and place them in a data set for some subsequent need. HTH, Dale --------------------------------------- Dale McLerran Fred Hutchinson Cancer Research Center mailto: dmclerra(a)NO_SPAMfhcrc.org Ph: (206) 667-2926 Fax: (206) 667-5977 --------------------------------------- __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
From: Muthia Kachirayan on 20 Apr 2006 19:51 Tracy, You say yearend defined as 10 months after bench_month. Then you want variance for 12 months (1994/11 to 1995/10). If you can provide sample data for 2 years and the output derived from that data will help someone to provide you a good solution. Muthia Kachirayan On 4/20/06, lisiqi77(a)yahoo.com <lisiqi77(a)yahoo.com> wrote: > > Hey, > > I have a dataset with 5 variables like the following: > > code year month bench_mo x; > > a 1994 1 12 0.21 > a 1994 2 12 0.22 > a 1994 3 12 0.26 > ... > a 1994 12 12 0.20 > a 1995 1 12 0.11 > ... > a 1995 12 12 0.20 > ... > b 1994 1 8 0.20 > b 1994 2 8 0.32 > ... > > In other words, for each code, I have x data for every month of > yr1994-2005. And bench_month differs for each code. > > My goal is to calculate the variance of x during a year with its > yearend defined as 10 months after the bench_month. For example, for > code a with bench_month equal 12, I'd like to know the variance of x > from 1994 month 11 to 1995 month 10, etc. > > Please help if you have any good idea... > > Thanks, > Tracy >
From: lisiqi77 on 20 Apr 2006 21:48 Hi, Dale, Thanks for your reply. Actually my data ranges from 1994 to 2004, which somehow conplicates the problem. The solution you proposed works well for two years of data, i.e., 1994 and 1995. But for my sample period, would it be adequate? And upon the request of Muthia, I include a more complete sample data as follows: code year month bench_mo x beg_mo end_mo a 1994 1 8 0.84 7 6 a 1994 2 8 0.01 7 6 a 1994 3 8 0.38 7 6 a 1994 4 8 0.85 7 6 a 1994 5 8 0.68 7 6 a 1994 6 8 0.25 7 6 a 1994 7 8 0.88 7 6 a 1994 8 8 0.75 7 6 a 1994 9 8 0.13 7 6 a 1994 10 8 0.30 7 6 a 1994 11 8 0.04 7 6 a 1994 12 8 0.77 7 6 a 1995 1 8 0.57 7 6 a 1995 2 8 0.94 7 6 a 1995 3 8 0.03 7 6 a 1995 4 8 0.50 7 6 a 1995 5 8 0.69 7 6 a 1995 6 8 0.72 7 6 a 1995 7 8 0.62 7 6 a 1995 8 8 0.40 7 6 a 1995 9 8 0.08 7 6 a 1995 10 8 0.69 7 6 a 1995 11 8 0.39 7 6 a 1995 12 8 0.84 7 6 b 2001 1 1 0.24 12 11 b 2001 2 1 0.77 12 11 b 2001 3 1 0.49 12 11 b 2001 4 1 0.40 12 11 b 2001 5 1 0.71 12 11 b 2001 6 1 0.25 12 11 b 2001 7 1 0.71 12 11 b 2001 8 1 0.63 12 11 b 2001 9 1 0.35 12 11 b 2001 10 1 0.30 12 11 b 2001 11 1 0.63 12 11 b 2001 12 1 0.25 12 11 b 2002 1 1 0.15 12 11 b 2002 2 1 0.99 12 11 b 2002 3 1 0.37 12 11 b 2002 4 1 0.00 12 11 b 2002 5 1 0.50 12 11 b 2002 6 1 0.42 12 11 b 2002 7 1 0.09 12 11 b 2002 8 1 0.84 12 11 b 2002 9 1 0.97 12 11 b 2002 10 1 0.97 12 11 b 2002 11 1 0.85 12 11 b 2002 12 1 0.68 12 11 Note that I constructed two additional variables, beg_mo and end_mo based on the defintion of yearend as 10+ of benchmark month. My goal is to calculate the variance of X during the self-defined year. Those months not in the year range are discarded. The output variance is 0.11 for code a and 0.14 for code b. Thanks, Tracy
From: Jiann-Shiun Huang on 20 Apr 2006 21:33 Tracy: See if the following code fits your need. Output follows the code. data InData; input code $ year month bench_mo x; datalines; a 1994 1 12 0.21 a 1994 2 12 0.22 a 1994 3 12 0.26 a 1994 4 12 0.21 a 1994 5 12 0.22 a 1994 6 12 0.26 a 1994 7 12 0.21 a 1994 8 12 0.22 a 1994 9 12 0.26 a 1994 10 12 0.20 a 1994 11 12 0.20 a 1994 12 12 0.20 a 1995 1 9 0.11 a 1995 2 9 0.13 a 1995 3 9 0.03 a 1995 4 9 0.11 a 1995 5 9 0.13 a 1995 6 9 0.03 a 1995 7 9 0.11 a 1995 8 9 0.13 a 1995 9 9 0.03 a 1995 10 9 0.11 a 1995 11 9 0.13 a 1995 12 9 0.20 b 1994 1 2 0.20 b 1994 2 2 0.32 b 1994 3 2 0.22 b 1994 4 2 0.20 b 1994 5 2 0.32 b 1994 6 2 0.22 b 1994 7 2 0.20 b 1994 8 2 0.32 b 1994 9 2 0.22 b 1994 10 2 0.20 b 1994 11 2 0.32 b 1994 12 2 0.22 b 1995 1 3 0.20 b 1995 2 3 0.32 b 1995 3 3 0.22 b 1995 4 3 0.20 b 1995 5 3 0.32 b 1995 6 3 0.22 b 1995 7 3 0.22 b 1995 8 3 0.20 b 1995 9 3 0.20 b 1995 10 3 0.32 b 1995 11 3 0.22 b 1995 12 3 0.22 b 1996 1 3 0.20 b 1996 2 3 0.32 b 1996 3 3 0.25 ; run; proc sort data=InData; by code year month; run; data GroupData(keep=code year month bench_mo x Group); length ThisCode $ 1 LastCode $ 1; retain ThisYear ThisBench LastYear LastBench ThisCode LastCode; set InData; by code year; if first.code then do; LastYear=year-1; LastBench=1; LastCode=code; ThisYear=year; ThisBench=bench_mo; ThisCode=code; end; else do; if first.year then do; LastYear=ThisYear; LastBench=ThisBench; LastCode=ThisCode; ThisYear=year; ThisBench=bench_mo; ThisCode=code; end; end; if -1 le (12*(year-ThisYear)+month-ThisBench) le 10 then do; Group=ThisCode || put(ThisYear,4.); output; end; if -1 le (12*(year-LastYear)+month-LastBench) le 10 then do; Group=LastCode || put(LastYear,4.); output; end; run; proc sort data=GroupData; by Group year month; run; proc means data=GroupData MEAN VAR; var x; class Group; run; ***** Output ***** The SAS System 11:03 Thursday, April 20, 2006 73 The MEANS Procedure Analysis Variable : x N Group Obs Mean Variance ******************************************** a1994 12 0.1100000 0.0033273 a1995 5 0.1200000 0.0037000 b1994 12 0.2466667 0.0030061 b1995 12 0.2383333 0.0025061 b1996 2 0.2850000 0.0024500 ******************************************** J S Huang 1-515-557-3987 fax 1-515-557-2422 >>> Muthia Kachirayan <muthia.kachirayan(a)GMAIL.COM> 04/20/06 6:51 PM >>> Tracy, You say yearend defined as 10 months after bench_month. Then you want variance for 12 months (1994/11 to 1995/10). If you can provide sample data for 2 years and the output derived from that data will help someone to provide you a good solution. Muthia Kachirayan On 4/20/06, lisiqi77(a)yahoo.com <lisiqi77(a)yahoo.com> wrote: > > Hey, > > I have a dataset with 5 variables like the following: > > code year month bench_mo x; > > a 1994 1 12 0.21 > a 1994 2 12 0.22 > a 1994 3 12 0.26 > ... > a 1994 12 12 0.20 > a 1995 1 12 0.11 > ... > a 1995 12 12 0.20 > ... > b 1994 1 8 0.20 > b 1994 2 8 0.32 > ... > > In other words, for each code, I have x data for every month of > yr1994-2005. And bench_month differs for each code. > > My goal is to calculate the variance of x during a year with its > yearend defined as 10 months after the bench_month. For example, for > code a with bench_month equal 12, I'd like to know the variance of x > from 1994 month 11 to 1995 month 10, etc. > > Please help if you have any good idea... > > Thanks, > Tracy >
|
Next
|
Last
Pages: 1 2 Prev: Example: using infile magic to parse SYSPARM for named parameters Next: jackknife concept |