Prev: How to resolve macro variable in double qoutes inside single quotes
Next: Ordering Transposed Columns
From: db on 14 Jun 2010 16:38 Hi, I would like to get correlation coefficient that calculates obs 1~2, obs 1~3, obs 1~4, obs 1~5, obs 1~6 ... etc Thanks, db data input input date :mmddyy10. time :time8.0 stock1 stock2; format date:mmddyy10. time :time8.0; cards; 01/04/2010 7:00:00 5989.5 1115.25 01/04/2010 7:01:00 5998.5 1116 01/04/2010 7:01:00 5998.5 1116 01/04/2010 7:02:00 5996.5 1115.75 01/04/2010 7:05:00 6000 1115.5 01/04/2010 20:55:00 6042 1128.5 01/04/2010 20:56:00 6040.5 1128.5 01/04/2010 20:57:00 6040.5 1128.75 01/04/2010 20:58:00 6040.5 1128.5 01/04/2010 20:59:00 6040.5 1128.5 01/04/2010 21:00:00 6036 1128.75 01/05/2010 7:00:00 6035.5 1128.5 01/05/2010 7:01:00 6037 1128.5 01/05/2010 7:02:00 6037 1128.5 01/05/2010 7:03:00 6038 1128.5 01/05/2010 7:06:00 6038.5 1128.25 01/05/2010 20:56:00 6042 1131.5 01/05/2010 20:57:00 6042.5 1132 01/05/2010 20:58:00 6042 1132 01/05/2010 20:59:00 6041.5 1131.75 01/05/2010 21:00:00 6041.5 1132.25 ; run; output date time stock1 stock2 R (by date) 01/04/2010 7:00:00 5989.5 1115.25 01/04/2010 7:01:00 5998.5 1116 .9999 ( R based on obs 1~2 ) 01/04/2010 7:01:00 5998.5 1116 .9999 (R based on obs 1~3) 01/04/2010 7:02:00 5996.5 1115.75 .9999 (R based on obs 1~4) 01/04/2010 7:05:00 6000 1115.5 .9999 (R based on obs 1~5) 01/04/2010 20:55:00 6042 1128.5 .9999 (R based on obas 1~6) 01/04/2010 20:56:00 6040.5 1128.5 .9999 (R based on obas 1~7) 01/04/2010 20:57:00 6040.5 1128.75 .9999 (R based on obas 1~8) 01/04/2010 20:58:00 6040.5 1128.5 .9999 (R based on obas 1~9) 01/04/2010 20:59:00 6040.5 1128.5 .9999 (R based on obas 1~10) 01/04/2010 21:00:00 6036 1128.75 .9999 (R based on obs 1~ 11) 01/05/2010 7:00:00 6035.5 1128.5 01/05/2010 7:01:00 6037 1128.5 .9999 (R based on obs 12 ~13) 01/05/2010 7:02:00 6037 1128.5 .9999 ( R based on obs 12 ~14) 01/05/2010 7:03:00 6038 1128.5 .9999 ( R based on obs 12 ~15) 01/05/2010 7:06:00 6038.5 1128.25 .9999 ( R based on obs 12 ~16) 01/05/2010 20:56:00 6042 1131.5 .9999 ( R based on obs 12 ~17) 01/05/2010 20:57:00 6042.5 1132 .9999 ( R based on obs 12 ~18) 01/05/2010 20:58:00 6042 1132 .9999 ( R based on obs 12 ~19) 01/05/2010 20:59:00 6041.5 1131.75 .9999 ( R based on obs 12 ~20) 01/05/2010 21:00:00 6041.5 1132.25 .9999 ( R based on obs 12 ~21)
From: Ya on 14 Jun 2010 18:26 On Jun 14, 1:38 pm, db <daronnebonn...(a)gmail.com> wrote: > Hi, I would like to get correlation coefficient that calculates obs > 1~2, obs 1~3, obs 1~4, obs 1~5, obs 1~6 ... etc > > Thanks, db > > data input > input date :mmddyy10. time :time8.0 stock1 stock2; > format date:mmddyy10. time :time8.0; > cards; > 01/04/2010 7:00:00 5989.5 1115.25 > 01/04/2010 7:01:00 5998.5 1116 > 01/04/2010 7:01:00 5998.5 1116 > 01/04/2010 7:02:00 5996.5 1115.75 > 01/04/2010 7:05:00 6000 1115.5 > 01/04/2010 20:55:00 6042 1128.5 > 01/04/2010 20:56:00 6040.5 1128.5 > 01/04/2010 20:57:00 6040.5 1128.75 > 01/04/2010 20:58:00 6040.5 1128.5 > 01/04/2010 20:59:00 6040.5 1128.5 > 01/04/2010 21:00:00 6036 1128.75 > 01/05/2010 7:00:00 6035.5 1128.5 > 01/05/2010 7:01:00 6037 1128.5 > 01/05/2010 7:02:00 6037 1128.5 > 01/05/2010 7:03:00 6038 1128.5 > 01/05/2010 7:06:00 6038.5 1128.25 > 01/05/2010 20:56:00 6042 1131.5 > 01/05/2010 20:57:00 6042.5 1132 > 01/05/2010 20:58:00 6042 1132 > 01/05/2010 20:59:00 6041.5 1131.75 > 01/05/2010 21:00:00 6041.5 1132.25 > ; > run; > > output > date time stock1 stock2 R (by date) > 01/04/2010 7:00:00 5989.5 1115.25 > 01/04/2010 7:01:00 5998.5 1116 .9999 ( R based on obs 1~2 ) > 01/04/2010 7:01:00 5998.5 1116 .9999 (R based on obs 1~3) > 01/04/2010 7:02:00 5996.5 1115.75 .9999 (R based on obs 1~4) > 01/04/2010 7:05:00 6000 1115.5 .9999 (R based on obs 1~5) > 01/04/2010 20:55:00 6042 1128.5 .9999 (R based on obas 1~6) > 01/04/2010 20:56:00 6040.5 1128.5 .9999 (R based on obas 1~7) > 01/04/2010 20:57:00 6040.5 1128.75 .9999 (R based on obas 1~8) > 01/04/2010 20:58:00 6040.5 1128.5 .9999 (R based on obas 1~9) > 01/04/2010 20:59:00 6040.5 1128.5 .9999 (R based on obas 1~10) > 01/04/2010 21:00:00 6036 1128.75 .9999 (R based on obs 1~ 11) > 01/05/2010 7:00:00 6035.5 1128.5 > 01/05/2010 7:01:00 6037 1128.5 .9999 (R based on obs 12 ~13) > 01/05/2010 7:02:00 6037 1128.5 .9999 ( R based on obs 12 > ~14) > 01/05/2010 7:03:00 6038 1128.5 .9999 ( R based on obs 12 > ~15) > 01/05/2010 7:06:00 6038.5 1128.25 .9999 ( R based on obs 12 ~16) > 01/05/2010 20:56:00 6042 1131.5 .9999 ( R based on obs 12 > ~17) > 01/05/2010 20:57:00 6042.5 1132 .9999 ( R based on obs 12 ~18) > 01/05/2010 20:58:00 6042 1132 .9999 ( R based on obs > 12 ~19) > 01/05/2010 20:59:00 6041.5 1131.75 .9999 ( R based on obs 12 ~20) > 01/05/2010 21:00:00 6041.5 1132.25 .9999 ( R based on obs 12 ~21) You can use proc sql to construct the needed dataset first, which also creates a group var for later "by processing": data have1; set have; by date time; if first.date then n_=0; n_+1; run; proc sql; create table have2 as select a.date,a.stock1,a.stock2,'1 to '||put(b.n_,z2.) as grp from have1 a, have1 b where a.date=b.date and a.n_<=b.n_ order by a.date,grp,a.time ; quit; have2 looks like this: Obs date stock1 stock2 grp 1 01/04/2010 5989.5 1115.25 1 to 01 2 01/04/2010 5989.5 1115.25 1 to 02 3 01/04/2010 5998.5 1116.00 1 to 02 4 01/04/2010 5989.5 1115.25 1 to 03 5 01/04/2010 5998.5 1116.00 1 to 03 6 01/04/2010 5998.5 1116.00 1 to 03 7 01/04/2010 5989.5 1115.25 1 to 04 8 01/04/2010 5998.5 1116.00 1 to 04 9 01/04/2010 5998.5 1116.00 1 to 04 10 01/04/2010 5996.5 1115.75 1 to 04 11 01/04/2010 5989.5 1115.25 1 to 05 12 01/04/2010 5998.5 1116.00 1 to 05 13 01/04/2010 5998.5 1116.00 1 to 05 14 01/04/2010 5996.5 1115.75 1 to 05 15 01/04/2010 6000.0 1115.50 1 to 05 16 01/04/2010 5989.5 1115.25 1 to 06 17 01/04/2010 5998.5 1116.00 1 to 06 18 01/04/2010 5998.5 1116.00 1 to 06 19 01/04/2010 5996.5 1115.75 1 to 06 20 01/04/2010 6000.0 1115.50 1 to 06 21 01/04/2010 6042.0 1128.50 1 to 06 ..... With have2, you can use proc corr and "by processing" to get what you need: ods listing close; ods output PearsonCorr=need(where=(variable='stock2')); proc corr data=have2; by date grp; var stock1 stock2; run; ods listing; proc print label; var date grp stock1; label stock1='R'; run; Obs date grp R 1 01/04/2010 1 to 01 . 2 01/04/2010 1 to 02 1.00000 3 01/04/2010 1 to 03 1.00000 4 01/04/2010 1 to 04 0.99313 5 01/04/2010 1 to 05 0.67305 6 01/04/2010 1 to 06 0.98630 7 01/04/2010 1 to 07 0.99139 8 01/04/2010 1 to 08 0.99308 9 01/04/2010 1 to 09 0.99411 10 01/04/2010 1 to 10 0.99474 11 01/04/2010 1 to 11 0.99258 12 01/05/2010 1 to 01 . 13 01/05/2010 1 to 02 . 14 01/05/2010 1 to 03 . 15 01/05/2010 1 to 04 . 16 01/05/2010 1 to 05 -0.63134 17 01/05/2010 1 to 06 0.85891 18 01/05/2010 1 to 07 0.91995 19 01/05/2010 1 to 08 0.93372 20 01/05/2010 1 to 09 0.93834 21 01/05/2010 1 to 10 0.93508 HTH Ya
From: jwl on 14 Jun 2010 21:22
On Jun 14, 3:26 pm, Ya <huang8...(a)gmail.com> wrote: > On Jun 14, 1:38 pm, db <daronnebonn...(a)gmail.com> wrote: > > > > > > > Hi, I would like to get correlation coefficient that calculates obs > > 1~2, obs 1~3, obs 1~4, obs 1~5, obs 1~6 ... etc > > > Thanks, db > > > data input > > input date :mmddyy10. time :time8.0 stock1 stock2; > > format date:mmddyy10. time :time8.0; > > cards; > > 01/04/2010 7:00:00 5989.5 1115.25 > > 01/04/2010 7:01:00 5998.5 1116 > > 01/04/2010 7:01:00 5998.5 1116 > > 01/04/2010 7:02:00 5996.5 1115.75 > > 01/04/2010 7:05:00 6000 1115.5 > > 01/04/2010 20:55:00 6042 1128.5 > > 01/04/2010 20:56:00 6040.5 1128.5 > > 01/04/2010 20:57:00 6040.5 1128.75 > > 01/04/2010 20:58:00 6040.5 1128.5 > > 01/04/2010 20:59:00 6040.5 1128.5 > > 01/04/2010 21:00:00 6036 1128.75 > > 01/05/2010 7:00:00 6035.5 1128.5 > > 01/05/2010 7:01:00 6037 1128.5 > > 01/05/2010 7:02:00 6037 1128.5 > > 01/05/2010 7:03:00 6038 1128.5 > > 01/05/2010 7:06:00 6038.5 1128.25 > > 01/05/2010 20:56:00 6042 1131.5 > > 01/05/2010 20:57:00 6042.5 1132 > > 01/05/2010 20:58:00 6042 1132 > > 01/05/2010 20:59:00 6041.5 1131.75 > > 01/05/2010 21:00:00 6041.5 1132.25 > > ; > > run; > > > output > > date time stock1 stock2 R (by date) > > 01/04/2010 7:00:00 5989.5 1115.25 > > 01/04/2010 7:01:00 5998.5 1116 .9999 ( R based on obs 1~2 ) > > 01/04/2010 7:01:00 5998.5 1116 .9999 (R based on obs 1~3) > > 01/04/2010 7:02:00 5996.5 1115.75 .9999 (R based on obs 1~4) > > 01/04/2010 7:05:00 6000 1115.5 .9999 (R based on obs 1~5) > > 01/04/2010 20:55:00 6042 1128.5 .9999 (R based on obas 1~6) > > 01/04/2010 20:56:00 6040.5 1128.5 ..9999 (R based on obas 1~7) > > 01/04/2010 20:57:00 6040.5 1128.75 ..9999 (R based on obas 1~8) > > 01/04/2010 20:58:00 6040.5 1128.5 ..9999 (R based on obas 1~9) > > 01/04/2010 20:59:00 6040.5 1128.5 ..9999 (R based on obas 1~10) > > 01/04/2010 21:00:00 6036 1128.75 .9999 (R based on obs 1~ 11) > > 01/05/2010 7:00:00 6035.5 1128.5 > > 01/05/2010 7:01:00 6037 1128.5 .9999 (R based on obs 12 ~13) > > 01/05/2010 7:02:00 6037 1128.5 .9999 ( R based on obs 12 > > ~14) > > 01/05/2010 7:03:00 6038 1128.5 .9999 ( R based on obs 12 > > ~15) > > 01/05/2010 7:06:00 6038.5 1128.25 .9999 ( R based on obs 12 ~16) > > 01/05/2010 20:56:00 6042 1131.5 .9999 ( R based on obs 12 > > ~17) > > 01/05/2010 20:57:00 6042.5 1132 .9999 ( R based on obs 12 ~18) > > 01/05/2010 20:58:00 6042 1132 .9999 ( R based on obs > > 12 ~19) > > 01/05/2010 20:59:00 6041.5 1131.75 .9999 ( R based on obs 12 ~20) > > 01/05/2010 21:00:00 6041.5 1132.25 .9999 ( R based on obs 12 ~21) > > You can use proc sql to construct the needed dataset first, which also > creates a group var for later "by processing": > > data have1; > set have; > by date time; > if first.date then n_=0; > n_+1; > run; > > proc sql; > create table have2 as > select a.date,a.stock1,a.stock2,'1 to '||put(b.n_,z2.) as grp > from have1 a, have1 b > where a.date=b.date and a.n_<=b.n_ > order by a.date,grp,a.time > ; > quit; > > have2 looks like this: > > Obs date stock1 stock2 grp > > 1 01/04/2010 5989.5 1115.25 1 to 01 > 2 01/04/2010 5989.5 1115.25 1 to 02 > 3 01/04/2010 5998.5 1116.00 1 to 02 > 4 01/04/2010 5989.5 1115.25 1 to 03 > 5 01/04/2010 5998.5 1116.00 1 to 03 > 6 01/04/2010 5998.5 1116.00 1 to 03 > 7 01/04/2010 5989.5 1115.25 1 to 04 > 8 01/04/2010 5998.5 1116.00 1 to 04 > 9 01/04/2010 5998.5 1116.00 1 to 04 > 10 01/04/2010 5996.5 1115.75 1 to 04 > 11 01/04/2010 5989.5 1115.25 1 to 05 > 12 01/04/2010 5998.5 1116.00 1 to 05 > 13 01/04/2010 5998.5 1116.00 1 to 05 > 14 01/04/2010 5996.5 1115.75 1 to 05 > 15 01/04/2010 6000.0 1115.50 1 to 05 > 16 01/04/2010 5989.5 1115.25 1 to 06 > 17 01/04/2010 5998.5 1116.00 1 to 06 > 18 01/04/2010 5998.5 1116.00 1 to 06 > 19 01/04/2010 5996.5 1115.75 1 to 06 > 20 01/04/2010 6000.0 1115.50 1 to 06 > 21 01/04/2010 6042.0 1128.50 1 to 06 > .... > > With have2, you can use proc corr and "by processing" to get what you > need: > > ods listing close; > ods output PearsonCorr=need(where=(variable='stock2')); > > proc corr data=have2; > by date grp; > var stock1 stock2; > run; > > ods listing; > > proc print label; > var date grp stock1; > label stock1='R'; > run; > > Obs date grp R > > 1 01/04/2010 1 to 01 . > 2 01/04/2010 1 to 02 1.00000 > 3 01/04/2010 1 to 03 1.00000 > 4 01/04/2010 1 to 04 0.99313 > 5 01/04/2010 1 to 05 0.67305 > 6 01/04/2010 1 to 06 0.98630 > 7 01/04/2010 1 to 07 0.99139 > 8 01/04/2010 1 to 08 0.99308 > 9 01/04/2010 1 to 09 0.99411 > 10 01/04/2010 1 to 10 0.99474 > 11 01/04/2010 1 to 11 0.99258 > 12 01/05/2010 1 to 01 . > 13 01/05/2010 1 to 02 . > 14 01/05/2010 1 to 03 . > 15 01/05/2010 1 to 04 . > 16 01/05/2010 1 to 05 -0.63134 > 17 01/05/2010 1 to 06 0.85891 > 18 01/05/2010 1 to 07 0.91995 > 19 01/05/2010 1 to 08 0.93372 > 20 01/05/2010 1 to 09 0.93834 > 21 01/05/2010 1 to 10 0.93508 > > HTH > > Ya Hi Ya, Thank you so much for your logic! your logic works really good. I need to run this logic on the dataset with about 1 million observation. Is there a way to omit proc sql processing to calculate cumulative correlation coefficient ? Thanks, db |