Prev: Lab Tables
Next: Need help with %qscan and %scan
From: hs AT dc-sug DOT org on 24 Dec 2008 14:42 See http://www.sascommunity.org/wiki/Rolling_Calculations On Wed, 24 Dec 2008 13:11:16 -0600, ST <sheepsas(a)GMAIL.COM> wrote: >Hi there, > >I have a dataset which has 10000 observations. For each obs, I want to >create a regression model (based on variable x and y in the dataset) using >previous 20 observations and save the estimated slope as a new variable in >the dataset. Totally, I will have 9980 slopes. > >Can someone show me how to do this efficiently? > >Many thanks, > >Hu
From: Muthia Kachirayan on 24 Dec 2008 18:42 On Wed, Dec 24, 2008 at 2:11 PM, ST <sheepsas(a)gmail.com> wrote: > Hi there, > > I have a dataset which has 10000 observations. For each obs, I want to > create a regression model (based on variable x and y in the dataset) using > previous 20 observations and save the estimated slope as a new variable in > the dataset. Totally, I will have 9980 slopes. > > Can someone show me how to do this efficiently? > > Many thanks, > > Hu > A most efficient way to do this MOVING REGRESSION lies in not using the Proc REG. Hold x and y for a size of WSIZE at a time, sliding up by removing one at the left and adding one at the right. The slope is detemined by (Sum(XY) - Sum(X) * SUM(Y) / WSIZE) / (SUM(X * X) - SUM(X) ** 2 / WSIZE) If y-intercept is also needed, the following statement can be included. y_intercept = sumy / &wsize - slope * sumx / &wsize; The following code is an adaption of Howard who posted in SAS-L to find moving Standard Deviations. %let wsize=7; data c(keep = x y slope); array xx(&wsize) (&wsize * 0); array yy(&wsize) (&wsize * 0); retain xx : yy :; x2remove = xx(mod(_n_ - 1, &wsize) + 1); y2remove = yy(mod(_n_ - 1, &wsize) + 1); set sashelp.class(keep = weight height rename = (weight = y height = x)) ; xx(mod(_n_ - 1, &wsize) + 1) = x; yy(mod(_n_ - 1, &wsize) + 1) = y; sumx + x - x2remove ; sumxsq + x**2 - x2remove**2; sumy + y - y2remove; sumxy + x * y - x2remove * y2remove; if _n_ >= &wsize then slope = (sumxy - sumx * sumy / &wsize)/(sumxsq - sumx ** 2 / &wsize); run; Happy Holidays to All. Muthia Kachirayan
From: Bill West on 25 Dec 2008 10:32 I see you've received several suggested methods and I may be showing my ignorance but I'm not sure what you mean by “ save the estimated slope as a new variable. if you mean a predicted value I believe the regout dataset below will include a predicted value for each obs. Is that what you want? data expanded.; set file; if 1< =_N_ <=20 then span=1; if 21< =_N_ <=30 then span=2; if 31< =_N_ <=40 then span=3; etc up to 500; proc reg data=expanded noprint outest=regout ; by span; model x =y;
From: Bill West on 25 Dec 2008 11:08 On Wed, 24 Dec 2008 18:42:30 -0500, Muthia Kachirayan <muthia.kachirayan(a)GMAIL.COM> wrote: >On Wed, Dec 24, 2008 at 2:11 PM, ST <sheepsas(a)gmail.com> wrote: > >> Hi there, >> >> I have a dataset which has 10000 observations. For each obs, I want to >> create a regression model (based on variable x and y in the dataset) using >> previous 20 observations and save the estimated slope as a new variable in >> the dataset. Totally, I will have 9980 slopes. >> >> Can someone show me how to do this efficiently? >> >> Many thanks, >> >> Hu >> > Hi Hu , I see you've received several suggested methods and I may be showing my ignorance but I'm not sure what you mean by “ save the estimated slope as a new variable. if you mean a predicted value I believe the regout dataset below will include a predicted value for each obs. Is that what you want? data expanded.; set file; if 1< =_N_ <=20 then span=1; if 21< =_N_ <=30 then span=2; if 31< =_N_ <=40 then span=3; etc up to 500; proc reg data=expanded noprint outest=regout ; by span; model x =y;
From: Muthia Kachirayan on 25 Dec 2008 12:06 On Thu, Dec 25, 2008 at 11:08 AM, Bill West <wm_a_west(a)yahoo.com> wrote: > On Wed, 24 Dec 2008 18:42:30 -0500, Muthia Kachirayan > <muthia.kachirayan(a)GMAIL.COM> wrote: > > >On Wed, Dec 24, 2008 at 2:11 PM, ST <sheepsas(a)gmail.com> wrote: > > > >> Hi there, > >> > >> I have a dataset which has 10000 observations. For each obs, I want to > >> create a regression model (based on variable x and y in the dataset) > using > >> previous 20 observations and save the estimated slope as a new variabl= e > in > >> the dataset. Totally, I will have 9980 slopes. > >> > >> Can someone show me how to do this efficiently? > >> > >> Many thanks, > >> > >> Hu > >> > > > Hi Hu , > I see you've received several suggested methods and I may be showing my > ignorance but I'm not sure what you mean by =E2=80=9C save the estimated = slope as > a new variable. > if you mean a predicted value > I believe the regout dataset below will include a predicted value for > each obs. Is that what you want? > > > data expanded.; > set file; > if 1< =3D_N_ <=3D20 then span=3D1; > if 21< =3D_N_ <=3D30 then span=3D2; > if 31< =3D_N_ <=3D40 then span=3D3; > etc up to 500; > > proc reg data=3Dexpanded noprint outest=3Dregout ; > by span; > model x =3Dy; > The OP wanted the slope for each of the set of obsevations (_N_ ) between 1 to 20, 2 to 21, 3 to 22, 4 to 23, 10,081 to 10,000. This is similar to finding Moving Averages, having overlapping observations= , leaving one at the left and adding one to the right . Thus there will be 9980 slopes(10,000 - 20). The use of if 1< =3D_N_ <=3D20 then span=3D1; if 21< =3D_N_ <=3D30 then span=3D2; if 31< =3D_N_ <=3D40 then span=3D3; etc up to 500; will result in slopes for segmented observations and essentially not a moving average slope. My code does the moving average slope by computing 4 sums (Sum(X), Sum(Y), SUM(X * X) and Sum(X * Y) on a sliding scale. Hope this clarifies. Kind regards, Muthia Kachirayan
|
Pages: 1 Prev: Lab Tables Next: Need help with %qscan and %scan |