Prev: SAS - proc nlmixed & Cholesky Decomposition
Next: BI Publisher Consultants - Woodland Hills CA - 10+ months
From: billyiqing on 6 Aug 2010 05:47 Hi Patrick The data looks something like this: Ticker Date P 1 01Jan2010 5 1 02Jan2010 5 1 03Jan2010 6 1 04Jan2010 6.5 1 05Jan2010 6 2 01Jan2010 12 2 02Jan2010 12.4 2 03Jan2010 12.3 2 04Jan2010 13 2 05Jan2010 12.8 2 05Jan2010 14 And I want something like this: Ticker Date P High 1 01Jan2010 5 5 1 02Jan2010 5 5 1 03Jan2010 6 6 1 04Jan2010 6.5 6.5 1 05Jan2010 6 6.5 2 01Jan2010 12 12 2 02Jan2010 12.4 12.4 2 03Jan2010 12.3 12.4 2 04Jan2010 13 13 2 05Jan2010 12.8 13 2 05Jan2010 14 14 The result data needs to contain ALL observations in the original dataset, with an additional column containing the past 52-week high for each observation. Each ticker would have the stock price for the stock's entire period, therefore the the Ticker-Date is unique. But neither the ticker itself nor the date itself is unique. The date spans across several years - sometimes really long periods. Thank you SO much for your help!!!! On Aug 6, 7:38 pm, Patrick <patrick.mat...(a)gmx.ch> wrote: > Hi Bill > > Just to clarify before I'm coming up with something - please correct > the following statements if wrong: > > - The result data set contains all the observations from the source > data set which fall into the date range of the last 52 weeks. > - The combination of ticker and date is not unique. > - The variable 'high' (max. price) contains the max value of > 'p' (price) for each combination of ticker and date. > > i.e. Have > > Ticker Date P > 1 01jan2010 5 > 1 01jan2010 4 > 1 01jan2010 1 > 1 02jan2010 6 > 1 02jan2010 9 > 1 02jan2010 3 > > Want: > Ticker Date High > 1 01jan2010 5 > 1 01jan2010 5 > 1 01jan2010 5 > 1 02jan2010 9 > 1 02jan2010 9 > 1 02jan2010 9 > > Thanks > Patrick
From: Patrick on 6 Aug 2010 10:34 Hi Bill Below code which (I think) solves the problem and should perform reasonably well. Everything else I can think of right now wouldn't add much performance gain but complicate the code a lot. If you're not familiar with the hash object: Besides of the SASDoc there are also a few good white papers around. I find also this quick reference very helpful: http://support.sas.com/rnd/base/datastep/dot/hash-tip-sheet.pdf data have; infile datalines delimiter=' '; input Ticker Date:date9. P; format date date9.; datalines; 1 01May2008 50 1 01Feb2009 3 1 01Mar2009 2 1 01Apr2009 4 1 01May2009 3 1 01Jun2009 1 1 01Jul2009 3 1 01Aug2009 4 1 01Jan2010 5 1 02Jan2010 5 1 03Jan2010 6 1 04Jan2010 6.5 1 05Jan2010 6 2 01Jan2010 12 2 02Jan2010 12.4 2 03Jan2010 12.3 2 04Jan2010 13 2 05Jan2010 12.8 2 05Jan2010 14 ; run; /* number of weeks to look at */ %let J=52; /* select all obs within date range and order result set by ticker and date */ proc sql; create view Vhave_Sorted as select * from have order by ticker,date ; quit; data want(keep=Ticker Date p high); set Vhave_Sorted; by ticker date; ThisDate=date; /* declare hash table for look up */ if _n_=1 then do; declare hash h (); rc=h.defineKey('ticker','ThisDate'); rc=h.defineData('p'); rc=h.defineDone(); end; /* clear all entries in hash table if a new ticker starts */ if first.ticker then do; rc=rc = h.clear(); end; /* add keys and data as in current record as defined to hash table */ rc=h.add(); /* lookup data in hash table and keep the max data value in the last 52 weeks */ /* starting from the current date value */ do ThisDate=(date- 7*&j) to date; rc=h.find(); high=max(high,p); end; run; proc print data=want noobs; run; I was not 100% sure how exactly the date range has to be defined. You might have to amend it to suit your needs. What I did is just look back at the last 7*52 days: -> do ThisDate=(date- 7*&j) to date; .....or should it be 7*52+1 - or is it the last 52 weeks before the date in current week (you would need intnx() for this). I'm sure you'll manage to sort this out ;-) Let me know if the code worked for you. HTH Patrick
From: Patrick on 6 Aug 2010 10:37 Hi Andrew Could you provide a code example which solves Bill's problem (he gave us now some sample data)? I'm really curious how this could be done with PROC EXPAND only. Thanks Patrick
From: Patrick on 6 Aug 2010 11:17 Hi Bill And last but not least (seem I got inspired...): In case that you have a lot of RAM (>4GB addressable) and you don't need your result data set sorted by ticker and date then loading the full table into a hash for lookup would avoid the necessity for sorting (which costs a lot of resources and time). data have; infile datalines delimiter=' '; input Ticker Date:date9. P; format date date9.; datalines; 1 01Feb2009 3 1 01Mar2009 2 1 01Apr2009 4 1 01May2009 3 1 01Jun2009 1 1 01Jul2009 3 1 01Aug2009 4 1 01Jan2010 5 1 01May2008 50 1 02Jan2010 5 1 03Jan2010 6 1 04Jan2010 6.5 1 05Jan2010 6 2 01Jan2010 12 2 02Jan2010 12.4 2 03Jan2010 12.3 2 04Jan2010 13 2 05Jan2010 12.8 2 05Jan2010 14 ; run; /* number of weeks to look at */ %let J=52; data want(keep=Ticker Date p high); set have; /* declare hash table for look up */ if _n_=1 then do; declare hash h (dataset:'work.have'); rc=h.defineKey('ticker','Date'); rc=h.defineData('p'); rc=h.defineDone(); end; /* lookup data in hash table and keep the max data value in the last 52 weeks */ /* starting from the current date value */ ThisDate=date; do date=(ThisDate- 7*&j) to date; rc=h.find(); high=max(high,p); end; date=ThisDate; run; proc print data=want noobs; run; If you want to use this code then you have to pre-define the size of the hash table - which can be tricky. The default is very likely to small and not efficient. Just two links dealing with this hashexp question: http://books.google.com.au/books?id=PplQOP__uhcC&pg=PA397&lpg=PA397&dq=sas+hashexp+calculate&source=bl&ots=u3MMo0v2g1&sig=igTYtfGOKMDMm7EEksxHBuu-La4&hl=en&ei=kiJcTMnIOZKecdvbjPsB&sa=X&oi=book_result&ct=result&resnum=6&ved=0CDEQ6AEwBQ#v=onepage&q&f=false http://support.sas.com/kb/34/193.html I'm guilty of not having defined the hash table size in the code I provided. For the previous example the hash table stays quite small (less than 3000 records for 7.5 years and one ticker). It might be worth to define a hashexp of 2 only (not sure about this - that's why I just ignored it). HTH Patrick
First
|
Prev
|
Pages: 1 2 Prev: SAS - proc nlmixed & Cholesky Decomposition Next: BI Publisher Consultants - Woodland Hills CA - 10+ months |