From: Patrick on 9 Nov 2009 01:53

data have;
  input id $ date:MMDDYY12. price;
  format date date9.;
  if lag(id) ne id or lag(price) ne price then group_id+1;
  cards;
a 1/1/2009 3
a 1/2/2009 3
a 1/3/2009 3
a 1/4/2009 0.30
a 1/5/2009 0.29
a 1/6/2009 3
a 1/7/2009 3
a 1/8/2009 0.62
a 1/9/2009 0.84
a 1/10/2009 0.33
b 1/1/2009 0.48
b 1/2/2009 0.09
b 1/3/2009 0.67
b 1/4/2009 0.91
b 1/5/2009 4
b 1/6/2009 4
b 1/7/2009 4
b 1/8/2009 4
b 1/9/2009 0.66
b 1/10/2009 0.18
c 1/1/2009 5
c 1/2/2009 5
c 1/3/2009 0.30
c 1/4/2009 0.78
c 1/5/2009 0.08
c 1/6/2009 5
c 1/7/2009 5
c 1/8/2009 5
c 1/9/2009 5
c 1/10/2009 5
;

proc sql;
  select *
  from have
  where group_id in
    (
      select group_id
      from have
      group by id,group_id
      having count(group_id)<3
    )
  ;
quit;

HTH
Patrick
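A note on the LAG calls in the step above: LAG returns values from a queue that is updated only when the function actually executes, so a common precaution is to call it once per observation, unconditionally, and compare against the stored results. The following is a minimal sketch of that variant, not Patrick's posted code. It assumes the raw data already sit in a data set named mydata (the name used by the original poster) with numeric date and price; prev_id and prev_price are hypothetical helper variables introduced only for illustration.

```sas
data have;
  set mydata;
  /* Call LAG unconditionally so each queue is updated on every observation. */
  prev_id    = lag(id);
  prev_price = lag(price);
  /* Start a new group on the first row and whenever id or price changes. */
  if _n_ = 1 or prev_id ne id or prev_price ne price then group_id + 1;
  /* The helper variables are not needed in the output. */
  drop prev_id prev_price;
run;
```

The PROC SQL step from the post can then be run against this version of have without change.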
From: Ruslan Kirhizau on 9 Nov 2009 02:17

Hi,

The lines below propose at least an idea for solving the problem.

data have;
  set mydata;
  by id price notsorted;
  if first.price then group + 1; /* create a group number for repeated consecutive price */
run;

proc sql;
  create table final (drop = group) as
  select *
  from have
  group by id, group
  having count(group) lt 3
  order by id, date, price;
quit;

Best of luck,

Ruslan

> Date: Sun, 8 Nov 2009 23:19:40 -0500
> From: haigang.zhou(a)GMAIL.COM
> Subject: Delete adjacent obs with repeated info
> To: SAS-L(a)LISTSERV.UGA.EDU
>
> I have a dataset of daily stock prices with three variables: ID, date, and
> price.
>
> For some reason, the dataset reports the same stock price for some adjacent
> days. I want to write a program that deletes those days once the number of
> adjacent days with the same price reaches certain threshold, say 3. For
> example, the first three days for "a" should be deleted, but not the 6th and
> 7th days. Similarly, the last 5 days for stock "c" should be deleted but not
> the first 2 days.
>
> Can someone help me write the code? Many thanks.
>
> data mydata;
> input id $ date MMDDYY12. price;
> cards;
> a 1/1/2009 3
> a 1/2/2009 3
> a 1/3/2009 3
> a 1/4/2009 0.30
> a 1/5/2009 0.29
> a 1/6/2009 3
> a 1/7/2009 3
> a 1/8/2009 0.62
> a 1/9/2009 0.84
> a 1/10/2009 0.33
> b 1/1/2009 0.48
> b 1/2/2009 0.09
> b 1/3/2009 0.67
> b 1/4/2009 0.91
> b 1/5/2009 4
> b 1/6/2009 4
> b 1/7/2009 4
> b 1/8/2009 4
> b 1/9/2009 0.66
> b 1/10/2009 0.18
> c 1/1/2009 5
> c 1/2/2009 5
> c 1/3/2009 0.30
> c 1/4/2009 0.78
> c 1/5/2009 0.08
> c 1/6/2009 5
> c 1/7/2009 5
> c 1/8/2009 5
> c 1/9/2009 5
> c 1/10/2009 5
> ;
> run;
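The BY ... NOTSORTED grouping above also lends itself to a single DATA step without PROC SQL. The following is only a sketch of that alternative (a "double DO loop" over each run), not something posted in the thread. It assumes mydata holds numeric date and price values; run_length is a hypothetical counter variable, and the cutoff of 3 is the example threshold from the question.

```sas
data want;
  /* Pass 1: read one run of equal consecutive prices and count its rows. */
  do run_length = 1 by 1 until (last.price);
    set mydata;
    by id price notsorted;
  end;
  /* Pass 2: re-read the same rows; output them only if the run is short. */
  do _n_ = 1 to run_length;
    set mydata;
    if run_length < 3 then output;
  end;
  drop run_length;
run;
```

Each SET statement keeps its own read position, so the second loop revisits exactly the rows counted by the first, and the output stays in the original order.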
From: Fernández Rodríguez, on 9 Nov 2009 04:42

Hi Haigang,

I use SAS to 'play' with stock price datasets, but I don't understand what you want to get: you want to delete the first x rows for 'A' shares and, at the same time, the last x rows for 'C' shares. Could you explain in more detail the conditions for rejecting those rows, please?

Daniel Fernandez
Barcelona
From: Haigang Zhou on 9 Nov 2009 08:29

Hi Barcelona,

I want to delete the first 3 rows of A, the middle four rows of B, and the last five rows of C. The condition for deleting them is that three or more adjacent days report the same stock price.

Many thanks.

Haigang
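To check a deletion rule like this before removing anything, it can help to list every run of identical consecutive prices together with its length. The following is a hedged sketch along those lines: the macro variable threshold, the data set runs, and the columns run_id, run_length and action are all names assumed for illustration, and mydata is assumed to contain numeric date and price values.

```sas
%let threshold = 3;  /* assumed cutoff: runs this long or longer are flagged */

data runs;
  set mydata;
  by id price notsorted;
  /* Number each run of identical consecutive prices within an id. */
  if first.price then run_id + 1;
run;

proc sql;
  /* One row per run: its date range, price, length, and the proposed action. */
  select id,
         min(date) as first_date format=date9.,
         max(date) as last_date  format=date9.,
         min(price) as price,
         count(*)   as run_length,
         case when count(*) >= &threshold then 'delete' else 'keep' end as action
  from runs
  group by id, run_id
  order by id, first_date;
quit;
```

With the sample data from the question, the runs flagged for deletion should correspond to the rows described above: the first three rows of a, the middle four of b, and the last five of c.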
From: Haigang Zhou on 9 Nov 2009 08:33
Ruslan,

Many thanks for the code. It serves my purpose very well.

Haigang