Prev: 1)if and if-else 2)if and where 3)keep=option and keep statement4)input function and input statement 5)scan and substr6)if and if then do7)informat and format i had good knowledge about those difference with examples.
Next: Spectral Analysis
From: ChrisG on 22 Feb 2010 11:06 Hi everyone Here is my tricky sum problem I have such a dataset id_bb d0 d1 d2 d3 1 10 0 0 0 1 8 5 0 0 1 6 10 6 0 1 4 7 4 3 2 8 0 0 0 2 4 9 0 0 2 7 8 5 0 2 10 3 10 6 id_bb = identifier, then some numeric variables d0-d3 in reality i have more than 50000 lines, more than 3000 unique identifiers and it goes from d0 until d125 ! now my dream is to have this: id_bb d0 d1 d2 d3 x 1 10 0 0 0 28 1 8 5 0 0 22 1 6 10 6 0 10 1 4 7 4 3 3 2 8 0 0 0 29 2 4 9 0 0 20 2 7 8 5 0 15 2 10 3 10 6 6 this x variable is the trick sum i need and it goes like this: for the first id_bb: x in the first line = sum(d0), i.e. 10+8+6+4 x in the second line = sum(d1), i.e. 0+5+10+7 so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 then again for the second id_bb: x in the first line = sum(d0), ie. x = 8+4+7+10 and so on until the last variable, ie. d3 and at the end, in reality, this should happen for every id_bb (more than 3000 unique id_bb) and for the d0 until d125... i have no clue how to do it in SAS... in Excel this will surely be a pain as well and i don't know how to use VBA... So if someone could help me out with this, i would be very greatful ! thank you in advance Best CG
From: Jack Clark on 22 Feb 2010 14:18 Chris, If I understand the question correctly, this means that each ID_BB value in your data has exactly 126 observations? This is because you will need 126 observations per ID to hold the sums of D0-D125. Is that correct? This solution includes the creation of a new variable called SUM_VAR which has the value of the variable name being summed as the value of X. You can drop it at the end if you don't want it. The data are summarized by ID_BB to get the sum of each numeric variable. That summarized data is transposed and then merged to the original data by ID_BB and SUM_VAR. Let me know if this meets your needs. *(using your sample data); data have; infile cards missover; input id_bb d0 d1 d2 d3; cards; 1 10 0 0 0 1 8 5 0 0 1 6 10 6 0 1 4 7 4 3 2 8 0 0 0 2 4 9 0 0 2 7 8 5 0 2 10 3 10 6 ; run; * create SUM_VAR containing variable name of summed variable ; data next (drop=arr_cnt); set have; by id_bb; array dvars(*) d0-d3; if first.id_bb then arr_cnt = 1; sum_var = vname(dvars(arr_cnt)); arr_cnt + 1; run; proc print data = next; run; * sum numeric fields by ID_BB ; proc summary data = have nway missing; class id_bb; var d0-d3; output out = have_sum (drop=_type_ _freq_) sum=; run; proc print data = have_sum; run; * transpose summed data - rename fields ; proc transpose data = have_sum out = have_sumt (rename=(_name_=sum_var col1=x)); by id_bb; run; proc print data = have_sumt; run; * merge summed data to original data ; data need; merge next have_sumt; by id_bb sum_var; run; proc print data = need; run; Jack Jack Clark Senior Research Analyst phone: 410-455-6256 fax: 410-455-6850 jclark(a)hilltop.umbc.edu University of Maryland, Baltimore County Sondheim Hall, 3rd Floor 1000 Hilltop Circle Baltimore, MD 21250 Confidentiality Notice: This e-mail may contain information that is legally privileged and that is intended only for the use of the addressee(s) named above. If you are not the intended recipient, you are hereby notified that any disclosure, copying of this e-mail, distribution, or action taken in reliance on the contents of this e-mail and/or documents attributed to this e-mail is strictly prohibited. If you have received this information in error, please notify the sender immediately by phone and delete this entire e-mail. Thank you.-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of ChrisG Sent: Monday, February 22, 2010 11:07 AM To: SAS-L(a)LISTSERV.UGA.EDU Subject: tricky sum Hi everyone Here is my tricky sum problem I have such a dataset id_bb d0 d1 d2 d3 1 10 0 0 0 1 8 5 0 0 1 6 10 6 0 1 4 7 4 3 2 8 0 0 0 2 4 9 0 0 2 7 8 5 0 2 10 3 10 6 id_bb = identifier, then some numeric variables d0-d3 in reality i have more than 50000 lines, more than 3000 unique identifiers and it goes from d0 until d125 ! now my dream is to have this: id_bb d0 d1 d2 d3 x 1 10 0 0 0 28 1 8 5 0 0 22 1 6 10 6 0 10 1 4 7 4 3 3 2 8 0 0 0 29 2 4 9 0 0 20 2 7 8 5 0 15 2 10 3 10 6 6 this x variable is the trick sum i need and it goes like this: for the first id_bb: x in the first line = sum(d0), i.e. 10+8+6+4 x in the second line = sum(d1), i.e. 0+5+10+7 so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 then again for the second id_bb: x in the first line = sum(d0), ie. x = 8+4+7+10 and so on until the last variable, ie. d3 and at the end, in reality, this should happen for every id_bb (more than 3000 unique id_bb) and for the d0 until d125... i have no clue how to do it in SAS... in Excel this will surely be a pain as well and i don't know how to use VBA... So if someone could help me out with this, i would be very greatful ! thank you in advance Best CG
From: NordlDJ on 22 Feb 2010 14:47 > -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of > ChrisG > Sent: Monday, February 22, 2010 8:07 AM > To: SAS-L(a)LISTSERV.UGA.EDU > Subject: tricky sum > > Hi everyone > > Here is my tricky sum problem > I have such a dataset > > id_bb d0 d1 d2 d3 > 1 10 0 0 0 > 1 8 5 0 0 > 1 6 10 6 0 > 1 4 7 4 3 > 2 8 0 0 0 > 2 4 9 0 0 > 2 7 8 5 0 > 2 10 3 10 6 > > id_bb = identifier, then some numeric variables d0-d3 > in reality i have more than 50000 lines, more than 3000 unique > identifiers and it goes from d0 until d125 ! > > now my dream is to have this: > > id_bb d0 d1 d2 d3 x > 1 10 0 0 0 28 > 1 8 5 0 0 22 > 1 6 10 6 0 10 > 1 4 7 4 3 3 > 2 8 0 0 0 29 > 2 4 9 0 0 20 > 2 7 8 5 0 15 > 2 10 3 10 6 6 > > this x variable is the trick sum i need > and it goes like this: > for the first id_bb: > x in the first line = sum(d0), i.e. 10+8+6+4 > x in the second line = sum(d1), i.e. 0+5+10+7 > so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 > then again for the second id_bb: > x in the first line = sum(d0), ie. x = 8+4+7+10 > and so on until the last variable, ie. d3 > > and at the end, in reality, this should happen for every id_bb (more > than 3000 unique id_bb) and for the d0 until d125... > > i have no clue how to do it in SAS... > in Excel this will surely be a pain as well and i don't know how to > use VBA... > > So if someone could help me out with this, i would be very greatful ! > thank you in advance > Best > CG Chris, I am not sure if I understand your wanted solution. You say in your real data you have variables d0-d125 that you want to sum up and then append to consecutive observations by id_bb. That will only work if you exactly 126 observations for each ID. If that is what you have then the following code will work. Change the indices of 3 to 125 for your data. data have; infile cards; input id_bb d0 d1 d2 d3; cards; 1 10 0 0 0 1 8 5 0 0 1 6 10 6 0 1 4 7 4 3 2 8 0 0 0 2 4 9 0 0 2 7 8 5 0 2 10 3 10 6 ; run; data want; do until(last.id_bb); set have; by id_bb; array d_a[0:3] d0-d3; array sum_a[0:3] s0-s3; do i=0 to 3; sum_a[i] = sum(0,sum_a[i],d_a[i]); end; end; do i=0 to 3; set have; x = sum_a[i]; output; end; run; proc print; run; I you don't have 126 observations per ID then you will need to write back to SAS-L and give us more detail on exactly what you want. Hope this is helpful, Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
From: Mike Zdeb on 22 Feb 2010 14:32 hi ... if you can live with a MERGE with no BY variable, this should work since the data set TSTATS should (will) be in the correct order after PROC TRANSPOSE you can scale it up for your data (VAR D0-D125 in PROC SUMMARy) ... data x; input id_bb d0-d3; datalines; 1 10 0 0 0 1 8 5 0 0 1 6 10 6 0 1 4 7 4 3 2 8 0 0 0 2 4 9 0 0 2 7 8 5 0 2 10 3 10 6 ; run; proc summary data=x nway; var d0-d3; class id_bb; output out=stats (drop=_:) sum=; run; proc transpose data=stats out=tstats (keep=col1 rename=(col1=x)); by id_bb; run; options mergenoby=warn; data x; merge x tstats; run; options mergenoby=error; proc print data=x; run; output ... Obs ID_BB D0 D1 D2 D3 X 1 1 10 0 0 0 28 2 1 8 5 0 0 22 3 1 6 10 6 0 10 4 1 4 7 4 3 3 5 2 8 0 0 0 29 6 2 4 9 0 0 20 7 2 7 8 5 0 15 8 2 10 3 10 6 6 -- Mike Zdeb U(a)Albany School of Public Health One University Place Rensselaer, New York 12144-3456 P/518-402-6479 F/630-604-1475 > Hi everyone > > Here is my tricky sum problem > I have such a dataset > > id_bb d0 d1 d2 d3 > 1 10 0 0 0 > 1 8 5 0 0 > 1 6 10 6 0 > 1 4 7 4 3 > 2 8 0 0 0 > 2 4 9 0 0 > 2 7 8 5 0 > 2 10 3 10 6 > > id_bb = identifier, then some numeric variables d0-d3 > in reality i have more than 50000 lines, more than 3000 unique > identifiers and it goes from d0 until d125 ! > > now my dream is to have this: > > id_bb d0 d1 d2 d3 x > 1 10 0 0 0 28 > 1 8 5 0 0 22 > 1 6 10 6 0 10 > 1 4 7 4 3 3 > 2 8 0 0 0 29 > 2 4 9 0 0 20 > 2 7 8 5 0 15 > 2 10 3 10 6 6 > > this x variable is the trick sum i need > and it goes like this: > for the first id_bb: > x in the first line = sum(d0), i.e. 10+8+6+4 > x in the second line = sum(d1), i.e. 0+5+10+7 > so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 > then again for the second id_bb: > x in the first line = sum(d0), ie. x = 8+4+7+10 > and so on until the last variable, ie. d3 > > and at the end, in reality, this should happen for every id_bb (more > than 3000 unique id_bb) and for the d0 until d125... > > i have no clue how to do it in SAS... > in Excel this will surely be a pain as well and i don't know how to > use VBA... > > So if someone could help me out with this, i would be very greatful ! > thank you in advance > Best > CG >
From: ChrisG on 22 Feb 2010 15:42
On 22 fév, 20:47, Nord...(a)DSHS.WA.GOV ("Nordlund, Dan (DSHS/RDA)") wrote: > > -----Original Message----- > > From: SAS(r) Discussion [mailto:SA...(a)LISTSERV.UGA.EDU] On Behalf Of > > ChrisG > > Sent: Monday, February 22, 2010 8:07 AM > > To: SA...(a)LISTSERV.UGA.EDU > > Subject: tricky sum > > > Hi everyone > > > Here is my tricky sum problem > > I have such a dataset > > > id_bb d0 d1 d2 d3 > > 1 10 0 0 0 > > 1 8 5 0 0 > > 1 6 10 6 0 > > 1 4 7 4 3 > > 2 8 0 0 0 > > 2 4 9 0 0 > > 2 7 8 5 0 > > 2 10 3 10 6 > > > id_bb = identifier, then some numeric variables d0-d3 > > in reality i have more than 50000 lines, more than 3000 unique > > identifiers and it goes from d0 until d125 ! > > > now my dream is to have this: > > > id_bb d0 d1 d2 d3 x > > 1 10 0 0 0 28 > > 1 8 5 0 0 22 > > 1 6 10 6 0 10 > > 1 4 7 4 3 3 > > 2 8 0 0 0 29 > > 2 4 9 0 0 20 > > 2 7 8 5 0 15 > > 2 10 3 10 6 6 > > > this x variable is the trick sum i need > > and it goes like this: > > for the first id_bb: > > x in the first line = sum(d0), i.e. 10+8+6+4 > > x in the second line = sum(d1), i.e. 0+5+10+7 > > so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 > > then again for the second id_bb: > > x in the first line = sum(d0), ie. x = 8+4+7+10 > > and so on until the last variable, ie. d3 > > > and at the end, in reality, this should happen for every id_bb (more > > than 3000 unique id_bb) and for the d0 until d125... > > > i have no clue how to do it in SAS... > > in Excel this will surely be a pain as well and i don't know how to > > use VBA... > > > So if someone could help me out with this, i would be very greatful ! > > thank you in advance > > Best > > CG > > Chris, > > I am not sure if I understand your wanted solution. You say in your real data you have variables d0-d125 that you want to sum up and then append to consecutive observations by id_bb. That will only work if you exactly 126 observations for each ID. If that is what you have then the following code will work. Change the indices of 3 to 125 for your data. > > data have; > infile cards; > input id_bb d0 d1 d2 d3; > cards; > 1 10 0 0 0 > 1 8 5 0 0 > 1 6 10 6 0 > 1 4 7 4 3 > 2 8 0 0 0 > 2 4 9 0 0 > 2 7 8 5 0 > 2 10 3 10 6 > ; > run; > data want; > do until(last.id_bb); > set have; > by id_bb; > array d_a[0:3] d0-d3; > array sum_a[0:3] s0-s3; > > do i=0 to 3; > sum_a[i] = sum(0,sum_a[i],d_a[i]); > end; > end; > do i=0 to 3; > set have; > x = sum_a[i]; > output; > end; > run; > proc print; > run; > > I you don't have 126 observations per ID then you will need to write back to SAS-L and give us more detail on exactly what you want. Hope this is helpful, > > Dan > > Daniel J. Nordlund > Washington State Department of Social and Health Services > Planning, Performance, and Accountability > Research and Data Analysis Division > Olympia, WA 98504-5204 Hello everybody thanks a lot for your help ! unfortunately, the real data may have less than 126 obs. per id... but the code by Mike Zdeb works perfectly, so i found what i was looking for thanks to all of you and thank you to Mike for his code Best CG |