Prev: 1)if and if-else 2)if and where 3)keep=option and keep statement4)input function and input statement 5)scan and substr6)if and if then do7)informat and format i had good knowledge about those difference with examples.
Next: Spectral Analysis
From: NordlDJ on 22 Feb 2010 17:11 > -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of > ChrisG > Sent: Monday, February 22, 2010 12:42 PM > To: SAS-L(a)LISTSERV.UGA.EDU > Subject: Re: tricky sum > > On 22 f�v, 20:47, Nord...(a)DSHS.WA.GOV ("Nordlund, Dan (DSHS/RDA)") > wrote: > > > -----Original Message----- > > > From: SAS(r) Discussion [mailto:SA...(a)LISTSERV.UGA.EDU] On Behalf Of > > > ChrisG > > > Sent: Monday, February 22, 2010 8:07 AM > > > To: SA...(a)LISTSERV.UGA.EDU > > > Subject: tricky sum > > > > > Hi everyone > > > > > Here is my tricky sum problem > > > I have such a dataset > > > > > id_bb d0 d1 d2 d3 > > > 1 10 0 0 0 > > > 1 8 5 0 0 > > > 1 6 10 6 0 > > > 1 4 7 4 3 > > > 2 8 0 0 0 > > > 2 4 9 0 0 > > > 2 7 8 5 0 > > > 2 10 3 10 6 > > > > > id_bb = identifier, then some numeric variables d0-d3 > > > in reality i have more than 50000 lines, more than 3000 unique > > > identifiers and it goes from d0 until d125 ! > > > > > now my dream is to have this: > > > > > id_bb d0 d1 d2 d3 x > > > 1 10 0 0 0 28 > > > 1 8 5 0 0 22 > > > 1 6 10 6 0 10 > > > 1 4 7 4 3 3 > > > 2 8 0 0 0 29 > > > 2 4 9 0 0 20 > > > 2 7 8 5 0 15 > > > 2 10 3 10 6 6 > > > > > this x variable is the trick sum i need > > > and it goes like this: > > > for the first id_bb: > > > x in the first line = sum(d0), i.e. 10+8+6+4 > > > x in the second line = sum(d1), i.e. 0+5+10+7 > > > so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 > > > then again for the second id_bb: > > > x in the first line = sum(d0), ie. x = 8+4+7+10 > > > and so on until the last variable, ie. d3 > > > > > and at the end, in reality, this should happen for every id_bb (more > > > than 3000 unique id_bb) and for the d0 until d125... > > > > > i have no clue how to do it in SAS... > > > in Excel this will surely be a pain as well and i don't know how to > > > use VBA... > > > > > So if someone could help me out with this, i would be very greatful ! > > > thank you in advance > > > Best > > > CG > > > > Chris, > > > > I am not sure if I understand your wanted solution. You say in your real data you > have variables d0-d125 that you want to sum up and then append to consecutive > observations by id_bb. That will only work if you exactly 126 observations for each > ID. If that is what you have then the following code will work. Change the indices of > 3 to 125 for your data. > > > > data have; > > infile cards; > > input id_bb d0 d1 d2 d3; > > cards; > > 1 10 0 0 0 > > 1 8 5 0 0 > > 1 6 10 6 0 > > 1 4 7 4 3 > > 2 8 0 0 0 > > 2 4 9 0 0 > > 2 7 8 5 0 > > 2 10 3 10 6 > > ; > > run; > > data want; > > do until(last.id_bb); > > set have; > > by id_bb; > > array d_a[0:3] d0-d3; > > array sum_a[0:3] s0-s3; > > > > do i=0 to 3; > > sum_a[i] = sum(0,sum_a[i],d_a[i]); > > end; > > end; > > do i=0 to 3; > > set have; > > x = sum_a[i]; > > output; > > end; > > run; > > proc print; > > run; > > > > I you don't have 126 observations per ID then you will need to write back to SAS- > L and give us more detail on exactly what you want. Hope this is helpful, > > > > Dan > > > > Daniel J. Nordlund > > Washington State Department of Social and Health Services > > Planning, Performance, and Accountability > > Research and Data Analysis Division > > Olympia, WA 98504-5204 > > Hello everybody > > thanks a lot for your help ! > unfortunately, the real data may have less than 126 obs. per id... > but the code by Mike Zdeb works perfectly, so i found what i was > looking for > thanks to all of you > and thank you to Mike for his code > > Best > CG Chris, Maybe I am missing something, but if you don't have the same number of observations as you have variables (d0 to d_whatever), then I don't see how Mikes solution works. Just add one more variable to your list and see what happens. data x; input id_bb d0-d4; datalines; 1 10 0 0 0 0 1 8 5 0 0 0 1 6 10 6 0 0 1 4 7 4 3 1 2 8 0 0 0 0 2 4 9 0 0 0 2 7 8 5 0 0 2 10 3 10 6 2 ; run; proc summary data=x nway; var d0-d4; class id_bb; output out=stats (drop=_:) sum=; run; proc transpose data=stats out=tstats (keep=col1 rename=(col1=x)); by id_bb; run; options mergenoby=warn; data x; merge x tstats; run; options mergenoby=error; proc print data=x; run; You can't possibly want that final file, can you? Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
From: ChrisG on 23 Feb 2010 03:27 On 22 fév, 23:11, Nord...(a)DSHS.WA.GOV ("Nordlund, Dan (DSHS/RDA)") wrote: > > -----Original Message----- > > From: SAS(r) Discussion [mailto:SA...(a)LISTSERV.UGA.EDU] On Behalf Of > > ChrisG > > Sent: Monday, February 22, 2010 12:42 PM > > To: SA...(a)LISTSERV.UGA.EDU > > Subject: Re: tricky sum > > > On 22 fév, 20:47, Nord...(a)DSHS.WA.GOV ("Nordlund, Dan (DSHS/RDA)") > > wrote: > > > > -----Original Message----- > > > > From: SAS(r) Discussion [mailto:SA...(a)LISTSERV.UGA.EDU] On Behalf Of > > > > ChrisG > > > > Sent: Monday, February 22, 2010 8:07 AM > > > > To: SA...(a)LISTSERV.UGA.EDU > > > > Subject: tricky sum > > > > > Hi everyone > > > > > Here is my tricky sum problem > > > > I have such a dataset > > > > > id_bb d0 d1 d2 d3 > > > > 1 10 0 0 0 > > > > 1 8 5 0 0 > > > > 1 6 10 6 0 > > > > 1 4 7 4 3 > > > > 2 8 0 0 0 > > > > 2 4 9 0 0 > > > > 2 7 8 5 0 > > > > 2 10 3 10 6 > > > > > id_bb = identifier, then some numeric variables d0-d3 > > > > in reality i have more than 50000 lines, more than 3000 unique > > > > identifiers and it goes from d0 until d125 ! > > > > > now my dream is to have this: > > > > > id_bb d0 d1 d2 d3 x > > > > 1 10 0 0 0 28 > > > > 1 8 5 0 0 22 > > > > 1 6 10 6 0 10 > > > > 1 4 7 4 3 3 > > > > 2 8 0 0 0 29 > > > > 2 4 9 0 0 20 > > > > 2 7 8 5 0 15 > > > > 2 10 3 10 6 6 > > > > > this x variable is the trick sum i need > > > > and it goes like this: > > > > for the first id_bb: > > > > x in the first line = sum(d0), i.e. 10+8+6+4 > > > > x in the second line = sum(d1), i.e. 0+5+10+7 > > > > so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 > > > > then again for the second id_bb: > > > > x in the first line = sum(d0), ie. x = 8+4+7+10 > > > > and so on until the last variable, ie. d3 > > > > > and at the end, in reality, this should happen for every id_bb (more > > > > than 3000 unique id_bb) and for the d0 until d125... > > > > > i have no clue how to do it in SAS... > > > > in Excel this will surely be a pain as well and i don't know how to > > > > use VBA... > > > > > So if someone could help me out with this, i would be very greatful ! > > > > thank you in advance > > > > Best > > > > CG > > > > Chris, > > > > I am not sure if I understand your wanted solution. You say in your real data you > > have variables d0-d125 that you want to sum up and then append to consecutive > > observations by id_bb. That will only work if you exactly 126 observations for each > > ID. If that is what you have then the following code will work. Change the indices of > > 3 to 125 for your data. > > > > data have; > > > infile cards; > > > input id_bb d0 d1 d2 d3; > > > cards; > > > 1 10 0 0 0 > > > 1 8 5 0 0 > > > 1 6 10 6 0 > > > 1 4 7 4 3 > > > 2 8 0 0 0 > > > 2 4 9 0 0 > > > 2 7 8 5 0 > > > 2 10 3 10 6 > > > ; > > > run; > > > data want; > > > do until(last.id_bb); > > > set have; > > > by id_bb; > > > array d_a[0:3] d0-d3; > > > array sum_a[0:3] s0-s3; > > > > do i=0 to 3; > > > sum_a[i] = sum(0,sum_a[i],d_a[i]); > > > end; > > > end; > > > do i=0 to 3; > > > set have; > > > x = sum_a[i]; > > > output; > > > end; > > > run; > > > proc print; > > > run; > > > > I you don't have 126 observations per ID then you will need to write back to SAS- > > L and give us more detail on exactly what you want. Hope this is helpful, > > > > Dan > > > > Daniel J. Nordlund > > > Washington State Department of Social and Health Services > > > Planning, Performance, and Accountability > > > Research and Data Analysis Division > > > Olympia, WA 98504-5204 > > > Hello everybody > > > thanks a lot for your help ! > > unfortunately, the real data may have less than 126 obs. per id... > > but the code by Mike Zdeb works perfectly, so i found what i was > > looking for > > thanks to all of you > > and thank you to Mike for his code > > > Best > > CG > > Chris, > > Maybe I am missing something, but if you don't have the same number of observations as you have variables (d0 to d_whatever), then I don't see how Mikes solution works. Just add one more variable to your list and see what happens. > > data x; > input id_bb d0-d4; > datalines; > 1 10 0 0 0 0 > 1 8 5 0 0 0 > 1 6 10 6 0 0 > 1 4 7 4 3 1 > 2 8 0 0 0 0 > 2 4 9 0 0 0 > 2 7 8 5 0 0 > 2 10 3 10 6 2 > ; > run; > > proc summary data=x nway; > var d0-d4; > class id_bb; > output out=stats (drop=_:) sum=; > run; > > proc transpose data=stats out=tstats (keep=col1 rename=(col1=x)); > by id_bb; > run; > > options mergenoby=warn; > > data x; > merge x tstats; > run; > > options mergenoby=error; > > proc print data=x; > run; > > You can't possibly want that final file, can you? > > Dan > > Daniel J. Nordlund > Washington State Department of Social and Health Services > Planning, Performance, and Accountability > Research and Data Analysis Division > Olympia, WA 98504-5204 Hi Daniel ok i will check again the final output on the full sample it is true that your example gives a "bad" solution for me BRB Best CG
From: ChrisG on 23 Feb 2010 03:32 On 22 fév, 23:11, Nord...(a)DSHS.WA.GOV ("Nordlund, Dan (DSHS/RDA)") wrote: > > -----Original Message----- > > From: SAS(r) Discussion [mailto:SA...(a)LISTSERV.UGA.EDU] On Behalf Of > > ChrisG > > Sent: Monday, February 22, 2010 12:42 PM > > To: SA...(a)LISTSERV.UGA.EDU > > Subject: Re: tricky sum > > > On 22 fév, 20:47, Nord...(a)DSHS.WA.GOV ("Nordlund, Dan (DSHS/RDA)") > > wrote: > > > > -----Original Message----- > > > > From: SAS(r) Discussion [mailto:SA...(a)LISTSERV.UGA.EDU] On Behalf Of > > > > ChrisG > > > > Sent: Monday, February 22, 2010 8:07 AM > > > > To: SA...(a)LISTSERV.UGA.EDU > > > > Subject: tricky sum > > > > > Hi everyone > > > > > Here is my tricky sum problem > > > > I have such a dataset > > > > > id_bb d0 d1 d2 d3 > > > > 1 10 0 0 0 > > > > 1 8 5 0 0 > > > > 1 6 10 6 0 > > > > 1 4 7 4 3 > > > > 2 8 0 0 0 > > > > 2 4 9 0 0 > > > > 2 7 8 5 0 > > > > 2 10 3 10 6 > > > > > id_bb = identifier, then some numeric variables d0-d3 > > > > in reality i have more than 50000 lines, more than 3000 unique > > > > identifiers and it goes from d0 until d125 ! > > > > > now my dream is to have this: > > > > > id_bb d0 d1 d2 d3 x > > > > 1 10 0 0 0 28 > > > > 1 8 5 0 0 22 > > > > 1 6 10 6 0 10 > > > > 1 4 7 4 3 3 > > > > 2 8 0 0 0 29 > > > > 2 4 9 0 0 20 > > > > 2 7 8 5 0 15 > > > > 2 10 3 10 6 6 > > > > > this x variable is the trick sum i need > > > > and it goes like this: > > > > for the first id_bb: > > > > x in the first line = sum(d0), i.e. 10+8+6+4 > > > > x in the second line = sum(d1), i.e. 0+5+10+7 > > > > so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 > > > > then again for the second id_bb: > > > > x in the first line = sum(d0), ie. x = 8+4+7+10 > > > > and so on until the last variable, ie. d3 > > > > > and at the end, in reality, this should happen for every id_bb (more > > > > than 3000 unique id_bb) and for the d0 until d125... > > > > > i have no clue how to do it in SAS... > > > > in Excel this will surely be a pain as well and i don't know how to > > > > use VBA... > > > > > So if someone could help me out with this, i would be very greatful ! > > > > thank you in advance > > > > Best > > > > CG > > > > Chris, > > > > I am not sure if I understand your wanted solution. You say in your real data you > > have variables d0-d125 that you want to sum up and then append to consecutive > > observations by id_bb. That will only work if you exactly 126 observations for each > > ID. If that is what you have then the following code will work. Change the indices of > > 3 to 125 for your data. > > > > data have; > > > infile cards; > > > input id_bb d0 d1 d2 d3; > > > cards; > > > 1 10 0 0 0 > > > 1 8 5 0 0 > > > 1 6 10 6 0 > > > 1 4 7 4 3 > > > 2 8 0 0 0 > > > 2 4 9 0 0 > > > 2 7 8 5 0 > > > 2 10 3 10 6 > > > ; > > > run; > > > data want; > > > do until(last.id_bb); > > > set have; > > > by id_bb; > > > array d_a[0:3] d0-d3; > > > array sum_a[0:3] s0-s3; > > > > do i=0 to 3; > > > sum_a[i] = sum(0,sum_a[i],d_a[i]); > > > end; > > > end; > > > do i=0 to 3; > > > set have; > > > x = sum_a[i]; > > > output; > > > end; > > > run; > > > proc print; > > > run; > > > > I you don't have 126 observations per ID then you will need to write back to SAS- > > L and give us more detail on exactly what you want. Hope this is helpful, > > > > Dan > > > > Daniel J. Nordlund > > > Washington State Department of Social and Health Services > > > Planning, Performance, and Accountability > > > Research and Data Analysis Division > > > Olympia, WA 98504-5204 > > > Hello everybody > > > thanks a lot for your help ! > > unfortunately, the real data may have less than 126 obs. per id... > > but the code by Mike Zdeb works perfectly, so i found what i was > > looking for > > thanks to all of you > > and thank you to Mike for his code > > > Best > > CG > > Chris, > > Maybe I am missing something, but if you don't have the same number of observations as you have variables (d0 to d_whatever), then I don't see how Mikes solution works. Just add one more variable to your list and see what happens. > > data x; > input id_bb d0-d4; > datalines; > 1 10 0 0 0 0 > 1 8 5 0 0 0 > 1 6 10 6 0 0 > 1 4 7 4 3 1 > 2 8 0 0 0 0 > 2 4 9 0 0 0 > 2 7 8 5 0 0 > 2 10 3 10 6 2 > ; > run; > > proc summary data=x nway; > var d0-d4; > class id_bb; > output out=stats (drop=_:) sum=; > run; > > proc transpose data=stats out=tstats (keep=col1 rename=(col1=x)); > by id_bb; > run; > > options mergenoby=warn; > > data x; > merge x tstats; > run; > > options mergenoby=error; > > proc print data=x; > run; > > You can't possibly want that final file, can you? > > Dan > > Daniel J. Nordlund > Washington State Department of Social and Health Services > Planning, Performance, and Accountability > Research and Data Analysis Division > Olympia, WA 98504-5204 Effectively Daniel you are right i got excited to fast by looking at the (first lines) of final dataset now i have checked the intermediate steps and something weird is happening into stats and tstats as well... so my problem is still not fixed ! gosh... does anyone has a solution ? thank you Daniel for spoting the problem ! Best CG
From: ChrisG on 23 Feb 2010 03:42 On 23 fév, 09:32, ChrisG <chris.godlew...(a)gmail.com> wrote: > On 22 fév, 23:11, Nord...(a)DSHS.WA.GOV ("Nordlund, Dan (DSHS/RDA)") > wrote: > > > > > > -----Original Message----- > > > From: SAS(r) Discussion [mailto:SA...(a)LISTSERV.UGA.EDU] On Behalf Of > > > ChrisG > > > Sent: Monday, February 22, 2010 12:42 PM > > > To: SA...(a)LISTSERV.UGA.EDU > > > Subject: Re: tricky sum > > > > On 22 fév, 20:47, Nord...(a)DSHS.WA.GOV ("Nordlund, Dan (DSHS/RDA)") > > > wrote: > > > > > -----Original Message----- > > > > > From: SAS(r) Discussion [mailto:SA...(a)LISTSERV.UGA.EDU] On Behalf Of > > > > > ChrisG > > > > > Sent: Monday, February 22, 2010 8:07 AM > > > > > To: SA...(a)LISTSERV.UGA.EDU > > > > > Subject: tricky sum > > > > > > Hi everyone > > > > > > Here is my tricky sum problem > > > > > I have such a dataset > > > > > > id_bb d0 d1 d2 d3 > > > > > 1 10 0 0 0 > > > > > 1 8 5 0 0 > > > > > 1 6 10 6 0 > > > > > 1 4 7 4 3 > > > > > 2 8 0 0 0 > > > > > 2 4 9 0 0 > > > > > 2 7 8 5 0 > > > > > 2 10 3 10 6 > > > > > > id_bb = identifier, then some numeric variables d0-d3 > > > > > in reality i have more than 50000 lines, more than 3000 unique > > > > > identifiers and it goes from d0 until d125 ! > > > > > > now my dream is to have this: > > > > > > id_bb d0 d1 d2 d3 x > > > > > 1 10 0 0 0 28 > > > > > 1 8 5 0 0 22 > > > > > 1 6 10 6 0 10 > > > > > 1 4 7 4 3 3 > > > > > 2 8 0 0 0 29 > > > > > 2 4 9 0 0 20 > > > > > 2 7 8 5 0 15 > > > > > 2 10 3 10 6 6 > > > > > > this x variable is the trick sum i need > > > > > and it goes like this: > > > > > for the first id_bb: > > > > > x in the first line = sum(d0), i.e. 10+8+6+4 > > > > > x in the second line = sum(d1), i.e. 0+5+10+7 > > > > > so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 > > > > > then again for the second id_bb: > > > > > x in the first line = sum(d0), ie. x = 8+4+7+10 > > > > > and so on until the last variable, ie. d3 > > > > > > and at the end, in reality, this should happen for every id_bb (more > > > > > than 3000 unique id_bb) and for the d0 until d125... > > > > > > i have no clue how to do it in SAS... > > > > > in Excel this will surely be a pain as well and i don't know how to > > > > > use VBA... > > > > > > So if someone could help me out with this, i would be very greatful ! > > > > > thank you in advance > > > > > Best > > > > > CG > > > > > Chris, > > > > > I am not sure if I understand your wanted solution. You say in your real data you > > > have variables d0-d125 that you want to sum up and then append to consecutive > > > observations by id_bb. That will only work if you exactly 126 observations for each > > > ID. If that is what you have then the following code will work. Change the indices of > > > 3 to 125 for your data. > > > > > data have; > > > > infile cards; > > > > input id_bb d0 d1 d2 d3; > > > > cards; > > > > 1 10 0 0 0 > > > > 1 8 5 0 0 > > > > 1 6 10 6 0 > > > > 1 4 7 4 3 > > > > 2 8 0 0 0 > > > > 2 4 9 0 0 > > > > 2 7 8 5 0 > > > > 2 10 3 10 6 > > > > ; > > > > run; > > > > data want; > > > > do until(last.id_bb); > > > > set have; > > > > by id_bb; > > > > array d_a[0:3] d0-d3; > > > > array sum_a[0:3] s0-s3; > > > > > do i=0 to 3; > > > > sum_a[i] = sum(0,sum_a[i],d_a[i]); > > > > end; > > > > end; > > > > do i=0 to 3; > > > > set have; > > > > x = sum_a[i]; > > > > output; > > > > end; > > > > run; > > > > proc print; > > > > run; > > > > > I you don't have 126 observations per ID then you will need to write back to SAS- > > > L and give us more detail on exactly what you want. Hope this is helpful, > > > > > Dan > > > > > Daniel J. Nordlund > > > > Washington State Department of Social and Health Services > > > > Planning, Performance, and Accountability > > > > Research and Data Analysis Division > > > > Olympia, WA 98504-5204 > > > > Hello everybody > > > > thanks a lot for your help ! > > > unfortunately, the real data may have less than 126 obs. per id... > > > but the code by Mike Zdeb works perfectly, so i found what i was > > > looking for > > > thanks to all of you > > > and thank you to Mike for his code > > > > Best > > > CG > > > Chris, > > > Maybe I am missing something, but if you don't have the same number of observations as you have variables (d0 to d_whatever), then I don't see how Mikes solution works. Just add one more variable to your list and see what happens. > > > data x; > > input id_bb d0-d4; > > datalines; > > 1 10 0 0 0 0 > > 1 8 5 0 0 0 > > 1 6 10 6 0 0 > > 1 4 7 4 3 1 > > 2 8 0 0 0 0 > > 2 4 9 0 0 0 > > 2 7 8 5 0 0 > > 2 10 3 10 6 2 > > ; > > run; > > > proc summary data=x nway; > > var d0-d4; > > class id_bb; > > output out=stats (drop=_:) sum=; > > run; > > > proc transpose data=stats out=tstats (keep=col1 rename=(col1=x)); > > by id_bb; > > run; > > > options mergenoby=warn; > > > data x; > > merge x tstats; > > run; > > > options mergenoby=error; > > > proc print data=x; > > run; > > > You can't possibly want that final file, can you? > > > Dan > > > Daniel J. Nordlund > > Washington State Department of Social and Health Services > > Planning, Performance, and Accountability > > Research and Data Analysis Division > > Olympia, WA 98504-5204 > > Effectively Daniel you are right > i got excited to fast by looking at the (first lines) of final dataset > now i have checked the intermediate steps and something weird is > happening into stats and tstats as well... > so my problem is still not fixed ! > gosh... > does anyone has a solution ? > > thank you Daniel for spoting the problem ! > > Best > CG Daniel I did something like this on the example : proc transpose data=stats out=tstats ; by id_bb; run; data tstats; set tstats; d=_name_; x=col1; drop _name_ col1; run; data x; merge x tstats; by id_bb; run; and the final sample looks fine... what do you think ? best CG
From: Jack Clark on 23 Feb 2010 09:29
Chris, I understand that your data may have fewer than 126 observations per ID. From your description below, it sounds like the number of observations for a particular ID will be the number of numeric variables (D0-D125) which need to be summed. Is this correct? Will the numeric variables always be populated starting with D0 and going up toward D125? For example, when an ID has 8 observations does this mean that D0 - D07 will always be the variables which need to be summed? Or could the variables needing to be summed be spread out across the D0 - D125 range? If the range of variables to be summed always starts with D0, D1, D2, etc. and contains a number of numeric variables to be summed equivalent to the number of observations for that ID, a slight modification to my solution from yesterday may work. The main logic modification is in the final merge, where a subsetting IF statement is added to only keep observations from the original data. I have added an additional ID (which does not contain 4 observations) to the sample data for testing. *(using your sample data); data have; infile cards missover; input id_bb d0 d1 d2 d3; cards; 1 10 0 0 0 1 8 5 0 0 1 6 10 6 0 1 4 7 4 3 2 8 0 0 0 2 4 9 0 0 2 7 8 5 0 2 10 3 10 6 3 4 1 0 0 3 2 8 0 0 ; run; proc print data = have; run; * create SUM_VAR containing variable name of summed variable ; data next (drop=arr_cnt); set have; by id_bb; array dvars(*) d0-d3; if first.id_bb then arr_cnt = 1; sum_var = vname(dvars(arr_cnt)); arr_cnt + 1; run; proc print data = next; run; * sum numeric fields by ID_BB ; proc summary data = have nway missing; class id_bb; var d0-d3; output out = have_sum (drop=_type_ _freq_) sum=; run; proc print data = have_sum; run; * transpose summed data - rename fields ; proc transpose data = have_sum out = have_sumt (rename=(_name_=sum_var col1=x)); by id_bb; run; proc print data = have_sumt; run; * merge summed data to original data ; data need; merge next (in=a) have_sumt; by id_bb sum_var; if a; run; proc print data = need; run; Jack Jack Clark Senior Research Analyst phone: 410-455-6256 fax: 410-455-6850 jclark(a)hilltop.umbc.edu ________________________________ From: Christophe Godlewski [mailto:chris.godlewski(a)gmail.com] Sent: Monday, February 22, 2010 2:46 PM To: Jack Clark Cc: SAS-L(a)listserv.uga.edu Subject: Re: tricky sum Hello Jack Thank you very much for your help it works fine with the data for the example unfortunately, in the full dataset, each id_bb can have less than 126 observations... it can be 1, 3, 89 or 126... 126 is the maximum actually more precisely, the "matrix" has 126 terms for each id_bb but for instance if an id_bb has only 89 observations, then for d89 to d125, there will be 0 in the cells... now i am thinking of a trick to make your code applicable to my data... but if you have a smart idea im taking it ! thank you again for your help best CG On Mon, Feb 22, 2010 at 8:18 PM, Jack Clark <jclark(a)hilltop.umbc.edu> wrote: Chris, If I understand the question correctly, this means that each ID_BB value in your data has exactly 126 observations? This is because you will need 126 observations per ID to hold the sums of D0-D125. Is that correct? This solution includes the creation of a new variable called SUM_VAR which has the value of the variable name being summed as the value of X. You can drop it at the end if you don't want it. The data are summarized by ID_BB to get the sum of each numeric variable. That summarized data is transposed and then merged to the original data by ID_BB and SUM_VAR. Let me know if this meets your needs. *(using your sample data); data have; infile cards missover; input id_bb d0 d1 d2 d3; cards; 1 10 0 0 0 1 8 5 0 0 1 6 10 6 0 1 4 7 4 3 2 8 0 0 0 2 4 9 0 0 2 7 8 5 0 2 10 3 10 6 ; run; * create SUM_VAR containing variable name of summed variable ; data next (drop=arr_cnt); set have; by id_bb; array dvars(*) d0-d3; if first.id_bb then arr_cnt = 1; sum_var = vname(dvars(arr_cnt)); arr_cnt + 1; run; proc print data = next; run; * sum numeric fields by ID_BB ; proc summary data = have nway missing; class id_bb; var d0-d3; output out = have_sum (drop=_type_ _freq_) sum=; run; proc print data = have_sum; run; * transpose summed data - rename fields ; proc transpose data = have_sum out = have_sumt (rename=(_name_=sum_var col1=x)); by id_bb; run; proc print data = have_sumt; run; * merge summed data to original data ; data need; merge next have_sumt; by id_bb sum_var; run; proc print data = need; run; Jack Jack Clark Senior Research Analyst phone: 410-455-6256 fax: 410-455-6850 jclark(a)hilltop.umbc.edu University of Maryland, Baltimore County Sondheim Hall, 3rd Floor 1000 Hilltop Circle Baltimore, MD 21250 Confidentiality Notice: This e-mail may contain information that is legally privileged and that is intended only for the use of the addressee(s) named above. If you are not the intended recipient, you are hereby notified that any disclosure, copying of this e-mail, distribution, or action taken in reliance on the contents of this e-mail and/or documents attributed to this e-mail is strictly prohibited. If you have received this information in error, please notify the sender immediately by phone and delete this entire e-mail. Thank you.-----Original Message----- From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of ChrisG Sent: Monday, February 22, 2010 11:07 AM To: SAS-L(a)LISTSERV.UGA.EDU Subject: tricky sum Hi everyone Here is my tricky sum problem I have such a dataset id_bb d0 d1 d2 d3 1 10 0 0 0 1 8 5 0 0 1 6 10 6 0 1 4 7 4 3 2 8 0 0 0 2 4 9 0 0 2 7 8 5 0 2 10 3 10 6 id_bb = identifier, then some numeric variables d0-d3 in reality i have more than 50000 lines, more than 3000 unique identifiers and it goes from d0 until d125 ! now my dream is to have this: id_bb d0 d1 d2 d3 x 1 10 0 0 0 28 1 8 5 0 0 22 1 6 10 6 0 10 1 4 7 4 3 3 2 8 0 0 0 29 2 4 9 0 0 20 2 7 8 5 0 15 2 10 3 10 6 6 this x variable is the trick sum i need and it goes like this: for the first id_bb: x in the first line = sum(d0), i.e. 10+8+6+4 x in the second line = sum(d1), i.e. 0+5+10+7 so on until the last id_bb=1, i.e. x = sum(d3) = 0+0+0+3 then again for the second id_bb: x in the first line = sum(d0), ie. x = 8+4+7+10 and so on until the last variable, ie. d3 and at the end, in reality, this should happen for every id_bb (more than 3000 unique id_bb) and for the d0 until d125... i have no clue how to do it in SAS... in Excel this will surely be a pain as well and i don't know how to use VBA... So if someone could help me out with this, i would be very greatful ! thank you in advance Best CG -- Christophe J. GODLEWSKI Strasbourg, France |