From: sasbegy on 18 May 2010 11:38
I have a field with calculated shares across roughly 38k observations. I need some method to split the shares into four groups, say High, Med, Low, and None. If the share is greater, then it must be classified as High.

How do I divide my whole dataset on this particular variable and make 4 distinct groups?

Please let me know. Any small help is greatly appreciated.
From: Sid on 18 May 2010 12:16
On May 18, 10:38 am, sasbegy <pattukutt...(a)gmail.com> wrote:
> I have a field with calculated shares across say 38k observations. I
> need to adapt some method and split the shares into four groups say
> High Med low and none. If the share is greater then that must be
> classified as high.
>
> How do I divide my whole dataset on this particular variable and make
> 4 distinct groups?
>
> Please let me know. Any small help is greatly appreciated.

Since you have not provided a sample dataset to work with, the code below uses a random sample of shares created in the first DATA step. I used PROC RANK to split the sample into four groups.

    /* creating a dataset with random share values */
    data s1;
        do id = 1 to 38000;
            share = ceil(ranuni(0) * 10000);
            output;
        end;
    run;

    /* assigning ranks to these shares; groups = 4 gives ranks 0 to 3 */
    proc rank data = s1 out = s2 ties = low groups = 4;
        var share;
        ranks share_rank;
    run;

    /* classifying shares into 4 types */
    data s3;
        set s2;
        length type $4;   /* long enough for 'None' and 'High' */
        if share_rank = 0 then type = 'None';
        else if share_rank < 2 then type = 'Low';
        else if share_rank < 3 then type = 'Med';
        else type = 'High';
    run;

HTH,
Sid
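To see what PROC RANK with GROUPS=4 actually does, it can help to summarize the share values inside each rank. The sketch below is only an illustration: the dataset names `demo` and `demo_ranked`, the seed, and the sample size are assumptions, not part of the thread. Because the groups are quartiles by count, the lowest group is the bottom quarter of the shares rather than shares equal to zero.

```sas
/* Small reproducible sample (seed and size are arbitrary assumptions) */
data demo;
    call streaminit(12345);
    do id = 1 to 1000;
        share = ceil(rand('uniform') * 10000);
        output;
    end;
run;

/* Same ranking step as in the post above: ranks 0, 1, 2, 3 */
proc rank data = demo out = demo_ranked ties = low groups = 4;
    var share;
    ranks share_rank;
run;

/* One row per rank group: how many observations and which share range */
proc means data = demo_ranked n min max;
    class share_rank;
    var share;
run;
```

If the minimum and maximum for each rank do not match the boundaries you had in mind for None, Low, Med, and High, fixed cut-points may suit the data better than quartiles.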
From: Reeza on 18 May 2010 12:37
On May 18, 8:38 am, sasbegy <pattukutt...(a)gmail.com> wrote:
> I have a field with calculated shares across say 38k observations. I
> need to adapt some method and split the shares into four groups say
> High Med low and none. If the share is greater then that must be
> classified as high.
>
> How do I divide my whole dataset on this particular variable and make
> 4 distinct groups?
>
> Please let me know. Any small help is greatly appreciated.

I highly recommend creating a histogram of the data you want to group and seeing where the natural boundaries are. Change the bins of the histogram to see whether the groups are consistent.
If no natural boundaries are present, then use another method.
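A minimal sketch of the histogram approach, assuming the shares sit in a variable named `share` on a 0 to 100 scale. The dataset `demo` and its uniform random values are made up for illustration, so they will not show any natural boundaries; real data might. The MIDPOINTS= and ENDPOINTS= options of the HISTOGRAM statement control the binning.

```sas
/* Hypothetical sample: shares strictly between 0 and 100 */
data demo;
    call streaminit(12345);
    do id = 1 to 1000;
        share = rand('uniform') * 100;
        output;
    end;
run;

/* Default binning chosen by SAS */
proc univariate data = demo;
    histogram share;
run;

/* Coarser bins: width 10, centered at 5, 15, ..., 95 */
proc univariate data = demo;
    histogram share / midpoints = 5 to 95 by 10;
run;

/* Finer bins: width 5, with bin edges at 0, 5, ..., 100 */
proc univariate data = demo;
    histogram share / endpoints = 0 to 100 by 5;
run;
```

If the same gaps or valleys show up at several bin widths, they are reasonable candidates for group boundaries.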
From: sasbegy on 18 May 2010 13:31
Hi,

Thanks much for your help, Sid and Reeza. Could you please tell me more about how to create histograms and divide them? I have a full column of data with market shares calculated. The group depends on the share: say, if it's 100% then he's High, and if it's 0 then it's Low. I have values from 0 to 100. How do I divide them into four groups using a histogram? Please let me know.

On May 18, 12:37 pm, Reeza <fkhurs...(a)hotmail.com> wrote:
> On May 18, 8:38 am, sasbegy <pattukutt...(a)gmail.com> wrote:
>
> > I have a field with calculated shares across say 38k observations. I
> > need to adapt some method and split the shares into four groups say
> > High Med low and none. If the share is greater then that must be
> > classified as high.
>
> > How do I divide my whole dataset on this particular variable and make
> > 4 distinct groups?
>
> > Please let me know. Any small help is greatly appreciated.
>
> I highly recommend creating a histogram of the data you want to group
> and see where natural boundaries are. Change the bins of the
> histograms to see if groups are consistent.
> If none are present then use another method.
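Once boundaries have been chosen, one possible way to apply them is a user-defined format. In the sketch below the cut-points 33 and 66, the rule that a share of exactly 0 is "None", the format name `sharegrp`, and the six sample rows are all assumptions made for illustration; nothing in the thread specifies them.

```sas
/* Hypothetical cut-points; replace with boundaries read off the histogram.
   "a <- b" means greater than a, up to and including b. */
proc format;
    value sharegrp
        0          = 'None'
        0 <- 33    = 'Low'
        33 <- 66   = 'Med'
        66 <- high = 'High';
run;

/* Made-up sample rows covering each range */
data demo;
    input id share;
    datalines;
1 0
2 12.5
3 33
4 50
5 66.1
6 100
;
run;

/* Apply the format to create the group variable */
data grouped;
    set demo;
    length type $4;
    type = put(share, sharegrp.);
run;

proc print data = grouped noobs;
run;
```

Keeping the cut-points in a format means they can be changed in one place without touching the DATA step.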