Divide the data set into four groups [SAS]


From: sasbegy on 18 May 2010 11:38

I have a field of calculated shares across roughly 38k observations. I
need a method to split the shares into four groups, say High, Med, Low,
and None. The largest shares should be classified as High.

How do I divide my whole dataset on this particular variable and make
4 distinct groups?

Please let me know. Any small help is greatly appreciated.

From: Sid on 18 May 2010 12:16

On May 18, 10:38 am, sasbegy <pattukutt...(a)gmail.com> wrote:
> I have a field with calculated shares across say 38k observations. I
> need to adapt some method and split the shares into four groups say
> High Med low and none. If the share is greater then that must be
> classified as high.
>
> How do I divide my whole dataset on this particular variable and make
> 4 distinct groups?
>
> Please let me know. Any small help is greatly appreciated.

Since you have not provided a sample dataset to work with, the code below
uses a random sample of shares created in the first data step.
I used PROC RANK to split the sample into four groups.
/* creating a dataset with random share values */
data s1;
    do id = 1 to 38000;
        share = ceil(ranuni(0) * 10000);
        output;
    end;
run;

/* assigning ranks to these shares */
proc rank data = s1 out = s2 ties = low groups = 4;
    var share;
    ranks share_rank;
run;

/* classifying shares into 4 types */
data s3;
    set s2;
    if share_rank = 0 then type = 'None';
    else if share_rank < 2 then type = 'Low';
    else if share_rank < 3 then type = 'Med';
    else type = 'High';
run;
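
Note that PROC RANK with groups = 4 splits the data into four groups of roughly equal size, so 'None' here means the lowest quarter of shares, not a share of zero. A minimal sketch for checking what ended up in each group, assuming the s3 data set and the type and share variables created above:

```sas
/* count the observations that landed in each group */
proc freq data = s3;
    tables type;
run;

/* show the range of share values covered by each group */
proc means data = s3 n min max;
    class type;
    var share;
run;
```

If the minimum and maximum per group do not match what you mean by High, Med, Low and None, fixed cutoffs may suit you better than ranks.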

HTH,
Sid

From: Reeza on 18 May 2010 12:37

On May 18, 8:38 am, sasbegy <pattukutt...(a)gmail.com> wrote:
> I have a field with calculated shares across say 38k observations. I
> need to adapt some method and split the shares into four groups say
> High Med low and none. If the share is greater then that must be
> classified as high.
>
> How do I divide my whole dataset on this particular variable and make
> 4 distinct groups?
>
> Please let me know. Any small help is greatly appreciated.

I highly recommend creating a histogram of the variable you want to group
to see where the natural boundaries are. Change the bin widths of the
histogram to check whether the same groups show up consistently.
If no natural groups appear, use another method.
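
A minimal sketch of this suggestion using PROC UNIVARIATE, assuming the s1 data set and share variable from Sid's example (values from 1 to 10000). The bin widths of 1000 and 500 are arbitrary choices for illustration:

```sas
/* histogram with bins 1000 wide */
proc univariate data = s1;
    var share;
    histogram share / midpoints = 0 to 10000 by 1000;
run;

/* same data with narrower bins, to see whether the gaps persist */
proc univariate data = s1;
    var share;
    histogram share / midpoints = 0 to 10000 by 500;
run;
```

Gaps or dips that stay in the same place at both bin widths are candidates for group boundaries.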

From: sasbegy on 18 May 2010 13:31

Hi,

Thanks very much for your help, Sid and Reeza. Could you please tell me more
about how to create histograms and divide them?

I have a full column of data with calculated market shares. Depending
on the share, say if it's 100% then it's high, and if it's 0 then it's low. I
have values from 0 to 100. How do I divide them into four groups using a
histogram?

Please let me know
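
One possible way to turn boundaries read off a histogram into four groups is a data step with fixed cutoffs. This is only a sketch: the data set names shares and shares_grouped are hypothetical, and the cutoffs 33 and 67 are placeholders to be replaced with whatever boundaries the histogram suggests.

```sas
/* hypothetical input: data set "shares" with variable share (0 to 100) */
data shares_grouped;
    set shares;
    length type $4;                             /* long enough for 'None' and 'High' */
    if missing(share) then type = ' ';          /* leave missing shares unclassified */
    else if share = 0 then type = 'None';
    else if share < 33 then type = 'Low';       /* placeholder cutoff */
    else if share < 67 then type = 'Med';       /* placeholder cutoff */
    else type = 'High';
run;
```

Unlike PROC RANK, this does not force the groups to be the same size; each group simply holds whatever falls between its cutoffs.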

