Prev: Numeric ID Values with decimals -- probably not the best idea in
Next: nps74.tmp.pdf, nps75.tmp.pdf
From: Henry on 20 Nov 2009 19:31 Hello there, I have a data set looks like: F1 F2 V1 1 1 0.1 1 2 0.3 1 3 0.5 2 1 0.56 2 2 0.99 2 3 0.78 3 1 0.84 3 2 0.83 3 3 0.65 I will like to run the correlation on each pair of F1 group which means that for F1=1, I will run its v1 values with the values when F1=2 and also the values when F1=3. Then, I will also run the values for F1=2 with the values when F1=1 and also with the values when F1=3 Same for the values for F1=3 with values when F1=1 and with the values when F1=2; In brief, the data to be prepared for the proc corr analysis should be like: 1 2 0.3 2 1 0.56 1 3 0.5 2 3 0.78 1 2 0.3 3 1 0.84 1 3 0.5 3 2 0.83 2 1 0.56 1 2 0.3 2 3 0.78 1 3 0.5 2 1 0.56 3 1 0.84 2 3 0.78 3 2 0.83 3 1 0.84 1 2 0.3 3 2 0.83 1 3 0.5 3 1 0.84 2 1 0.56 3 2 0.83 2 3 0.78 I was wondering that is there an efficient way to do it? Any thoughts? Please let me know if possible. Thanks in advance!!! Henry
From: Tom Abernathy on 20 Nov 2009 21:38 Henry - You need to come up with more meaningful variable names. Did you try tranposing you data to make it more like a matrix? proc transpose data=orig out=x prefix=NEW; by f1; id f2; var v1; run; proc corr data=x; var NEW:; run; - Tom On Nov 20, 7:31 pm, Henry <chchanghe...(a)gmail.com> wrote: > Hello there, > > I have a data set looks like: > > F1 F2 V1 > > 1 1 0.1 > 1 2 0.3 > 1 3 0.5 > 2 1 0.56 > 2 2 0.99 > 2 3 0.78 > 3 1 0.84 > 3 2 0.83 > 3 3 0.65 > > I will like to run the correlation on each pair of F1 group which > means that for F1=1, I will run its v1 values with > the values when F1=2 and also the values when F1=3. > Then, I will also run the values for F1=2 with the values when F1=1 > and also with the values when F1=3 > Same for the values for F1=3 with values when F1=1 and with the values > when F1=2; > > In brief, the data to be prepared for the proc corr analysis should be > like: > > 1 2 0.3 2 1 0.56 > 1 3 0.5 2 3 0.78 > > 1 2 0.3 3 1 0.84 > 1 3 0.5 3 2 0.83 > > 2 1 0.56 1 2 0.3 > 2 3 0.78 1 3 0.5 > > 2 1 0.56 3 1 0.84 > 2 3 0.78 3 2 0.83 > > 3 1 0.84 1 2 0.3 > 3 2 0.83 1 3 0.5 > > 3 1 0.84 2 1 0.56 > 3 2 0.83 2 3 0.78 > > I was wondering that is there an efficient way to do it? Any > thoughts? Please let me know if possible. > Thanks in advance!!! > > Henry
From: Henry on 21 Nov 2009 01:29 Hello, I would like to make my data set like this. So, each time, I can just run a pair to get one correlation (but if F1 = F2 then I will delete it from the regression). Also, as I would like to put this correlation back to my original data (create new column to put this correlation.), so, I will need the F1 and F2 variables in order to merge back to the original data set. The first three columns just the first two observations from the original data (after dropping the observation when F1=1 = F2) The second three columns just got from the 4th and 6th observations from original data after dropping the observation when F1 = 2 = F2. so far and forth. FIRST PAIR when F1=1 (as I have three group I will run the correlation when F1=1 with F1=2) F1 F2 V1 F1 F2 V1 1 2 0.3 2 1 0.56 1 3 0.5 2 3 0.78 SECOND PAIR when F1=1 (the correlation for F1 =1 and F1 =3) F1 F2 V1 F1 F2 V1 1 2 0.3 3 1 0.84 1 3 0.5 3 2 0.83 THIRD PAIR when F1=2 (the correlation for F1=2 and F1 =2) F1 F2 V1 F1 F2 V1 2 1 0.56 1 2 0.3 2 3 0.78 1 3 0.5 FOURTH PAIR when F1=2 (the correlation for F1=2 and F1 =3) F1 F2 V1 F1 F2 V1 2 1 0.56 3 1 0.84 2 3 0.78 3 2 0.83 FIFTH PAIR when F1=3 (the correlation for F1=3 and F1 =1) F1 F2 V1 F1 F2 V1 3 1 0.84 1 2 0.3 3 2 0.83 1 3 0.5 SIXTH PAIR when F1=3 (the correlation for F1=3 and F1 =2) F1 F2 V1 F1 F2 V1 3 1 0.84 2 1 0.56 3 2 0.83 2 3 0.78 I tried the following codes: proc transpose data=orig out=x prefix=NEW; by f1; id f2; var v1; run; But, it didn't give the format I am looking for. So, I was wondering if there is way to do it? Thank you so much for helping me out. Henry
From: Tom Abernathy on 21 Nov 2009 10:59 Henry - I still do not understand. Are you trying to calculate a correlation? Or are you trying to merge in correlation values that you have calculated in some other way? To calculate a correlation you need two or more pairs of observed values. For example if you sampled the Height and Weight of 10 people you could calculate the correlation between height and weight. In that case you would have 10 observations and 3 rows (one for an ID variable to identify the person). You appear to be asking for the correlation of 0.3 with 0.56 and that just cannot be done. I still have no idea what F1 and F2 are. You talk of pairs and groups. Do the values of F1 and F2 represent the groups? So does F1=1 and F2=2 mean a comparison between Group #1 and Group #2. It so it would reduce confusion if you renamed F1 and F2 to FirstGroup and SecondGroup or something more meaningful. Also if you changed the values from numbers to characters so that you did not get confused and calculate the mean of the group numbers. But what is the meaning of the numbers stored in the variable V1? - Tom On Nov 21, 1:29 am, Henry <chchanghe...(a)gmail.com> wrote: > Hello, > > I would like to make my data set like this. So, each time, I can just > run a pair to get one correlation (but if F1 = F2 then I will delete > it from the regression). Also, as I would like to put this > correlation back to my original data (create new column to put this > correlation.), so, I will need the F1 and F2 variables in order to > merge back to the original data set. > > The first three columns just the first two observations from the > original data (after dropping the observation when F1=1 = F2) > The second three columns just got from the 4th and 6th observations > from original data after dropping the observation when F1 = 2 = F2. so > far and forth. > > FIRST PAIR when F1=1 (as I have three group I will run the > correlation when F1=1 with F1=2) > F1 F2 V1 F1 F2 V1 > 1 2 0.3 2 1 0.56 > 1 3 0.5 2 3 0.78 > > SECOND PAIR when F1=1 (the correlation for F1 =1 and F1 =3) > F1 F2 V1 F1 F2 V1 > 1 2 0.3 3 1 0.84 > 1 3 0.5 3 2 0.83 > > THIRD PAIR when F1=2 (the correlation for F1=2 and F1 =2) > F1 F2 V1 F1 F2 V1 > 2 1 0.56 1 2 0.3 > 2 3 0.78 1 3 0.5 > > FOURTH PAIR when F1=2 (the correlation for F1=2 and F1 =3) > F1 F2 V1 F1 F2 V1 > 2 1 0.56 3 1 0.84 > 2 3 0.78 3 2 0.83 > > FIFTH PAIR when F1=3 (the correlation for F1=3 and F1 =1) > F1 F2 V1 F1 F2 V1 > 3 1 0.84 1 2 0.3 > 3 2 0.83 1 3 0.5 > > SIXTH PAIR when F1=3 (the correlation for F1=3 and F1 =2) > F1 F2 V1 F1 F2 V1 > 3 1 0.84 2 1 0.56 > 3 2 0.83 2 3 0.78 > > I tried the following codes: > proc transpose data=orig out=x prefix=NEW; > by f1; > id f2; > var v1; > run; > > But, it didn't give the format I am looking for. So, I was wondering > if there is way to do it? Thank you so much for helping me out. > > Henry
From: Henry on 21 Nov 2009 13:07 Hi Tom, Thanks for your reply. I guess I know why it is confused. I have a 900000 unique firms I denoted as i. As each firm has a value with another firm, denoted as j(including itself). Then, I will have 900000*900000 observations. F1 here denoted i firm and F2 denoted as the corresponding firm j for i. That's why I wrote something like: F1 F2 V1 1 1 0.1 1 2 0.3 1 3 0.5 2 1 0.56 2 2 0.99 2 3 0.78 Now, I tried to get the correlation between these firms except themselves which is when F1 = F2. So, in brief, I want the correlation between these two columns 0.3 0.56 0.5 0.78 that's why I put : F1 F2 V1 F1 F2 V1 1 2 0.3 2 1 0.56 1 3 0.5 2 3 0.78 As I am thinking about reshaping the values so I can get a pair of values as you mentioned to calculate the correlation. then as firm1(when F1 = 1) will also need a correlation with firm 3 (when F1=3) So, I put: F1 F2 V1 F1 F2 V1 1 2 0.3 3 1 0.84 1 3 0.5 3 2 0.83 Meaning I need the correlation between 0.3 0.84 0.5 0.83 This will have to be done for this 900000 unique firms which has its another 900000 corresponding firms (including itself). Then, after obtaining the correlation between firm i and j, I will have to merge them back to the original data. So, I think I still need the value for F1 and F2 so I can do the merge. Of course, the F1 and F2 they are characteristic values (as they also contain a, b ,c within numbers. i.e. firm i's F1 = 200c00). I am not sure if you can understand my issues here? Thanks again for your reply! Henry
|
Next
|
Last
Pages: 1 2 Prev: Numeric ID Values with decimals -- probably not the best idea in Next: nps74.tmp.pdf, nps75.tmp.pdf |