From: Nancy on 11 Mar 2010 12:01

Hello All,

I just found a problem when using the SOUNDEX function. It assigned different values to the same names in my data sets. I am wondering whether there is something wrong with my operation or something else.

Thank you,

Please see the examples:

Obs  First_name  IDF
  1  TAMARI      T5623
  2  TAMARI      T56
  3  DEVIN       D151
  4  DEVIN       D15
  5  JULIO       J42
  6  JULIO       J4
  7  NGOC        N221
  8  NGOC        N22
  9  TAMARI      T5623
 10  TAMARI      T562

From: Lou on 11 Mar 2010 12:46

From the description of the function in the documentation, "TAMARI" should encode as T56. If you're getting anything else, it would appear that you have a problem. But whether it's a problem with your installation or your code, we can't tell. It might be helpful if you posted an example of your code.
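
As a worked check of the T56 claim, assuming the letter classes given in the SAS documentation for SOUNDEX (B F P V = 1; C G J K Q S X Z = 2; D T = 3; L = 4; M N = 5; R = 6, with A E H I O U W Y dropped after the first letter): T is kept, A and I are dropped, M maps to 5, and R maps to 6. A minimal sketch to confirm this on your own installation (the variable name `code` is only illustrative):

```sas
/* Sketch: encode a literal name and write the result to the log. */
data _null_;
   length code $16;
   code = soundex('TAMARI');   /* per the documentation this should be T56 */
   put code=;
run;
```

If the log shows anything other than T56 for this literal, the installation is suspect; if it shows T56, the input values are the more likely cause.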

From: Nancy on 11 Mar 2010 13:58

I used

IDF=soundex(first_name)

to get the soundex ID for the first name.

Is there anything wrong?

Thank you!

From: data _null_; on 11 Mar 2010 14:16

I ran the data you posted and got the same soundex values for each pair, so I don't think the problem is SOUNDEX. But what could it be? Different soundex values imply that some of the stored values were longer, even though the names look equal to you. I can think of one way that could happen; I'm sure others can think of other ways. Perhaps the names are displayed with a format that does not show the entire value, as in this example.

data test;
input First_name $16. IDF $;
/* s encodes the whole field, including any middle initial */
s = soundex(first_name);
/* s2 encodes only the first word */
s2 = soundex(scan(first_name,1,' '));
/* $6. hides everything past the sixth character when printed */
format First_name $6.;
/* Name has no format, so it prints the full stored value */
Name = First_name;
cards;
TAMARI J        T5623
TAMARI          T56
DEVIN S         D151
DEVIN           D15
JULIO H         J42
JULIO           J4
NGOC P          N221
NGOC            N22
TAMARI          T5623
TAMARI          T562
;;;;
run;
proc print;
run;
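
One way to test for the situation described above is to look at what is actually stored rather than what the format displays. The following is only a sketch: the data set `names`, the data set `check`, and the variables `n_chars` and `n_words` are hypothetical names chosen for illustration, and the two sample rows stand in for real data.

```sas
/* Hypothetical sample: the stored value is longer than what $6. displays. */
data names;
   input first_name $16.;
   format first_name $6.;
   datalines;
TAMARI J
TAMARI
;
run;

/* Measure the stored value instead of trusting the printed one. */
data check;
   set names;
   n_chars = lengthn(first_name);       /* length without trailing blanks */
   n_words = countw(first_name, ' ');   /* number of blank-delimited words */
run;

/* Override the short format so the full value is visible, in quotes. */
proc print data=check;
   format first_name $quote20.;
run;
```

Rows where `n_words` is greater than 1 hold more than a first name, even if they print identically under the `$6.` format.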

From: Nancy on 11 Mar 2010 14:42

Yes, that is the reason. I checked the code again and found that, for some names, I had used the SOUNDEX function before I separated the first and middle names.

Thank you very much!

Xiaohong
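
The fix described in this reply, splitting off the first name before encoding, can be sketched as follows. This is an illustration under assumptions, not the poster's actual code: the data set `demo` and the variables `full_name`, `first_only`, `idf_full`, and `idf_first` are hypothetical, and the sample rows are made up to mirror the thread.

```sas
/* Hypothetical sample; FULL_NAME stands in for the unsplit name field. */
data demo;
   length full_name first_only $16 idf_full idf_first $16;
   input full_name $16.;
   first_only = scan(full_name, 1, ' ');  /* keep the first word only */
   idf_full   = soundex(full_name);       /* encoding before the split */
   idf_first  = soundex(first_only);      /* encoding after the split */
   datalines;
TAMARI J
TAMARI
DEVIN S
DEVIN
;
run;

/* List any first name that still maps to more than one code. */
proc sql;
   select first_only, count(distinct idf_first) as n_codes
   from demo
   group by first_only
   having count(distinct idf_first) > 1;
quit;
```

An empty result from the query suggests that every first name now maps to a single code; comparing `idf_full` with `idf_first` in the `demo` data set shows which rows were affected by the extra characters.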