Subject: assign a unique random integer to each unique id
From: Joe Matise on 15 Jan 2010 11:24

I believe that will generally work, but whether it works 100% of the time depends on the size of your sample. You won't get a 100% guarantee that the values will be unique: the period of the ranuni function is much larger than 1e7, so once its output is reduced to only 1e7 possible values it will periodically produce duplicates. In my experience, 1e7 should be enough for a sample size on the order of 1e4.

A safer way would be to generate a random number (not an integer, just the ranuni() value itself), sort by it, and then use _N_ as the integer. That ensures a unique integer.

-Joe

On Fri, Jan 15, 2010 at 10:09 AM, Ai Hua Wang <aihuawang(a)yahoo.com> wrote:
> Hi,
>
> I was wondering if anybody on this list could advise how I can assign a
> unique random integer to each unique id. I have written the following code,
> but it does not give me unique random integers. The 10000000 is
> proportional to the size of the data set. Am I missing anything in the code?
> Please kindly provide your suggestions.
>
> Thanks,
> Aihua
>
> data temp;
>   set datset1;
>   urand=ceil(ranuni(1)*10000000);
> run;
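
Joe's sort-based suggestion could be sketched roughly as follows. This is a minimal sketch, not code posted in the thread: the data set name datset1 comes from the original question, while keyed, want, and sortkey are hypothetical names, and it assumes datset1 holds one row per id.

```sas
/* Step 1: attach a uniform random sort key to every row */
data keyed;
  set datset1;
  sortkey = ranuni(1);   /* seed 1, as in the original question */
run;

/* Step 2: put the rows in random order */
proc sort data=keyed;
  by sortkey;
run;

/* Step 3: the row counter _N_ is now a unique integer from 1 to n */
data want;
  set keyed;
  urand = _n_;
  drop sortkey;
run;
```

Ties in sortkey would not matter here, because _N_ is unique regardless of how tied rows are ordered. The output is left in random order, so it may need to be sorted back by the original id afterwards.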
From: Muthia Kachirayan on 15 Jan 2010 12:16

Aihua,

Using ranuni() this way produces non-unique numbers. It is possible to check whether the current random number has already been selected and, if it has, to ignore it and draw a new one. An array helps with this, as shown below.

data _null_;
  if 0 then set sashelp.class nobs = n;   /* read only the row count */
  call symputx('n', n);
  stop;
run;

data have;
  array k[&n] _temporary_;                /* k[i] = 1 once integer i is used */
  do _n_ = 1 to &n;
    do while(1);
      urand = ceil(ranuni(123) * &n);
      if k[urand] = . then do;            /* not used yet: keep it */
        k[urand] = 1;
        output;
        leave;
      end;
    end;
  end;
run;

data class;
  set have;
  set sashelp.class;                      /* pair the rows one-to-one */
run;

proc print data = class;
run;

You may easily combine the 2 data steps into one data step.

With regards,
Muthia Kachirayan
From: NordlDJ on 15 Jan 2010 20:11

Aihua,

Well, without knowing what you are going to use those "random" numbers for, it is hard to give good advice. Why does your multiplier need to be proportional to dataset size? Why do you want random integers assigned to your data? And why do they need to be unique? If you tell us more about what your actual needs are, we might be able to provide better help.

That being said, the following code will assign unique integers to your data, as long as you have fewer than 2**31 - 1 records.

data want;
  if _n_=1 then do;
    **----urand will be your random integer----**;
    urand=0;
    call ranuni(urand,dummy);  **get a starting seed;
    put "original seed = " urand;  **"save" starting seed to log;
    retain urand;
  end;
  set datset1;
  call ranuni(urand,dummy);
  drop dummy;
run;

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
From: Patrick on 16 Jan 2010 00:30

Aihua,

I believe the code below does what you want. It's kind of a hash version of what Muthia does with an array. It creates a hash table that holds all the unique random integers already used in earlier iterations of the data step. The code then loops until "urand" contains a value that is not yet in the table. At the end of each data step iteration this value is added to the hash table, so it won't be used again in later iterations.

HTH
Patrick

data have;
  do var=1 to 20;
    output;
  end;
run;

data want (drop=rc multiplier);
  set have;
  retain multiplier 10;
  /* create the hash table once, keyed on urand */
  if _n_ =1 then do;
    declare hash UniqueRandomInteger(hashexp: 4);
    rc = UniqueRandomInteger.defineKey('urand');
    rc = UniqueRandomInteger.defineData('urand');
    rc = UniqueRandomInteger.defineDone();
  end;
  /* draw until urand is not found in the table (check() returns nonzero) */
  do until (UniqueRandomInteger.check() ne 0);
    urand=ceil(ranuni(1)*multiplier);
    ind=sum(ind,1);   /* attempts for the current row; ind is not retained */
    if ind>100 then do;
      put "Too small set of possible unique random numbers. Multiplier will be increased";
      put "Multiplier before: " multiplier;
      multiplier=multiplier*10;
      put "Multiplier increased: " multiplier;
      ind=0;
    end;
  end;
  rc = UniqueRandomInteger.add();
run;

proc print data=want;
run;
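
Whichever approach is used, the result can be checked for duplicates afterwards. A minimal sketch, assuming the output data set is called want and the integer variable is called urand, as in Dan's and Patrick's code (n_rows and n_unique are hypothetical column names chosen for illustration):

```sas
/* Compare the number of rows with the number of distinct urand values */
proc sql;
  select count(*)              as n_rows,
         count(distinct urand) as n_unique
  from want;
quit;
```

If the two counts are equal, every row received a different integer.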
From: Ai Hua Wang on 16 Jan 2010 16:30
Hi Dan:

Thank you very much for your thoughtful follow-up. Please see my answers below.

Why does your multiplier need to be proportional to dataset size?
That is just what I concluded after experimenting. When I used a smaller multiplier I got many more duplicates; when I increased it I got fewer. Eventually I found that it should be at least proportional to the size of the data set.

Why do you want random integers assigned to your data?
I need to use the assigned random integers as the unique id that allows the data users to identify each unique record. I thought it was better to use integers than decimal numbers.

And why do they need to be unique?
See the description above, plus: the integer is used as a replacement for the sensitive information (the unique id) because of privacy and confidentiality concerns.

I hope this helps you provide more insightful answers.

Best Regards,
Aihua
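
The link between the multiplier and the number of duplicates described above is consistent with the standard birthday-problem approximation. As a rough sketch, assuming the n values are drawn independently and uniformly from M possible integers (n and M are symbols introduced here, with M playing the role of the multiplier):

```latex
P(\text{no duplicates})
  = \prod_{k=0}^{n-1}\left(1 - \frac{k}{M}\right)
  \approx \exp\!\left(-\frac{n(n-1)}{2M}\right)
```

For example, with n = 10^4 records and M = 10^7 the exponent is about -5, so the chance of getting no duplicates at all is under 1%. Under this approximation, M has to grow roughly with n^2 rather than n before duplicates become unlikely, which is why the approaches in this thread that guarantee uniqueness by construction are safer than enlarging the multiplier.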