From: Al on 8 Apr 2010 16:41 Hi All: I have two adverse events data sets Ae_old and Ae_New .. I am looking to compare Aeterm (Term entered in the CRF) to see if there were any additions/deletions or changes in the AEterm in Ae_new from Ae_old dataset and create a flag in the Ae_new to know which data point was added (flag = 1) ,which data point was deleted (flag = 2) and which data point is changed(flag = 3) I am using this proc compare . proc compare base = ae_old compare = ae_new OUTNOEQUAL OUTCOMP OUTBASE /*OUTDIF*/ NOPRINT out = Ae_diff; id subject ; var aeterm ; run; but I am having difficulty to create a flags .Any suggestions Thanks in advance
From: xlr82sas on 8 Apr 2010 18:06 On Apr 8, 1:41 pm, Al <ali6...(a)gmail.com> wrote: > Hi All: > > I have two adverse events data sets Ae_old and Ae_New .. I am looking > to compare Aeterm (Term entered in the CRF) to see if there were any > additions/deletions or changes in the AEterm in Ae_new from Ae_old > dataset and create a flag in the Ae_new to know which data point was > added (flag = 1) ,which data point was deleted (flag = 2) and which > data point is changed(flag = 3) > > I am using this proc compare . > > proc compare base = ae_old compare = ae_new OUTNOEQUAL OUTCOMP > OUTBASE /*OUTDIF*/ NOPRINT out = Ae_diff; > id subject ; > var aeterm ; > run; > > but I am having difficulty to create a flags .Any suggestions > > Thanks in advance Hi, If you have a primary key in both datasets and keep only the key and aeterm the following code will identify inserts, deletes and updates For a cleaner version You might want to take a look at /* T003300 GIVEN TWO DATASETS CREATE AND APPLY THE TRANSACTION DATASET THAT WILL MAKE THE TABLES THE SAME http://homepage.mac.com/magdelina/.Public/utl.html utl_tipweb.txt OLD DATA | SSN JOB ACTIVE | | | | 001001110 carpenter1 YES | | 002001110 plumber1 YES | | 003001110 mason1 YES | | 004001110 plumber1 YES | | 005001110 electrician1 YES | | 006001110 mason3 YES | | 008001110 mason4 NO | | | | New table ( New data not in RDBMS ) | | (ie NUSSNJOB) | | | | SSN JOB ACTIVE | | | | 001001110 carpenter1 YES | | 002001110 plumber1 YES | | 003001110 painter1 YES | | 004001110 plumber1 YES | | 005001110 electrician1 YES | | 007001110 painter2 YES | Here is what the code gives you | Transaction data set to make onld data look like new data You will just have a primary key(SSN) and AETERM Type will tell you if a AETERM was added, deleted or updated. | | | | SSN JOB ACTIVE __TYPE | | | | 006001110 mason3 YES DELETE | | 008001110 mason4 NO DELETE | | 007001110 painter2 YES INSERT | | 003001110 painter1 YES UPDATE ======================================================================= *----------------------------------------------------------*\ | Make oldrep look exactly like newrep by applying | | the delta dateset using utldmod | | utl_update is the same as macro above | | Transaction dataset must contain __type variable | | with the value UPDATE, INSERT or DELETE | \*----------------------------------------------------------*/ %macro utl_update ( master= ,transaction= ,key= ) / des="Update Insert Delete in Sybase Table"; /*-------------------------------------------------*\ | This object applyes a transaction dataset to | | a master RDBMS table or SAS dataset. The result | | is an updated master ( RDBMS data ) | \*-------------------------------------------------*/ /*----------------------------------------------*\ | IPO | | Inputs | | ====== | | transaction - Transaction table must have | | variable type which can | | have 3 values (SAS dataset) | | DELETE -- Delete Row in Master | | INSERT -- Insert Row in Master | | UPDATE -- Update Row in Master | | key - unique index | | | | Process - Modify SAS/RDBMS Table | | ====== based on transaction file | | | | Output - master Modified | | ====== | | Transaction table can be in work library | \*----------------------------------------------*/ data &master; modify &master &transaction; by &key; select (_iorc_); when (%sysrc(_dsenmr)) /* nomatch in MASTER _Add */ do; if __type="INSERT" then output &master; *put "Nomatch" __type=; _error_=0; end; when (%sysrc(_sok)) /* matched Update or Delete */ do; *put "Matched " __type=; if __type="DELETE" then remove &master; else if __type="UPDATE" then replace &master; _error_=0; end; otherwise do; put "Unexpected ERROR Condition: _IORC_ =" _IORC_; _error_=0; end; end; run; %mend utl_update; %macro utl_delta ( uinmem1 =work.oldrep, /* Last Months Data */ uinmem2 =work.newrep, /* Current Month Data */ uinkey =rep_socs, /* primary unique key both tables */ uotmem1 =repdelta, /* delta tble for RDBMS update */ uotmem2 = repsame /* records that do not change */ ) / des = "Build delta SAS table for RDBMS update"; /*----------------------------------------------*\ | WIN95 SAS611 -- UNIX SAS611 SOLARIS 2.5 | | | | | | Create a delta ( transaction ) dataset | | based on a comparison of old and new data. | | | | Use this new delta dataset to bring older | | RDBMS table up to date. ( invoke utldmod ) | | | | Proc compare new feed against updated RDBMS | | Result should be an exact comparison | | | | You are probably asking why not just drop | | the RDBMS table and load the new table. | | | | 1. As a rule we keep SAS image tables | | of most read only RDBMS tables. | | And it is very easy to do detailed analysis| | such as this before any major sales | | representative alignment. | | | | 2. The delta dataset is very useful for QC, | | before updating RDBMS tables. | | | | 3. Table may reside in as many as three | | different databases on different platforms.| | (Oracle, Sybase, Watcom, MS-Access) | | This plays to SASes strength | | | | This code represents a slightly new | | methodology and as such has not been | | completely tested. CompuCraft would | | greatly appreciate any feedback. | \*----------------------------------------------*/ /*%^&*------------------------------------------*\ | Description: | | Create a delta ( transaction ) dataset | | based on new data. | | | | This code creates the delta ( transaction ) | | dataset. | | | | IPO | | INPUTS | | ====== | | Old table ( previous month - in RDBMS ) | | (ie OLSSNJOB) | | | | SSN JOB ACTIVE | | | | 001001110 carpenter1 YES | | 002001110 plumber1 YES | | 003001110 mason1 YES | | 004001110 plumber1 YES | | 005001110 electrician1 YES | | 006001110 mason3 YES | | 008001110 mason4 NO | | | | New table ( New data not in RDBMS ) | | (ie NUSSNJOB) | | | | SSN JOB ACTIVE | | | | 001001110 carpenter1 YES | | 002001110 plumber1 YES | | 003001110 painter1 YES | | 004001110 plumber1 YES | | 005001110 electrician1 YES | | 007001110 painter2 YES | | | | PROCESS | | ======= | | Extract the old data from the RDBMS. | | | | Union old and new data. Put an indicator | | variable on union. This variable | | identifies the origin of record ( old/new). | | | | Sort the union table on all fields. | | | | If record appears only in the old table | | then deactivate the record. ( DELETE ) | | | | If a record appears only in the new data | | ( at least one field makes record diff) | | and the key appears in both new and old | | data then perform an update ( UPDATE ) | | | | If a record appears only in the new data | | and the key does not appear in old table | | then perform an insert ( INSERT ) | | | | If the record is exactly the same in | | both tables then leave it alone | | | | OUTPUT | | ====== | | Transaction data set | | | | SSN JOB ACTIVE __TYPE | | | | 006001110 mason3 YES DELETE | | 008001110 mason4 NO DELETE | | 007001110 painter2 YES INSERT | | 003001110 painter1 YES UPDATE | | | \*%^&*------------------------------------------*/ /* for testing without macro %let uinmem1 =oldrep; %let uinmem2 =newrep; %let uinkey =rep_socs; %let uotmem1 =repdelta; %let uotmem2= repsame; */ %put %sysfunc(ifc(%sysevalf(%superq(uinmem1)=,boolean), **** Please Provide Previous dataset ,)); %put %sysfunc(ifc(%sysevalf(%superq(uinmem2)=,boolean), **** Please Provide Current dataset ,)); %put %sysfunc(ifc(%sysevalf(%superq(uinkey)=,boolean),**** Please Provide primary unique key both tables ,)); %put %sysfunc(ifc(%sysevalf(%superq(uotmem1)=,boolean),**** Please Provide transaction records dataset ,)); %put %sysfunc(ifc(%sysevalf(%superq(uotmem2)=,boolean), **** Please Provide records that do not change dataset,)); %if %eval( %sysfunc(ifc(%sysevalf(%superq(uinmem1)=,boolean),1,0)) + %sysfunc(ifc(%sysevalf(%superq(uinmem2)=,boolean),1,0)) + %sysfunc(ifc(%sysevalf(%superq(uinkey)=,boolean),1,0)) + %sysfunc(ifc(%sysevalf(%superq(uotmem1)=,boolean),1,0)) + %sysfunc(ifc(%sysevalf(%superq(uotmem2)=,boolean),1,0)) ) eq 0 %then %do; data utldlta1 ( label = "Union of current & previous month" ) / view=utldlta1; retain &uinkey; set &uinmem1 ( in = prevmnth ) &uinmem2 ( in = currmnth ); if prevmnth then __split = "OLD"; else __split = "NEW"; run; proc sql; select name into : ucols separated by ' ' from dictionary.columns where libname = %upcase('work') and memname = %upcase ( 'utldlta1' ) and name not eq %upcase( '__split' ); quit; %put ucols = &ucols; proc sort data = utldlta1 out = utldlta2; by &ucols __split; run; %let uwrds = %sysfunc(countw(&ucols)); %let ulastcol = %scan (&ucols, &uwrds ); data utldlta3 ( label = "Transactions" ) &uotmem2 ( label = "No change" ); set utldlta2; by &ucols; retain __split __type; select; /* Only in previous month */ when ( ( first.&ulastcol eq last.&ulastcol ) and __split = "OLD") do; if ( first.&uinkey eq last.&uinkey ) then do; /* unique record */ __type="DELETE"; /* key unique */ output utldlta3; /* OLD data only */ end; else do; /* unique record */ __type = "UPDATE"; /* key same */ output &uotmem2; /* other field changed */ end; end; /* Only in new month */ when ( first.&ulastcol eq last.&ulastcol ) do; /* unique record */ if ( first.&uinkey eq last.&uinkey ) then do; /* new key */ __type = "INSERT"; output utldlta3; end; else do; /* unique record */ __type = "UPDATE"; /* same key */ output utldlta3; /* new data only */ end; end; /* Duplicate records same in both months */ when ( first.&ulastcol ne last.&ulastcol ) do; /* duplicate record */ output &uotmem2; end; otherwise put "ERROR ===========> NEVER SAY NEVER <=============== ERROR"; end; run; proc sort data=utldlta3 out=&uotmem1 ( label = "Transaction dataset" index = ( &uinkey / unique ) ); by __type; run; %macro utl_nlobs(dsn); %let dsid=%sysfunc(open(&dsn));%sysfunc(attrn(&dsid,nlobs)) %let rc=%sysfunc(close(&dsid)); %mend utl_lnobs; proc print data=&uinmem1(obs=30) width=minimum noobs; title "Old RDBMS Data obs=%utl_nlobs(&uinmem1)"; run; proc print data=&uinmem2(obs=30) width=minimum noobs; title "New Feed Data obs=%utl_nlobs(&uinmem2)"; run; proc print data=&uotmem1(obs=30) width=minimum noobs; title "Transaction data set obs=%utl_nlobs(&uotmem1)"; run; proc print data=&uotmem2(obs=30 drop=__type) width=minimum noobs; title "No change data set obs=%utl_nlobs(&uotmem2)"; run; /*----------------------------------------------------------*\ | Make oldrep look exactly like newrep by applying | | the delta dateset using utldmod | \*----------------------------------------------------------*/ /* data oldrep; input rep_socs : $9. JOB : $16. ACTIVE : $3.; cards; 001001110 carpenter1 YES 002001110 plumber1 YES 003001110 mason1 YES 004001110 plumber1 YES 005001110 electrician1 YES 006001110 mason3 YES 008001110 mason4 NO ; run; data newrep; input rep_socs : $9. JOB : $16. ACTIVE : $3.; cards; 001001110 carpenter1 YES 002001110 plumber1 YES 003001110 painter1 YES 004001110 plumber1 YES 005001110 electrician1 YES 007001110 painter2 YES ; run; %utl_delta ( uinmem1 =oldrep, uinmem2 =newrep, uinkey =rep_socs, uotmem1 =repdelta, uotmem2= repsame ); %utl_delta; proc sort data=dat.oldrep out=ol;by rep_socs;run; proc sort data=dat.newrep out=nu;by rep_socs;run; title "Indepth comparison Updated RDMS table with New Feed"; proc compare data=ol compare=nu; run; */ %end; /* end macro argument checks */ %mend utl_delta; data oldrep; /* rep_socs is primary ket */ input rep_socs : $9. JOB : $16. ACTIVE : $3.; cards; 001001110 carpenter1 YES 002001110 plumber1 YES 003001110 mason1 YES 004001110 plumber1 YES 005001110 electrician1 YES 006001110 mason3 YES 008001110 mason4 NO ; run; data newrep; input rep_socs : $9. JOB : $16. ACTIVE : $3.; cards; 001001110 carpenter1 YES 002001110 plumber1 YES 003001110 painter1 YES 004001110 plumber1 YES 005001110 electrician1 YES 007001110 painter2 YES ; run; %utl_delta ( uinmem1 =oldrep, /* Last Months Data */ uinmem2 =newrep, /* Current Month Data */ uinkey =rep_socs, /* primary unique key both tables */ uotmem1 =repdelta, /* delta tble for RDBMS update */ uotmem2= repsame ); %utl_update ( master=oldrep ,transaction=repdelta ,key=rep_socs ); proc sort data=oldrep out=ol;by rep_socs;run; /* oldrep has been updated with transactions and now is equal to newrep */ proc sort data=newrep out=nu;by rep_socs;run; title "Indepth comparison Updated RDMS table with New Feed"; proc compare data=ol compare=nu; run;
From: Reeza on 8 Apr 2010 19:21 On Apr 8, 1:41 pm, Al <ali6...(a)gmail.com> wrote: > Hi All: > > I have two adverse events data sets Ae_old and Ae_New .. I am looking > to compare Aeterm (Term entered in the CRF) to see if there were any > additions/deletions or changes in the AEterm in Ae_new from Ae_old > dataset and create a flag in the Ae_new to know which data point was > added (flag = 1) ,which data point was deleted (flag = 2) and which > data point is changed(flag = 3) > > I am using this proc compare . > > proc compare base = ae_old compare = ae_new OUTNOEQUAL OUTCOMP > OUTBASE /*OUTDIF*/ NOPRINT out = Ae_diff; > id subject ; > var aeterm ; > run; > > but I am having difficulty to create a flags .Any suggestions > > Thanks in advance Use a merge instead? UNTESTED data new; merge ae_old (in=a) ae_new (in=b rename aeterm=aterm2); by id; if a and not b then flag=1; if b and not a then flag=2; if a and b and aeterm ne aeterm2 then flag=3; *assumes you renamed variable in new dataset (original =var1 new =var2); if flag=. then flag=4; *check if missing any; run; HTH, Reese
From: xlr82sas on 11 Apr 2010 14:40 On Apr 8, 4:21 pm, Reeza <fkhurs...(a)hotmail.com> wrote: > On Apr 8, 1:41 pm, Al <ali6...(a)gmail.com> wrote: > > > > > Hi All: > > > I have two adverse events data sets Ae_old and Ae_New .. I am looking > > to compare Aeterm (Term entered in the CRF) to see if there were any > > additions/deletions or changes in the AEterm in Ae_new from Ae_old > > dataset and create a flag in the Ae_new to know which data point was > > added (flag = 1) ,which data point was deleted (flag = 2) and which > > data point is changed(flag = 3) > > > I am using this proc compare . > > > proc compare base = ae_old compare = ae_new OUTNOEQUAL OUTCOMP > > OUTBASE /*OUTDIF*/ NOPRINT out = Ae_diff; > > id subject ; > > var aeterm ; > > run; > > > but I am having difficulty to create a flags .Any suggestions > > > Thanks in advance > > Use a merge instead? > > UNTESTED > > data new; > merge ae_old (in=a) ae_new (in=b rename aeterm=aterm2); > by id; > if a and not b then flag=1; > if b and not a then flag=2; > if a and b and aeterm ne aeterm2 then flag=3; *assumes you renamed > variable in new dataset (original =var1 new =var2); > if flag=. then flag=4; *check if missing any; > run; > > HTH, > Reese I little rework of Reese excellent post. I added it to my tips. thanks Reese. /* T005780 CHANGES IN LAST MONTHS CLASS DATASET - MARKING UPDATES, INSERTS AND DELETES IN STUDENT WEIGHTS - SAS-L REESE POST */ http://homepage.mac.com/magdelina/.Public/utl.html utl_tipweb.txt /* insert, update and delete for one variable from SAS-L Reese */ data jan2010; keep name weight; set sashelp.class; run; data feb2010; keep name weight; set sashelp.class end=dne; if name='John' then weight=100; if name='Jane' then delete; output; if dne then do; name='Zero'; weight=110; output; end; run; data dif; merge jan2010 (in=jan rename=weight=jan_weight) feb2010 (in=feb rename=weight=feb_weight); by name; select; when ( jan and not feb ) flg='DELETE'; when ( feb and not jan ) flg='INSERT'; when ( feb and jan and feb_weight ne jan_weight ) flg='UPDATE'; when ( jan and feb and feb_weight eq jan_weight ) flg=''; /* LEAVE OFF OTHERWISE TO FORCE AN HARD ERROR */ end; run; /* JAN_ FEB_ NAME WEIGHT WEIGHT FLG Alfred 112.5 112.5 Alice 84.0 84.0 Barbara 98.0 98.0 Carol 102.5 102.5 Henry 102.5 102.5 James 83.0 83.0 Jane 84.5 . DELETE Janet 112.5 112.5 Jeffrey 84.0 84.0 John 99.5 100.0 UPDATE Joyce 50.5 50.5 Judy 90.0 90.0 Louise 77.0 77.0 Mary 112.0 112.0 Philip 150.0 150.0 Robert 128.0 128.0 Ronald 133.0 133.0 Thomas 85.0 85.0 William 112.0 112.0 Zero . 110.0 INSERT */
|
Pages: 1 Prev: Time-Dependent Variables in Multivariable Analysis Next: calculating correlation |