From: Al on
Hi All:


I have two adverse events data sets Ae_old and Ae_New .. I am looking
to compare Aeterm (Term entered in the CRF) to see if there were any
additions/deletions or changes in the AEterm in Ae_new from Ae_old
dataset and create a flag in the Ae_new to know which data point was
added (flag = 1) ,which data point was deleted (flag = 2) and which
data point is changed(flag = 3)

I am using this proc compare .


proc compare base = ae_old compare = ae_new OUTNOEQUAL OUTCOMP
OUTBASE /*OUTDIF*/ NOPRINT out = Ae_diff;
id subject ;
var aeterm ;
run;

but I am having difficulty to create a flags .Any suggestions

Thanks in advance

From: xlr82sas on
On Apr 8, 1:41 pm, Al <ali6...(a)gmail.com> wrote:
> Hi All:
>
> I have two adverse events data sets Ae_old and Ae_New .. I  am looking
> to compare Aeterm (Term entered in the CRF) to see if there were any
> additions/deletions or changes in the AEterm in Ae_new from Ae_old
> dataset and create a flag in the Ae_new  to know which data point was
> added (flag = 1) ,which data point was deleted (flag = 2) and which
> data point is changed(flag = 3)
>
> I am using this proc compare .
>
> proc compare base = ae_old compare = ae_new OUTNOEQUAL  OUTCOMP
> OUTBASE /*OUTDIF*/ NOPRINT out = Ae_diff;
>    id subject  ;
>    var aeterm ;
> run;
>
> but I am having difficulty to create a flags .Any suggestions
>
> Thanks in advance

Hi,

If you have a primary key in both datasets and keep only the key and
aeterm the following code will identify inserts, deletes and updates

For a cleaner version
You might want to take a look at

/* T003300 GIVEN TWO DATASETS CREATE AND APPLY THE TRANSACTION DATASET
THAT WILL MAKE THE TABLES THE SAME

http://homepage.mac.com/magdelina/.Public/utl.html
utl_tipweb.txt

OLD DATA
| SSN JOB ACTIVE |
| |
| 001001110 carpenter1 YES |
| 002001110 plumber1 YES |
| 003001110 mason1 YES |
| 004001110 plumber1 YES |
| 005001110 electrician1 YES |
| 006001110 mason3 YES |
| 008001110 mason4 NO |
| |
| New table ( New data not in RDBMS ) |
| (ie NUSSNJOB) |
| |
| SSN JOB ACTIVE |
| |
| 001001110 carpenter1 YES |
| 002001110 plumber1 YES |
| 003001110 painter1 YES |
| 004001110 plumber1 YES |
| 005001110 electrician1 YES |
| 007001110 painter2 YES |

Here is what the code gives you


| Transaction data set to make onld data look like new data

You will just have a primary key(SSN) and AETERM
Type will tell you if a AETERM was added, deleted or updated.


|
| |
| SSN JOB ACTIVE __TYPE |
| |
| 006001110 mason3 YES DELETE |
| 008001110 mason4 NO DELETE |
| 007001110 painter2 YES INSERT |
| 003001110 painter1 YES UPDATE








=======================================================================


*----------------------------------------------------------*\
| Make oldrep look exactly like newrep by applying |
| the delta dateset using utldmod |
| utl_update is the same as macro above |
| Transaction dataset must contain __type variable |
| with the value UPDATE, INSERT or DELETE |
\*----------------------------------------------------------*/
%macro utl_update
(
master=
,transaction=
,key=
)
/ des="Update Insert Delete in Sybase Table";

/*-------------------------------------------------*\
| This object applyes a transaction dataset to |
| a master RDBMS table or SAS dataset. The result |
| is an updated master ( RDBMS data ) |
\*-------------------------------------------------*/

/*----------------------------------------------*\
| IPO |
| Inputs |
| ====== |
| transaction - Transaction table must have |
| variable type which can |
| have 3 values (SAS dataset) |
| DELETE -- Delete Row in Master |
| INSERT -- Insert Row in Master |
| UPDATE -- Update Row in Master |
| key - unique index |
| |
| Process - Modify SAS/RDBMS Table |
| ====== based on transaction file |
| |
| Output - master Modified |
| ====== |
| Transaction table can be in work library |
\*----------------------------------------------*/

data &master;
modify &master &transaction;
by &key;
select (_iorc_);
when (%sysrc(_dsenmr)) /* nomatch in MASTER _Add */
do;
if __type="INSERT" then output &master;
*put "Nomatch" __type=;
_error_=0;
end;
when (%sysrc(_sok)) /* matched Update or Delete */
do;
*put "Matched " __type=;
if __type="DELETE" then remove &master;
else if __type="UPDATE" then replace &master;
_error_=0;
end;
otherwise
do;
put "Unexpected ERROR Condition: _IORC_ =" _IORC_;
_error_=0;
end;
end;
run;
%mend utl_update;

%macro utl_delta
(
uinmem1 =work.oldrep, /* Last Months Data */
uinmem2 =work.newrep, /* Current Month Data */
uinkey =rep_socs, /* primary unique key both tables */
uotmem1 =repdelta, /* delta tble for RDBMS update */
uotmem2 = repsame /* records that do not change */
)
/ des = "Build delta SAS table for RDBMS update";
/*----------------------------------------------*\
| WIN95 SAS611 -- UNIX SAS611 SOLARIS 2.5 |
| |
| |
| Create a delta ( transaction ) dataset |
| based on a comparison of old and new data. |
| |
| Use this new delta dataset to bring older |
| RDBMS table up to date. ( invoke utldmod ) |
| |
| Proc compare new feed against updated RDBMS |
| Result should be an exact comparison |
| |
| You are probably asking why not just drop |
| the RDBMS table and load the new table. |
| |
| 1. As a rule we keep SAS image tables |
| of most read only RDBMS tables. |
| And it is very easy to do detailed analysis|
| such as this before any major sales |
| representative alignment. |
| |
| 2. The delta dataset is very useful for QC, |
| before updating RDBMS tables. |
| |
| 3. Table may reside in as many as three |
| different databases on different platforms.|
| (Oracle, Sybase, Watcom, MS-Access) |
| This plays to SASes strength |
| |
| This code represents a slightly new |
| methodology and as such has not been |
| completely tested. CompuCraft would |
| greatly appreciate any feedback. |
\*----------------------------------------------*/

/*%^&*------------------------------------------*\
| Description: |
| Create a delta ( transaction ) dataset |
| based on new data. |
| |
| This code creates the delta ( transaction ) |
| dataset. |
| |
| IPO |
| INPUTS |
| ====== |
| Old table ( previous month - in RDBMS ) |
| (ie OLSSNJOB) |
| |
| SSN JOB ACTIVE |
| |
| 001001110 carpenter1 YES |
| 002001110 plumber1 YES |
| 003001110 mason1 YES |
| 004001110 plumber1 YES |
| 005001110 electrician1 YES |
| 006001110 mason3 YES |
| 008001110 mason4 NO |
| |
| New table ( New data not in RDBMS ) |
| (ie NUSSNJOB) |
| |
| SSN JOB ACTIVE |
| |
| 001001110 carpenter1 YES |
| 002001110 plumber1 YES |
| 003001110 painter1 YES |
| 004001110 plumber1 YES |
| 005001110 electrician1 YES |
| 007001110 painter2 YES |
| |
| PROCESS |
| ======= |
| Extract the old data from the RDBMS. |
| |
| Union old and new data. Put an indicator |
| variable on union. This variable |
| identifies the origin of record ( old/new). |
| |
| Sort the union table on all fields. |
| |
| If record appears only in the old table |
| then deactivate the record. ( DELETE ) |
| |
| If a record appears only in the new data |
| ( at least one field makes record diff) |
| and the key appears in both new and old |
| data then perform an update ( UPDATE ) |
| |
| If a record appears only in the new data |
| and the key does not appear in old table |
| then perform an insert ( INSERT ) |
| |
| If the record is exactly the same in |
| both tables then leave it alone |
| |
| OUTPUT |
| ====== |
| Transaction data set |
| |
| SSN JOB ACTIVE __TYPE |
| |
| 006001110 mason3 YES DELETE |
| 008001110 mason4 NO DELETE |
| 007001110 painter2 YES INSERT |
| 003001110 painter1 YES UPDATE |
| |
\*%^&*------------------------------------------*/

/* for testing without macro
%let uinmem1 =oldrep;
%let uinmem2 =newrep;
%let uinkey =rep_socs;
%let uotmem1 =repdelta;
%let uotmem2= repsame;
*/

%put %sysfunc(ifc(%sysevalf(%superq(uinmem1)=,boolean), **** Please
Provide Previous dataset ,));
%put %sysfunc(ifc(%sysevalf(%superq(uinmem2)=,boolean), **** Please
Provide Current dataset ,));
%put %sysfunc(ifc(%sysevalf(%superq(uinkey)=,boolean),**** Please
Provide primary unique key both tables ,));
%put %sysfunc(ifc(%sysevalf(%superq(uotmem1)=,boolean),**** Please
Provide transaction records dataset ,));
%put %sysfunc(ifc(%sysevalf(%superq(uotmem2)=,boolean), **** Please
Provide records that do not change dataset,));

%if %eval(
%sysfunc(ifc(%sysevalf(%superq(uinmem1)=,boolean),1,0)) +
%sysfunc(ifc(%sysevalf(%superq(uinmem2)=,boolean),1,0)) +
%sysfunc(ifc(%sysevalf(%superq(uinkey)=,boolean),1,0)) +
%sysfunc(ifc(%sysevalf(%superq(uotmem1)=,boolean),1,0)) +
%sysfunc(ifc(%sysevalf(%superq(uotmem2)=,boolean),1,0))
) eq 0 %then %do;

data utldlta1
(
label = "Union of current & previous month"
)
/ view=utldlta1;
retain &uinkey;
set &uinmem1 ( in = prevmnth )
&uinmem2 ( in = currmnth );
if prevmnth then __split = "OLD";
else __split = "NEW";
run;
proc sql;
select
name into : ucols separated by ' '
from
dictionary.columns
where
libname = %upcase('work') and
memname = %upcase ( 'utldlta1' ) and
name not eq %upcase( '__split' );
quit;
%put ucols = &ucols;
proc sort data = utldlta1 out = utldlta2;
by &ucols __split;
run;
%let uwrds = %sysfunc(countw(&ucols));
%let ulastcol = %scan (&ucols, &uwrds );
data utldlta3
(
label = "Transactions"
)
&uotmem2
(
label = "No change"
);
set utldlta2;
by &ucols;
retain __split __type;
select;
/* Only in previous month */
when ( ( first.&ulastcol eq last.&ulastcol ) and __split =
"OLD") do;
if ( first.&uinkey eq last.&uinkey ) then do; /*
unique record */
__type="DELETE"; /* key
unique */
output utldlta3; /* OLD
data only */
end;
else do; /*
unique record */
__type = "UPDATE"; /* key
same */
output &uotmem2; /* other
field changed */
end;
end;
/* Only in new month */
when ( first.&ulastcol eq last.&ulastcol ) do; /*
unique record */
if ( first.&uinkey eq last.&uinkey ) then do; /* new
key */
__type = "INSERT";
output utldlta3;
end;
else do; /*
unique record */
__type = "UPDATE"; /* same
key */
output utldlta3; /* new
data only */
end;
end;

/* Duplicate records same in both months */
when ( first.&ulastcol ne last.&ulastcol ) do; /*
duplicate record */
output &uotmem2;
end;
otherwise put "ERROR ===========> NEVER SAY NEVER
<=============== ERROR";
end;
run;
proc sort data=utldlta3
out=&uotmem1
(
label = "Transaction dataset"
index = ( &uinkey / unique )
);
by __type;
run;
%macro utl_nlobs(dsn);
%let dsid=%sysfunc(open(&dsn));%sysfunc(attrn(&dsid,nlobs))
%let rc=%sysfunc(close(&dsid));
%mend utl_lnobs;
proc print data=&uinmem1(obs=30) width=minimum noobs;
title "Old RDBMS Data obs=%utl_nlobs(&uinmem1)";
run;
proc print data=&uinmem2(obs=30) width=minimum noobs;
title "New Feed Data obs=%utl_nlobs(&uinmem2)";
run;
proc print data=&uotmem1(obs=30) width=minimum noobs;
title "Transaction data set obs=%utl_nlobs(&uotmem1)";
run;
proc print data=&uotmem2(obs=30 drop=__type) width=minimum noobs;
title "No change data set obs=%utl_nlobs(&uotmem2)";
run;
/*----------------------------------------------------------*\
| Make oldrep look exactly like newrep by applying |
| the delta dateset using utldmod |
\*----------------------------------------------------------*/

/*

data oldrep;
input rep_socs : $9. JOB : $16. ACTIVE : $3.;
cards;
001001110 carpenter1 YES
002001110 plumber1 YES
003001110 mason1 YES
004001110 plumber1 YES
005001110 electrician1 YES
006001110 mason3 YES
008001110 mason4 NO
;
run;
data newrep;
input rep_socs : $9. JOB : $16. ACTIVE : $3.;
cards;
001001110 carpenter1 YES
002001110 plumber1 YES
003001110 painter1 YES
004001110 plumber1 YES
005001110 electrician1 YES
007001110 painter2 YES
;
run;

%utl_delta
(
uinmem1 =oldrep,
uinmem2 =newrep,
uinkey =rep_socs,
uotmem1 =repdelta,
uotmem2= repsame
);
%utl_delta;

proc sort data=dat.oldrep out=ol;by rep_socs;run;
proc sort data=dat.newrep out=nu;by rep_socs;run;

title "Indepth comparison Updated RDMS table with New Feed";

proc compare data=ol compare=nu;
run;
*/
%end; /* end macro argument checks */

%mend utl_delta;

data oldrep; /* rep_socs is primary ket */
input rep_socs : $9. JOB : $16. ACTIVE : $3.;
cards;
001001110 carpenter1 YES
002001110 plumber1 YES
003001110 mason1 YES
004001110 plumber1 YES
005001110 electrician1 YES
006001110 mason3 YES
008001110 mason4 NO
;
run;

data newrep;
input rep_socs : $9. JOB : $16. ACTIVE : $3.;
cards;
001001110 carpenter1 YES
002001110 plumber1 YES
003001110 painter1 YES
004001110 plumber1 YES
005001110 electrician1 YES
007001110 painter2 YES
;
run;

%utl_delta
(
uinmem1 =oldrep, /* Last Months Data */
uinmem2 =newrep, /* Current Month Data */
uinkey =rep_socs, /* primary unique key both tables */
uotmem1 =repdelta, /* delta tble for RDBMS update */
uotmem2= repsame
);

%utl_update
(
master=oldrep
,transaction=repdelta
,key=rep_socs
);

proc sort data=oldrep out=ol;by rep_socs;run; /* oldrep has been
updated with transactions and now is equal to newrep */
proc sort data=newrep out=nu;by rep_socs;run;
title "Indepth comparison Updated RDMS table with New Feed";

proc compare data=ol compare=nu;
run;



From: Reeza on
On Apr 8, 1:41 pm, Al <ali6...(a)gmail.com> wrote:
> Hi All:
>
> I have two adverse events data sets Ae_old and Ae_New .. I  am looking
> to compare Aeterm (Term entered in the CRF) to see if there were any
> additions/deletions or changes in the AEterm in Ae_new from Ae_old
> dataset and create a flag in the Ae_new  to know which data point was
> added (flag = 1) ,which data point was deleted (flag = 2) and which
> data point is changed(flag = 3)
>
> I am using this proc compare .
>
> proc compare base = ae_old compare = ae_new OUTNOEQUAL  OUTCOMP
> OUTBASE /*OUTDIF*/ NOPRINT out = Ae_diff;
>    id subject  ;
>    var aeterm ;
> run;
>
> but I am having difficulty to create a flags .Any suggestions
>
> Thanks in advance

Use a merge instead?

UNTESTED

data new;
merge ae_old (in=a) ae_new (in=b rename aeterm=aterm2);
by id;
if a and not b then flag=1;
if b and not a then flag=2;
if a and b and aeterm ne aeterm2 then flag=3; *assumes you renamed
variable in new dataset (original =var1 new =var2);
if flag=. then flag=4; *check if missing any;
run;

HTH,
Reese

From: xlr82sas on
On Apr 8, 4:21 pm, Reeza <fkhurs...(a)hotmail.com> wrote:
> On Apr 8, 1:41 pm, Al <ali6...(a)gmail.com> wrote:
>
>
>
> > Hi All:
>
> > I have two adverse events data sets Ae_old and Ae_New .. I  am looking
> > to compare Aeterm (Term entered in the CRF) to see if there were any
> > additions/deletions or changes in the AEterm in Ae_new from Ae_old
> > dataset and create a flag in the Ae_new  to know which data point was
> > added (flag = 1) ,which data point was deleted (flag = 2) and which
> > data point is changed(flag = 3)
>
> > I am using this proc compare .
>
> > proc compare base = ae_old compare = ae_new OUTNOEQUAL  OUTCOMP
> > OUTBASE /*OUTDIF*/ NOPRINT out = Ae_diff;
> >    id subject  ;
> >    var aeterm ;
> > run;
>
> > but I am having difficulty to create a flags .Any suggestions
>
> > Thanks in advance
>
> Use a merge instead?
>
> UNTESTED
>
> data new;
> merge ae_old (in=a) ae_new (in=b rename aeterm=aterm2);
> by id;
> if a and not b then flag=1;
> if b and not a then flag=2;
> if a and b and aeterm ne aeterm2 then flag=3; *assumes you renamed
> variable in new dataset (original =var1 new =var2);
> if flag=. then flag=4; *check if missing any;
> run;
>
> HTH,
> Reese

I little rework of Reese excellent post. I added it to my tips. thanks
Reese.

/* T005780 CHANGES IN LAST MONTHS CLASS DATASET - MARKING UPDATES,
INSERTS AND DELETES IN STUDENT WEIGHTS - SAS-L REESE POST */
http://homepage.mac.com/magdelina/.Public/utl.html
utl_tipweb.txt

/* insert, update and delete for one variable from SAS-L Reese */
data jan2010;
keep name weight;
set sashelp.class;
run;

data feb2010;
keep name weight;
set sashelp.class end=dne;
if name='John' then weight=100;
if name='Jane' then delete;
output;
if dne then do; name='Zero'; weight=110; output; end;
run;

data dif;
merge jan2010 (in=jan rename=weight=jan_weight)
feb2010 (in=feb rename=weight=feb_weight);
by name;
select;
when ( jan and not feb ) flg='DELETE';
when ( feb and not jan ) flg='INSERT';
when ( feb and jan and feb_weight ne jan_weight ) flg='UPDATE';
when ( jan and feb and feb_weight eq jan_weight ) flg='';
/* LEAVE OFF OTHERWISE TO FORCE AN HARD ERROR */
end;
run;

/*
JAN_ FEB_
NAME WEIGHT WEIGHT FLG

Alfred 112.5 112.5
Alice 84.0 84.0
Barbara 98.0 98.0
Carol 102.5 102.5
Henry 102.5 102.5
James 83.0 83.0
Jane 84.5 . DELETE
Janet 112.5 112.5
Jeffrey 84.0 84.0
John 99.5 100.0 UPDATE
Joyce 50.5 50.5
Judy 90.0 90.0
Louise 77.0 77.0
Mary 112.0 112.0
Philip 150.0 150.0
Robert 128.0 128.0
Ronald 133.0 133.0
Thomas 85.0 85.0
William 112.0 112.0
Zero . 110.0 INSERT
*/