On Apr 8, 1:41=A0pm, Al <ali6...@gmail.com> wrote:
> Hi All:
>
> I have two adverse events data sets Ae_old and Ae_New .. I =A0am looking
> to compare Aeterm (Term entered in the CRF) to see if there were any
> additions/deletions or changes in the AEterm in Ae_new from Ae_old
> dataset and create a flag in the Ae_new =A0to know which data point was
> added (flag =3D 1) ,which data point was deleted (flag =3D 2) and which
> data point is changed(flag =3D 3)
>
> I am using this proc compare .
>
> proc compare base =3D ae_old compare =3D ae_new OUTNOEQUAL =A0OUTCOMP
> OUTBASE /*OUTDIF*/ NOPRINT out =3D Ae_diff;
> =A0 =A0id subject =A0;
> =A0 =A0var aeterm ;
> run;
>
> but I am having difficulty to create a flags .Any suggestions
>
> Thanks in advance
Hi,
If you have a primary key in both datasets and keep only the key and
aeterm the following code will identify inserts, deletes and updates
For a cleaner version
You might want to take a look at
/* T003300 GIVEN TWO DATASETS CREATE AND APPLY THE TRANSACTION DATASET
THAT WILL MAKE THE TABLES THE SAME
http://homepage.mac.com/magdelina/.Public/utl.html
utl_tipweb.txt
OLD DATA
| SSN JOB ACTIVE |
| |
| 001001110 carpenter1 YES |
| 002001110 plumber1 YES |
| 003001110 mason1 YES |
| 004001110 plumber1 YES |
| 005001110 electrician1 YES |
| 006001110 mason3 YES |
| 008001110 mason4 NO |
| |
| New table ( New data not in RDBMS ) |
| (ie NUSSNJOB) |
| |
| SSN JOB ACTIVE |
| |
| 001001110 carpenter1 YES |
| 002001110 plumber1 YES |
| 003001110 painter1 YES |
| 004001110 plumber1 YES |
| 005001110 electrician1 YES |
| 007001110 painter2 YES |
Here is what the code gives you
| Transaction data set to make onld data look like new data
You will just have a primary key(SSN) and AETERM
Type will tell you if a AETERM was added, deleted or updated.
|
| |
| SSN JOB ACTIVE __TYPE |
| |
| 006001110 mason3 YES DELETE |
| 008001110 mason4 NO DELETE |
| 007001110 painter2 YES INSERT |
| 003001110 painter1 YES UPDATE
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
*----------------------------------------------------------*\
| Make oldrep look exactly like newrep by applying |
| the delta dateset using utldmod |
| utl_update is the same as macro above |
| Transaction dataset must contain __type variable |
| with the value UPDATE, INSERT or DELETE |
\*----------------------------------------------------------*/
%macro utl_update
(
master=3D
,transaction=3D
,key=3D
)
/ des=3D"Update Insert Delete in Sybase Table";
/*-------------------------------------------------*\
| This object applyes a transaction dataset to |
| a master RDBMS table or SAS dataset. The result |
| is an updated master ( RDBMS data ) |
\*-------------------------------------------------*/
/*----------------------------------------------*\
| IPO |
| Inputs |
| =3D=3D=3D=3D=3D=3D |
| transaction - Transaction table must have |
| variable type which can |
| have 3 values (SAS dataset) |
| DELETE -- Delete Row in Master |
| INSERT -- Insert Row in Master |
| UPDATE -- Update Row in Master |
| key - unique index |
| |
| Process - Modify SAS/RDBMS Table |
| =3D=3D=3D=3D=3D=3D based on transaction file |
| |
| Output - master Modified |
| =3D=3D=3D=3D=3D=3D |
| Transaction table can be in work library |
\*----------------------------------------------*/
data &master;
modify &master &transaction;
by &key;
select (_iorc_);
when (%sysrc(_dsenmr)) /* nomatch in MASTER _Add */
do;
if __type=3D"INSERT" then output &master;
*put "Nomatch" __type=3D;
_error_=3D0;
end;
when (%sysrc(_sok)) /* matched Update or Delete */
do;
*put "Matched " __type=3D;
if __type=3D"DELETE" then remove &master;
else if __type=3D"UPDATE" then replace &master;
_error_=3D0;
end;
otherwise
do;
put "Unexpected ERROR Condition: _IORC_ =3D" _IORC_;
_error_=3D0;
end;
end;
run;
%mend utl_update;
%macro utl_delta
(
uinmem1 =3Dwork.oldrep, /* Last Months Data */
uinmem2 =3Dwork.newrep, /* Current Month Data */
uinkey =3Drep_socs, /* primary unique key both tables */
uotmem1 =3Drepdelta, /* delta tble for RDBMS update */
uotmem2 =3D repsame /* records that do not change */
)
/ des =3D "Build delta SAS table for RDBMS update";
/*----------------------------------------------*\
| WIN95 SAS611 -- UNIX SAS611 SOLARIS 2.5 |
| |
| |
| Create a delta ( transaction ) dataset |
| based on a comparison of old and new data. |
| |
| Use this new delta dataset to bring older |
| RDBMS table up to date. ( invoke utldmod ) |
| |
| Proc compare new feed against updated RDBMS |
| Result should be an exact comparison |
| |
| You are probably asking why not just drop |
| the RDBMS table and load the new table. |
| |
| 1. As a rule we keep SAS image tables |
| of most read only RDBMS tables. |
| And it is very easy to do detailed analysis|
| such as this before any major sales |
| representative alignment. |
| |
| 2. The delta dataset is very useful for QC, |
| before updating RDBMS tables. |
| |
| 3. Table may reside in as many as three |
| different databases on different platforms.|
| (Oracle, Sybase, Watcom, MS-Access) |
| This plays to SASes strength |
| |
| This code represents a slightly new |
| methodology and as such has not been |
| completely tested. CompuCraft would |
| greatly appreciate any feedback. |
\*----------------------------------------------*/
/*%^&*------------------------------------------*\
| Description: |
| Create a delta ( transaction ) dataset |
| based on new data. |
| |
| This code creates the delta ( transaction ) |
| dataset. |
| |
| IPO |
| INPUTS |
| =3D=3D=3D=3D=3D=3D |
| Old table ( previous month - in RDBMS ) |
| (ie OLSSNJOB) |
| |
| SSN JOB ACTIVE |
| |
| 001001110 carpenter1 YES |
| 002001110 plumber1 YES |
| 003001110 mason1 YES |
| 004001110 plumber1 YES |
| 005001110 electrician1 YES |
| 006001110 mason3 YES |
| 008001110 mason4 NO |
| |
| New table ( New data not in RDBMS ) |
| (ie NUSSNJOB) |
| |
| SSN JOB ACTIVE |
| |
| 001001110 carpenter1 YES |
| 002001110 plumber1 YES |
| 003001110 painter1 YES |
| 004001110 plumber1 YES |
| 005001110 electrician1 YES |
| 007001110 painter2 YES |
| |
| PROCESS |
| =3D=3D=3D=3D=3D=3D=3D |
| Extract the old data from the RDBMS. |
| |
| Union old and new data. Put an indicator |
| variable on union. This variable |
| identifies the origin of record ( old/new). |
| |
| Sort the union table on all fields. |
| |
| If record appears only in the old table |
| then deactivate the record. ( DELETE ) |
| |
| If a record appears only in the new data |
| ( at least one field makes record diff) |
| and the key appears in both new and old |
| data then perform an update ( UPDATE ) |
| |
| If a record appears only in the new data |
| and the key does not appear in old table |
| then perform an insert ( INSERT ) |
| |
| If the record is exactly the same in |
| both tables then leave it alone |
| |
| OUTPUT |
| =3D=3D=3D=3D=3D=3D |
| Transaction data set |
| |
| SSN JOB ACTIVE __TYPE |
| |
| 006001110 mason3 YES DELETE |
| 008001110 mason4 NO DELETE |
| 007001110 painter2 YES INSERT |
| 003001110 painter1 YES UPDATE |
| |
\*%^&*------------------------------------------*/
/* for testing without macro
%let uinmem1 =3Doldrep;
%let uinmem2 =3Dnewrep;
%let uinkey =3Drep_socs;
%let uotmem1 =3Drepdelta;
%let uotmem2=3D repsame;
*/
%put %sysfunc(ifc(%sysevalf(%superq(uinmem1)=3D,boolean), **** Please
Provide Previous dataset ,));
%put %sysfunc(ifc(%sysevalf(%superq(uinmem2)=3D,boolean), **** Please
Provide Current dataset ,));
%put %sysfunc(ifc(%sysevalf(%superq(uinkey)=3D,boolean),**** Please
Provide primary unique key both tables ,));
%put %sysfunc(ifc(%sysevalf(%superq(uotmem1)=3D,boolean),**** Please
Provide transaction records dataset ,));
%put %sysfunc(ifc(%sysevalf(%superq(uotmem2)=3D,boolean), **** Please
Provide records that do not change dataset,));
%if %eval(
%sysfunc(ifc(%sysevalf(%superq(uinmem1)=3D,boolean),1,0)) +
%sysfunc(ifc(%sysevalf(%superq(uinmem2)=3D,boolean),1,0)) +
%sysfunc(ifc(%sysevalf(%superq(uinkey)=3D,boolean),1,0)) +
%sysfunc(ifc(%sysevalf(%superq(uotmem1)=3D,boolean),1,0)) +
%sysfunc(ifc(%sysevalf(%superq(uotmem2)=3D,boolean),1,0))
) eq 0 %then %do;
data utldlta1
(
label =3D "Union of current & previous month"
)
/ view=3Dutldlta1;
retain &uinkey;
set &uinmem1 ( in =3D prevmnth )
&uinmem2 ( in =3D currmnth );
if prevmnth then __split =3D "OLD";
else __split =3D "NEW";
run;
proc sql;
select
name into : ucols separated by ' '
from
dictionary.columns
where
libname =3D %upcase('work') and
memname =3D %upcase ( 'utldlta1' ) and
name not eq %upcase( '__split' );
quit;
%put ucols =3D &ucols;
proc sort data =3D utldlta1 out =3D utldlta2;
by &ucols __split;
run;
%let uwrds =3D %sysfunc(countw(&ucols));
%let ulastcol =3D %scan (&ucols, &uwrds );
data utldlta3
(
label =3D "Transactions"
)
&uotmem2
(
label =3D "No change"
);
set utldlta2;
by &ucols;
retain __split __type;
select;
/* Only in previous month */
when ( ( first.&ulastcol eq last.&ulastcol ) and __split =3D
"OLD") do;
if ( first.&uinkey eq last.&uinkey ) then do; /*
unique record */
__type=3D"DELETE"; /* key
unique */
output utldlta3; /* OLD
data only */
end;
else do; /*
unique record */
__type =3D "UPDATE"; /* key
same */
output &uotmem2; /* other
field changed */
end;
end;
/* Only in new month */
when ( first.&ulastcol eq last.&ulastcol ) do; /*
unique record */
if ( first.&uinkey eq last.&uinkey ) then do; /* new
key */
__type =3D "INSERT";
output utldlta3;
end;
else do; /*
unique record */
__type =3D "UPDATE"; /* same
key */
output utldlta3; /* new
data only */
end;
end;
/* Duplicate records same in both months */
when ( first.&ulastcol ne last.&ulastcol ) do; /*
duplicate record */
output &uotmem2;
end;
otherwise put "ERROR =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D> NEVER SAY N=
EVER
<=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D ERROR";
end;
run;
proc sort data=3Dutldlta3
out=3D&uotmem1
(
label =3D "Transaction dataset"
index =3D ( &uinkey / unique )
);
by __type;
run;
%macro utl_nlobs(dsn);
%let dsid=3D%sysfunc(open(&dsn));%sysfunc(attrn(&dsid,nlobs))
%let rc=3D%sysfunc(close(&dsid));
%mend utl_lnobs;
proc print data=3D&uinmem1(obs=3D30) width=3Dminimum noobs;
title "Old RDBMS Data obs=3D%utl_nlobs(&uinmem1)";
run;
proc print data=3D&uinmem2(obs=3D30) width=3Dminimum noobs;
title "New Feed Data obs=3D%utl_nlobs(&uinmem2)";
run;
proc print data=3D&uotmem1(obs=3D30) width=3Dminimum noobs;
title "Transaction data set obs=3D%utl_nlobs(&uotmem1)";
run;
proc print data=3D&uotmem2(obs=3D30 drop=3D__type) width=3Dminimum noob=
s;
title "No change data set obs=3D%utl_nlobs(&uotmem2)";
run;
/*----------------------------------------------------------*\
| Make oldrep look exactly like newrep by applying |
| the delta dateset using utldmod |
\*----------------------------------------------------------*/
/*
data oldrep;
input rep_socs : $9. JOB : $16. ACTIVE : $3.;
cards;
001001110 carpenter1 YES
002001110 plumber1 YES
003001110 mason1 YES
004001110 plumber1 YES
005001110 electrician1 YES
006001110 mason3 YES
008001110 mason4 NO
;
run;
data newrep;
input rep_socs : $9. JOB : $16. ACTIVE : $3.;
cards;
001001110 carpenter1 YES
002001110 plumber1 YES
003001110 painter1 YES
004001110 plumber1 YES
005001110 electrician1 YES
007001110 painter2 YES
;
run;
%utl_delta
(
uinmem1 =3Doldrep,
uinmem2 =3Dnewrep,
uinkey =3Drep_socs,
uotmem1 =3Drepdelta,
uotmem2=3D repsame
);
%utl_delta;
proc sort data=3Ddat.oldrep out=3Dol;by rep_socs;run;
proc sort data=3Ddat.newrep out=3Dnu;by rep_socs;run;
title "Indepth comparison Updated RDMS table with New Feed";
proc compare data=3Dol compare=3Dnu;
run;
*/
%end; /* end macro argument checks */
%mend utl_delta;
data oldrep; /* rep_socs is primary ket */
input rep_socs : $9. JOB : $16. ACTIVE : $3.;
cards;
001001110 carpenter1 YES
002001110 plumber1 YES
003001110 mason1 YES
004001110 plumber1 YES
005001110 electrician1 YES
006001110 mason3 YES
008001110 mason4 NO
;
run;
data newrep;
input rep_socs : $9. JOB : $16. ACTIVE : $3.;
cards;
001001110 carpenter1 YES
002001110 plumber1 YES
003001110 painter1 YES
004001110 plumber1 YES
005001110 electrician1 YES
007001110 painter2 YES
;
run;
%utl_delta
(
uinmem1 =3Doldrep, /* Last Months Data */
uinmem2 =3Dnewrep, /* Current Month Data */
uinkey =3Drep_socs, /* primary unique key both tables */
uotmem1 =3Drepdelta, /* delta tble for RDBMS update */
uotmem2=3D repsame
);
%utl_update
(
master=3Doldrep
,transaction=3Drepdelta
,key=3Drep_socs
);
proc sort data=3Doldrep out=3Dol;by rep_socs;run; /* oldrep has been
updated with transactions and now is equal to newrep */
proc sort data=3Dnewrep out=3Dnu;by rep_socs;run;
title "Indepth comparison Updated RDMS table with New Feed";
proc compare data=3Dol compare=3Dnu;
run;
|