From: "Kirby, Ted" on
We have a dataset that has up to nine ICD-9 diagnosis codes in each record. We want to identify the top 50 (or 100 or 250) ICD-9 and construct dummy variables for the existence of those codes.

Thanks to Dan Nordlund's reply to Trish Bous (at http://groups.google.com/group/comp.soft-sys.sas/browse_thread/thread/c82586e7f1164e54/ec5fc63a3eb16d9b?lnk=gst&q=proc+freq+order) we have the first part done (the identification of the top x codes and have saved them in a dataset. The first four observations of this dataset can be created with:

data TopX;
input code $5. count;
datalines;
5990 36163
4019 34429
2639 31573
5188126200
;
run;

How do we create the appropriate dummy variables? We would like to add variables to the dataset named dummyxxxxx for each of the top diagnosis codes that would have a value of 1 if the appropriate ICD-9 Diagnosis Code appears in any of the diagnosis code fields. We would use the TopX dataset to construct the names of the dummy variables, but setting each of the dummy variables would have to be done in the main dataset. For example, in the following dataset:

data have;
input patid (Dx1-Dx9) ($5);
datalines;
1 51881 5990 8052 85201 V550 V5419 73300 57420 5932
2 49121 51881 486 V850 41401 4019 7837 V462
3 51884 33520 34400 5070 5601 5990 V4611 33522 51909
4 03811 2639 2793 73021 V090 99591 07070 2384 311
5 56961 486 1962 4019 2749 27801 3051 V1005 78606
6 99832 73026 V4586 04111 496 42731 27800 V1259 71590
;
run;

We would want to add the dummy variables with the following values:

PatID Dummy5990 Dummy4019 Dummy2639 Dummy51881
1 1 0 0 1
2 1 0 0 1
3 1 0 0 0
4 0 0 1 0
5 0 1 0 0
6 0 0 0 0

Thanks for any assistance you can render.

************* IMPORTANT - PLEASE READ ********************

This e-mail, including attachments, may include confidential and/or proprietary information, and may be used only by the person or entity to which it is addressed. If the reader of this e-mail is not the intended recipient or his or her authorized agent, the reader is hereby notified that any dissemination, distribution or copying of this e-mail is prohibited. If you have received this e-mail in error, please notify the sender by replying to this message and delete this e-mail immediately.