From: Trish Bous on
Typo in my message:

I WOULD LIKE TO PULL ANY CODE THAT BEGINS WITH 172, 173, 232, OR 7020

Thanks!
From: Proc Me on
Trish,

I wonder if in the first obs in your example, the 00000 (non-neoplasm),
which doesn't match the conditions, is overwriting the fact that earlier
fields are neoplasm?

You might try changing your conditional to:

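neoplasm = 0; /* reset once per observation, before the loop */

do i = 1 to 3;
   /* count matching fields instead of overwriting the flag on each pass */
   if substr(prx{i}, 1, 3) in ('172', '173', '232') or
      substr(prx{i}, 1, 4) in ('7020') then neoplasm + 1;
end;

neoplasm = ceil(neoplasm / 3); /* collapse the count (0 to 3) to a 0/1 flag */

if neoplasm; /* keep only observations with at least one match */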

Does that help?
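
To make the overwrite problem concrete, here is a minimal sketch of the same idea written as a complete data step. It assumes an input data set named HAVE with character variables DIAG1-DIAG3 (the name HAVE and the output name NEOPLASM_ONLY are placeholders, not from the original post). The flag is reset once per observation and only ever set to 1 inside the loop, so a later non-matching field such as 00000 cannot undo an earlier match.

```sas
data neoplasm_only;                 /* hypothetical output name */
   set have;                        /* HAVE is assumed to hold DIAG1-DIAG3 as character */
   array dx {3} $ diag1-diag3;
   neoplasm = 0;                    /* reset once per observation, outside the loop */
   do i = 1 to dim(dx);
      /* no ELSE branch: a non-matching field leaves the flag alone */
      if substr(dx{i}, 1, 3) in ('172', '173', '232')
         or substr(dx{i}, 1, 4) = '7020' then neoplasm = 1;
   end;
   drop i;
   if neoplasm;                     /* keep observations with at least one match */
run;
```

This gives the same subset as the counting version, just without the CEIL step.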

For the second part could you:

proc sql;
create table want
as
select code, sum(volume) as count
from (
   select 1 as diagnostic,
          diag1 as code,
          count(*) as volume
   from have
   group by 2
   union
   select 2 as diagnostic,
          diag2 as code,
          count(*) as volume
   from have
   group by 2
   union
   select 3 as diagnostic,
          diag3 as code,
          count(*) as volume
   from have
   group by 2
) totals
group by 1
order by 2 desc; /* most frequent code first */
quit;

Proc Me
On Wed, 9 Dec 2009 17:04:22 -0500, Trish Bous <tboussard(a)GMAIL.COM> wrote:

>HI All,
>
>I AM TRYING TO SELECT A BUNCH OF ICD-9 CODES THAT ARE CODED IN 3 DIFFERENT
>VARIABLES. I WOULD LIKE TO SELECT A FEW CODES TO PULL AND THEN GET A COUNT
>OF HOW MANY TIMES EACH CODE APPEARS.
>
>THE DATA THAT I AM PARSING LOOK LIKE THIS
>
>DIAG1 DIAG2 DIAG3
>1729- 1721- 00000
>1730- 1729- 172--
>1731- 1730- 1721
>1732- 2045-
>1735- 31000
>1739- 7020-
>1950-
>20010
>2104-
>
>etc...
From: Nathaniel Wooding on
Trish

Try the following code.

Nat Wooding

Data codes;
informat DIAG1 DIAG2 DIAG3 $6.;
INFILE CARDS MISSOVER;
input DIAG1 DIAG2 DIAG3;
CARDS;
1729- 1721- 00000
1730- 1729- 172--
1731- 1730- 1721
1732- 2045-
1735- 31000
1739- 7020-
1950-
20010
2104-
RUN;
Data codes;
set codes;
line+1;* get a line number which will make the following work;
run;
Proc Transpose out = codes;* normalize the file for simplicity;
var diag: ;
by line;
run;
Data Final;
set codes;
if col1 gt '';
Prefix = Substr( col1 , 1 , 3 );
if prefix in ('172', '173', '232') then neoplasm = 1;
else if col1 =: '7020' then neoplasm = 1; * test the first 4 bytes;
else neoplasm = 0;
run;
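
Once the file is normalized to one code per row, the count asked about in the original question could be obtained with PROC FREQ. This is only a sketch: it assumes the FINAL data set and the COL1 and NEOPLASM variables created above, and the WHERE line is optional.

```sas
proc freq data=final order=freq;    /* order=freq lists the most common code first */
   where neoplasm = 1;              /* remove this line to count every code */
   tables col1 / nocum;             /* one row per distinct code, with its frequency */
run;
```

Because each row of FINAL holds a single code, the frequency is independent of whether the code came from DIAG1, DIAG2, or DIAG3.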


-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Trish Bous
Sent: Wednesday, December 09, 2009 5:04 PM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: REGULAR EXPRESSIONS

HI All,

I AM TRYING TO SELECT A BUNCH OF ICD-9 CODES THAT ARE CODED IN 3 DIFFERENT
VARIABLES. I WOULD LIKE TO SELECT A FEW CODES TO PULL AND THEN GET A COUNT
OF HOW MANY TIMES EACH CODE APPEARS.

THE DATA THAT I AM PARSING LOOK LIKE THIS

DIAG1 DIAG2 DIAG3
1729- 1721- 00000
1730- 1729- 172--
1731- 1730- 1721
1732- 2045-
1735- 31000
1739- 7020-
1950-
20010
2104-

etc...

I WOULD LIKE TO PULL ANY CODE THAT BEGINS WITH 172, 172, 173, OR 7020

THE CODE I HAVE WRITTEN LOOKS LIKE THIS:

array prx (3) diag1-diag3;
do i = 1 to 3;
if substr (prx {i}, 1, 3) in ('172', '173', '232') then neoplasm=1;
else if prx{i} in ('7020') then neoplasm =1;
else neoplasm = 0;
end;

if neoplasm = 1;


UNFORTUNATELY, THIS IS NOT GRABBING ALL THE DATA. ANY HELP HERE?

ALSO, HOW DO I GET A COUNT FOR ALL MY CODES, FROM THE COMBINATION OF
VARIABLES DIAG1, DIAG2, AND DIAG3? I WANT TO KNOW WHICH CODE IS MOST
FREQUENTLY USED, REGARDLESS OF WHICH VARIABLE IT IS CODED IN.

THANKS FOR YOUR HELP!!
From: Savian on
On Dec 9, 3:23 pm, tbouss...(a)GMAIL.COM (Trish Bous) wrote:
> Typo in my message:
>
> I WOULD LIKE TO PULL ANY CODE THAT BEGINS WITH 172, 173, 232, OR 7020
>
> Thanks!

Try this regex:

((172|173|232|7020)[\d\D]+?)\s

It is one of numerous ways to handle it but I think it will suffice.
All of your matches will be in group 1.
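
For anyone who wants to apply a pattern like this inside SAS itself, here is a rough sketch using PRXPARSE and PRXMATCH. It is not the regex above verbatim: the pattern is anchored with ^ so that it tests only the start of each field, and it assumes a data set named HAVE with left-aligned character variables DIAG1-DIAG3 (HAVE and FLAGGED are placeholder names).

```sas
data flagged;                       /* hypothetical output name */
   set have;                        /* HAVE with character DIAG1-DIAG3 is assumed */
   array dx {3} $ diag1-diag3;
   retain re;                       /* keep the compiled pattern id across observations */
   /* compile once; ^ anchors at the start so a code like 31720 does not match */
   if _n_ = 1 then re = prxparse('/^(172|173|232|7020)/');
   neoplasm = 0;
   do i = 1 to dim(dx);
      if prxmatch(re, dx{i}) > 0 then neoplasm = 1;
   end;
   drop i re;
run;
```

PRXMATCH returns the position of the first match, or 0 when there is none, so the test above flags an observation as soon as any of the three fields starts with one of the prefixes.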

Alan
http://www.savian.net
From: Trish Bous on
That was it!

Thank you! Changing the conditional now gives me the numbers I am expecting!

Thanks!