perl regex :URGENT [SAS]

From: karma on 7 Jan 2010 08:12

Hi Daniel,

I'm not aware of a way to count regex matches without looping through
the string. As a workaround, I would replace every match with a
character that is unlikely to appear in the string (# in this example)
and then count the occurrences of that character with the countc
function.

In a regex you can use the pipe character (|) to separate alternative
patterns, and the i option makes the match case-insensitive.
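
As a small illustration of those two features (a sketch, not part of the original reply; the variable names hit_ci and hit_cs and the test string are made up for the example), prxmatch returns the position of the first match, or 0 if there is none:

```sas
data _null_;
  /* With the i option, lowercase 'a80' matches the A80 alternative. */
  hit_ci = prxmatch('/A90|A80|F12/i', 'b40/a80');
  /* Without it, the match is case-sensitive and nothing is found. */
  hit_cs = prxmatch('/A90|A80|F12/', 'b40/a80');
  put hit_ci= hit_cs=;   /* writes hit_ci=5 hit_cs=0 to the log */
run;
```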

Hope this helps

data have;
  input idnum topologia $40.;
  /* Replace every match with '#', then count the '#' characters. */
  pos = countc(prxchange('s/A90|A80|F12/#/i', -1, topologia), '#');
  neg = countc(prxchange('s/B40|A89|W45|K10/#/i', -1, topologia), '#');
cards;
1 A90/A80/B40/F30
2 B40/F30/A89
3 A90/A87/A80/F30/K10
;
run;
proc print; run;
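
For comparison, the looping alternative mentioned at the top of this reply could be sketched with call prxnext. This is only a sketch under assumptions, not something from the original reply: the data set name want and the variables re, start, stop, position, len and positive are illustrative, and only the positive list is counted. Unlike the replace-and-count trick, it would not miscount if the data happened to contain a # character.

```sas
data want;
  input idnum topologia $40.;
  /* Compile the pattern once and keep its id across rows. */
  retain re;
  if _n_ = 1 then re = prxparse('/A90|A80|F12/i');
  start = 1;
  stop = length(topologia);
  positive = 0;
  /* CALL PRXNEXT sets position to 0 once no further match is found. */
  call prxnext(re, start, stop, topologia, position, len);
  do while (position > 0);
    positive = positive + 1;
    call prxnext(re, start, stop, topologia, position, len);
  end;
  keep idnum topologia positive;
cards;
1 A90/A80/B40/F30
2 B40/F30/A89
3 A90/A87/A80/F30/K10
;
run;
proc print data=want; run;
```

A second compiled pattern and counter would be needed for the negative list.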

2010/1/7 Daniel Fernández <fdezdan(a)gmail.com>:
> hi all,
>
> I am trying to count words from a string (if they match with any value
> of my TWO lists of identifiers).
>
>
> data have;
> input idnum topologia $40.;
> cards;
> 1 A90/A80/B40/F30
> 2 B40/F30/A89
> 3 A90/A87/A80/F30/K10
> ;
> run;
>
> If the values (all uppercase) for each of my two lists are:
> POSITIVE in (A90,A80,F12)
> NEGATIVE in (B40,A89,W45,K10)
> .. I would like the output to look like:
>
> idnum positive negative
> 1 2 1
> 2 0 2
> 3 2 1
>
>
> I know this is a perfect opportunity to use Perl regular expressions,
> but my knowledge of them is limited to matching and parsing.
>
> Thanks in advance!
>
> Daniel Fernandez
> Barcelona
>

From: karma on 7 Jan 2010 08:56

Hi Daniel,

Glad it worked for you. The -1 argument tells prxchange to replace ALL
occurrences of the pattern with the substitution text; a positive value
would cap the number of replacements instead.
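
To make the effect of that second argument concrete, here is a minimal sketch (not from the original thread; the variable names s, once and all are made up) that runs the same substitution with 1 and with -1:

```sas
data _null_;
  s = 'A90/A80/B40/F30';
  /* times = 1: only the first match is replaced. */
  once = prxchange('s/A90|A80|F12/#/i', 1, s);
  /* times = -1: every match is replaced. */
  all = prxchange('s/A90|A80|F12/#/i', -1, s);
  put once= all=;   /* log: once=#/A80/B40/F30 all=#/#/B40/F30 */
run;
```

This is why the countc trick needs -1: with any smaller limit, some matches would stay unreplaced and go uncounted.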

Thanks

Karma

2010/1/7 Daniel Fernández <fdezdan(a)gmail.com>:
> Thank you very much Karma!
>
> I had thought about using countc, but it only counts single
> characters; combining it with prxchange made it powerful enough to
> solve my problem.
> I know the pipe character (or !) and the 'i' option, but what is the
> -1 in the second argument for?
>
> all the best,
> Daniel Fernandez

From: Daniel Fernández on 7 Jan 2010 08:52

Thank you very much Karma!

I had thought about using countc, but it only counts single
characters; combining it with prxchange made it powerful enough to
solve my problem.
I know the pipe character (or !) and the 'i' option, but what is the
-1 in the second argument for?

all the best,
Daniel Fernandez

