From: karma on 7 Jan 2010 08:12

Hi Daniel,

I'm not aware of a way in regexp to count the number of matches without looping through the string. To get around this, I would replace every match with a character not likely to be in the string (# in this example) and then count the occurrences of that character using the countc function.

In regexp you can use the pipe character (|) to specify alternative patterns and the i option to ignore the case of the match.

Hope this helps

    data have;
    input idnum topologia $40.;
    pos = countc(prxchange('s/A90|A80|F12/#/i',-1,topologia),'#') ;
    neg = countc(prxchange('s/B40|A89|W45|K10/#/i',-1,topologia),'#') ;
    cards;
    1 A90/A80/B40/F30
    2 B40/F30/A89
    3 A90/A87/A80/F30/K10
    ;
    run;
    proc print; run ;

2010/1/7 Daniel Fernández <fdezdan(a)gmail.com>:
> hi all,
>
> I am trying to count words from a string (if they match any value
> in my TWO lists of identifiers).
>
> data have;
> input idnum topologia $40.;
> cards;
> 1 A90/A80/B40/F30
> 2 B40/F30/A89
> 3 A90/A87/A80/F30/K10
> ;
> run;
>
> If the values (all uppercase) for each of my two lists are:
> POSITIVE in (A90,A80,F12)
> NEGATIVE in (B40,A89,W45,K10)
> .. I would like the output to look like:
>
> idnum positive negative
> 1     2        1
> 2     0        2
> 3     2        1
>
> I know it is a perfect opportunity to use perl regular expressions, but
> my knowledge of them is limited to matching and parsing.
>
> Thanks in advance!
>
> Daniel Fernandez
> Barcelona
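For comparison, the looping approach that the reply sets aside can be sketched with CALL PRXNEXT, which finds successive matches of a compiled pattern. This is only an illustrative sketch, not part of the original posts: the data set name `want` and the variables `re_pos`, `re_neg`, `start`, `stop`, `m_pos` and `m_len` are made up for the example. It is intended to reproduce the positive/negative table requested in the question.

```sas
data have;
  input idnum topologia $40.;
cards;
1 A90/A80/B40/F30
2 B40/F30/A89
3 A90/A87/A80/F30/K10
;
run;

data want;
  set have;

  /* Compile each pattern once and keep the ids across rows. */
  retain re_pos re_neg;
  if _n_ = 1 then do;
    re_pos = prxparse('/A90|A80|F12/i');
    re_neg = prxparse('/B40|A89|W45|K10/i');
  end;

  /* Count matches from the POSITIVE list. */
  positive = 0;
  start = 1;
  stop = length(topologia);
  call prxnext(re_pos, start, stop, topologia, m_pos, m_len);
  do while (m_pos > 0);
    positive = positive + 1;
    call prxnext(re_pos, start, stop, topologia, m_pos, m_len);
  end;

  /* Count matches from the NEGATIVE list. */
  negative = 0;
  start = 1;
  stop = length(topologia);
  call prxnext(re_neg, start, stop, topologia, m_pos, m_len);
  do while (m_pos > 0);
    negative = negative + 1;
    call prxnext(re_neg, start, stop, topologia, m_pos, m_len);
  end;

  /* Keep only the id, the source string and the two counts. */
  drop re_pos re_neg start stop m_pos m_len;
run;

proc print data=want; run;
```

The prxchange/countc one-liner is shorter; the loop may be preferable when the placeholder character (# above) could legitimately appear in the data.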
From: karma on 7 Jan 2010 08:56

Hi Daniel,

Glad it worked for you. The -1 argument tells prxchange to replace ALL occurrences of the pattern with the substitution text.

Thanks
Karma
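To see what the -1 argument changes, the same substitution can be run with a count of 1 and a count of -1 on one of the sample strings. This is a small sketch for illustration; the variable names `first_only` and `all_matches` are made up for the example.

```sas
data _null_;
  length first_only all_matches $40;
  topologia = 'A90/A87/A80/F30/K10';

  /* A count of 1 replaces only the first match. */
  first_only = prxchange('s/A90|A80|F12/#/i', 1, topologia);

  /* A count of -1 keeps replacing until the end of the string. */
  all_matches = prxchange('s/A90|A80|F12/#/i', -1, topologia);

  put first_only=;   /* first_only=#/A87/A80/F30/K10 */
  put all_matches=;  /* all_matches=#/A87/#/F30/K10 */
run;
```

Only the -1 form leaves one # per match, which is what makes the countc step return the number of matches.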
From: Daniel Fernández on 7 Jan 2010 08:52
Thank you very much Karma!

I thought about using countc, but it only counts single characters; combining it with the prxchange function made it powerful enough to solve my problem.
I know the pipe character (or !) and the 'i' option, but what is the -1 in the second argument for?

All the best,
Daniel Fernandez