From: Toby Dunn on 9 Mar 2010 11:56

Adding the ? behind the * quantifier makes it lazy (reluctant) instead of greedy. While this is a quick fix for the greedy quantifier problem, it is not the best solution, because you are still requiring the regex to backtrack. In the example below that isn't a big issue, but in a large and/or tight-looping problem it will kill you. A better option would be to use a negated character class, which solves both the greedy and the backtracking problem.

On Mon, 8 Mar 2010 18:38:05 -0800, xlr82sas <xlr82sas(a)AOL.COM> wrote:

>6.13: What does it mean that regexes are greedy? How can I get around it?
>
>    Most people mean that greedy regexes match as much as they can.
>    Technically speaking, it's actually the quantifiers ("?", "*", "+",
>    "{}") that are greedy rather than the whole pattern; Perl prefers local
>    greed and immediate gratification to overall greed. To get non-greedy
>    versions of the same quantifiers, use ("??", "*?", "+?", "{}?").
>
>    An example:
>
>        $s1 = $s2 = "I am very very cold";
>        $s1 =~ s/ve.*y //;     # I am cold
>        $s2 =~ s/ve.*?y //;    # I am very cold
>
>    Notice how the second substitution stopped matching as soon as it
>    encountered "y ". The "*?" quantifier effectively tells the regular
>    expression engine to find a match as quickly as possible and pass
>    control on to whatever is next in line, like you would if you were
>    playing hot potato.
>
>In SAS
>
>Note that without the '?' Perl is greedy and takes out both 'very' words,
>even when I have 1 specified. With the '?' it takes out the first 'very'.
>Note we have another option in SAS to take out 'n' 'very's;
>see below.
>
>data _null_;
>  s1="I am very very cold";
>  o1=prxchange('s/ver.*y //',1,s1);
>  o2=prxchange('s/ver.*?y //',1,s1);
>  o3=prxchange('s/ver.*?y //',2,s1);
>  put (o1 o2 o3) (/=);
>run;
>
>O1=I am cold
>O2=I am very cold
>O3=I am cold
>NOTE: DATA statement used (Total process time):
>      real time           0.00 seconds
>      cpu time            0.00 seconds
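
The negated character class suggested in the reply could be sketched as follows, reusing the string from the quoted example. The pattern 'ver[^y]*y ' and the variable names o_lazy and o_neg are illustrations of mine, not code from the thread. The idea is that [^y]* can only consume characters that are not "y", so it stops at the first "y" by construction rather than relying on a lazy quantifier.

```sas
data _null_;
  s1 = "I am very very cold";
  /* Lazy quantifier from the quoted post, kept for comparison */
  o_lazy = prxchange('s/ver.*?y //', 1, s1);
  /* Negated class (illustrative pattern): [^y]* cannot run past a "y" */
  o_neg  = prxchange('s/ver[^y]*y //', 1, s1);
  put (o_lazy o_neg) (/=);
run;
```

On this input both calls should leave "I am very cold". The two patterns are not interchangeable in general: if the first "y" after "ver" is not followed by a blank (for example "very, very cold"), the lazy version keeps scanning for a later "y ", while the negated class version gives up at that starting position and tries the next one.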