From: Mike Scott on 1 Apr 2010 10:39

I've been trying without success to make a back reference regular expression work in milter-regex, but it seems I misunderstand something more basic.

Using old-style re's, an re like
\(..*\)\1
will match a string
123abcabc456
returning
abcabc
That works fine in a trivial test program.

I'm using fbsd 6.x, and compiling an re with REG_EXTENDED always fails to match whenever \1 appears in the re. So
(..*)\1
fails to match the test string above (while (..*) will, of course, match the whole string).

The man page isn't entirely clear about whether the 'new atom type' means back-referencing is included or excluded in the extended re syntax; the web suggests it should work with REG_EXTENDED set.

Is this behaviour correct with fbsd 6 please? And have things changed since? I do see the man page mentions 'alpha quality', which sounds ominous :-(

Either way, milter-regex doesn't seem to like the use of the \1 construct - is this disallowed by that program for some reason?

Oh, and using 'old-style' re's,
\(.*\)\1
matches 123abcabc456 but returns a null string as the match! Weird.

TIA.

-- 
Mike Scott (unet2 <at> [deletethis] scottsonline.org.uk)
Harlow Essex England
From: Johan van Selst on 1 Apr 2010 16:48

Once upon a newsgroup, Mike Scott claimed:
> Using old-style re's, an re like
> \(..*\)\1
> will match a string
> 123abcabc456
>
> I'm using fbsd 6.x, and compiling an re with REG_EXTENDED always fails
> to match whenever \1 appears in the re.

Indeed, extended regular expressions do not work with back references.
You will see the same behaviour with sed (sed -E) and grep (egrep).

If you need back references, then you must use the old style 'basic'
regular expressions (where possible). The new, 'extended' regular
expressions are generally faster and more useful though, as long as
you do not need this feature.

Ciao,
Johan
-- 
Why do we always come here - I guess we'll never know.
It's like a kind of torture to have to watch the show.
From: Randal L. Schwartz on 1 Apr 2010 17:48

>>>>> "Johan" == Johan van Selst <{c.u.b.f.m.}@news.gletsjer.org> writes:

Johan> If you need back references, then you must use the old style 'basic'
Johan> regular expressions (where possible). The new, 'extended' regular
Johan> expressions are generally faster and more useful though, as long as
Johan> you do not need this feature.

Or, just use Perl, where you can have the kitchen sink...

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn(a)stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
From: mikea on 1 Apr 2010 18:15

Randal L. Schwartz <merlyn(a)stonehenge.com> wrote in <86iq8au9nt.fsf(a)red.stonehenge.com>:
>>>>>> "Johan" == Johan van Selst <{c.u.b.f.m.}@news.gletsjer.org> writes:
>
> Johan> If you need back references, then you must use the old style 'basic'
> Johan> regular expressions (where possible). The new, 'extended' regular
> Johan> expressions are generally faster and more useful though, as long as
> Johan> you do not need this feature.
>
> Or, just use Perl, where you can have the kitchen sink...

Well, yes, but the OP and I are using milter-regex, and he asked in
the context of milter-regex. We don't get to choose which regex engine
is being used, unless we do some rather determined hackery on the
product. I grant I _have_ done some minor hackery already, but nothing
so complex as changing to a different regex engine. The prospect rather
daunts me.

-- 
French does have a certain je ne sais quoi, but I don't know what it is.
  -- Jeffrey Goldberg, in nanae
From: Balwinder S Dheeman on 1 Apr 2010 18:15

On 04/02/2010 02:18 AM, Johan van Selst wrote:
> Once upon a newsgroup, Mike Scott claimed:
>> Using old-style re's, an re like
>> \(..*\)\1
>> will match a string
>> 123abcabc456
>>
>> I'm using fbsd 6.x, and compiling an re with REG_EXTENDED always fails
>> to match whenever \1 appears in the re.
>
> Indeed, extended regular expressions do not work with back references.
> You will see the same behaviour with sed (sed -E) and grep (egrep).
>
> If you need back references, then you must use the old style 'basic'
> regular expressions (where possible). The new, 'extended' regular
> expressions are generally faster and more useful though, as long as
> you do not need this feature.

How about using PCRE's pcregrep?

-- 
Balwinder S "bdheeman" Dheeman        Registered Linux User: #229709
Anu'z Linux(a)HOME (Unix Shoppe)      Machines: #168573, 170593, 259192
Chandigarh, UT, 160062, India         Plan9, T2, Arch/Debian/FreeBSD/XP
Home: http://werc.homelinux.net/      Visit: http://counter.li.org/
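
For anyone following the Perl/PCRE suggestions, the two patterns from the original post can be tried in a Perl-style engine. The sketch below uses Python's re module purely as a stand-in: it is a backtracking engine, not the POSIX regex(3) library that milter-regex calls, so it says nothing about how fbsd 6 treats \1 under REG_EXTENDED. The helper name first_match is made up for this illustration. It also suggests why \(.*\)\1 came back with a null string: a search reports the leftmost match, and at offset 0 of 123abcabc456 the only "doubled" string is the empty one, so the pattern succeeds there with zero length before abcabc is ever reached. POSIX matching applies the same leftmost-first rule, which seems consistent with what the OP saw.

```python
import re

SUBJECT = "123abcabc456"  # test string from the original post


def first_match(pattern, subject=SUBJECT):
    """Return (start, end, text) of the first match, or None if no match.

    Hypothetical helper for this illustration only.
    """
    m = re.search(pattern, subject)
    return None if m is None else (m.start(), m.end(), m.group(0))


if __name__ == "__main__":
    # Group must hold at least one character, so the empty string is ruled
    # out and the first doubled substring is found at offset 3.
    print(first_match(r"(..*)\1"))

    # Group may be empty; empty followed by empty already matches at
    # offset 0, and the leftmost match is the one reported.
    print(first_match(r"(.*)\1"))
```

Requiring at least one character in the group (..* rather than .*) is what keeps the match from collapsing to the empty string at the start of the input.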