Help with regular expression [Perl]

From: Eric Pozharski on 26 Jul 2010 03:19

with <i2i2o4$14uv$1(a)adenine.netfront.net> Mark Hobley wrote:
> I need a regular expression with the following properties.
> I need to match text (typically, though not necessarily expressions)
> enclosed within double parentheses. However, I do not want to match nested
> single parentheses enclosed text.
>
> So ((*)) is a match, but ((*)*(*)) is not a match.
> Here are some examples to illustrate this.
>
> ((FOO)) - This is a match
> (()) - This is a match
> ((3 + 2)) - This is a match
> ((3 + 2) + (2 * foo)) - This is not a match
> ((3 * bar) + ((foo))) - This is a match
> ((3 * bar) + ((foo))bar) - This is a match.
>
> I hope that lot makes sense.

I dare to speculate that won't make sense tomorrow. Achieve grammar.
Then make parser.

--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom

From: Ben Morrow on 26 Jul 2010 05:33

Quoth Ilya Zakharevich <nospam-abuse(a)ilyaz.org>:
> On 2010-07-25, Ben Morrow <ben(a)morrow.me.uk> wrote:
> > Is there something wrong with /\(\([^(]*\)\)/ ?
> >
> > (Hmm, that's *seriously* unreadable.)
>
> Today, I ruined one of my most beautiful RExes:
>
> qr{([<>])}
>
> for parsing POD. To treat mismatched < and >, one actually needs to
> throw in \z as well.
>
> What is the moral? I do not know! m{\(\([^(]*\)\)} is not better,
> right? Fontification by CPerl helps a little bit, of course, but not
> much.

Yes (well, Vim rather than CPerl, of course... :) ).

/\Q((\E [^(]* \Q))\E/x is a little better, but not much. I think if
this was any longer I'd go the whole way with

# quotemeta escapes the parentheses so they match literally once interpolated
my ($op, $cl) = map quotemeta, qw/(( ))/;
my $rx = qr/$op [^(]* $cl/x;

but it's not worth it in this case.

> Lisp has the notion of "escaping out of quoting"; so it would look
> something like
>
> m{ (( (?` [^(]* ) )) }xq
>
> assuming //q means /\Q/, except that the part inside (?` ) is not
> quoted...

I like the idea, but not the syntax. It's a little too much like SGML
CDATA for me: 'everything is literal *except* this special magic
sequence you've forgotten about...'. I think if I seriously wanted an
improvement on \Q\E it would look like

/\q{((} [^(]* \q{))}/x

with the character following \q being a delimiter (matched or not)
following the same rules as for m// &c.

Ben

From: sln on 26 Jul 2010 10:29

On 25 Jul 2010 22:36:44 GMT, jt(a)toerring.de (Jens Thoms Toerring) wrote:

>Mark Hobley <markhobley(a)yahoo.donottypethisbit.co> wrote:
>> I need a regular expression with the following properties.
>> I need to match text (typically, though not necessarily expressions)
>> enclosed within double parentheses. However, I do not want to match nested
>> single parentheses enclosed text.
>
>> So ((*)) is a match, but ((*)*(*)) is not a match.
>> Here are some examples to illustrate this.
>
>> ((FOO)) - This is a match
>> (()) - This is a match
>> ((3 + 2)) - This is a match
>> ((3 + 2) + (2 * foo)) - This is not a match
>> ((3 * bar) + ((foo))) - This is a match
>
>Should the whole thing be the match or only the "((foo))" part?
>
>> ((3 * bar) + ((foo))bar) - This is a match.
>
>Same question here
>
>> I hope that lot makes sense.
>
>If in e.g. "((3 * bar) + ((foo)))" only the "((foo))" part is
>meant to be the match then I would think
>
>\(\([^(]*\)\)
     ^^
This will match ((foo)), ((foo))) or ((foo))))))))))))))))))))))
Maybe [^()]*
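
A minimal sketch of the difference, assuming the pattern under discussion is \(\([^(]*\)\) and using a made-up test string. Because [^(]* is free to consume closing parentheses, the first pattern runs on to the last "))" it can find, while [^()]* stops at the first one:

```perl
use strict;
use warnings;

# Hypothetical input: one double-parenthesised word plus stray ")" characters.
my $text = '((foo))))';

# [^(]* may consume ")" characters, so the match can run past the first "))".
my ($loose) = $text =~ /(\(\([^(]*\)\))/;

# [^()]* excludes both kinds of parenthesis, so the match ends at the first "))".
my ($tight) = $text =~ /(\(\([^()]*\)\))/;

print "loose: $loose\n";    # loose: ((foo))))
print "tight: $tight\n";    # tight: ((foo))
```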

-sln

From: Mark Hobley on 1 Aug 2010 15:47

On Sun, 25 Jul 2010 22:04:25 +0200, Peter J. Holzer wrote:

> Is this a match?
>
> (((1 + 2) * (3 +4)))

Yes. That is a match.

--
Mark Hobley
Linux User: #370818 http://markhobley.yi.org/

From: Peter J. Holzer on 1 Aug 2010 18:10

On 2010-08-01 19:47, Mark Hobley <markhobley(a)yahoo.donottypethisbit.co> wrote:
> On Sun, 25 Jul 2010 22:04:25 +0200, Peter J. Holzer wrote:
>
>> Is this a match?
>>
>> (((1 + 2) * (3 +4)))
>
> Yes. That is a match.
>

Then the problem cannot be solved with a real regular expression.

Perl regexps are an extension and can be used to match parentheses, so
it is probably possible to write one which solves your problem. However,
it will almost certainly be very hard to understand (the simple
solutions offered in this thread won't work).
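
For illustration only, here is one way such an extended pattern might look. It is a sketch, not a pattern posted in this thread, and it assumes Perl 5.10 or later for named recursion and possessive quantifiers. The names $double and inner are made up. It reads the requirement as "an outer pair of parentheses whose whole content is exactly one balanced inner pair":

```perl
use strict;
use warnings;
use 5.010;    # named recursion (?&name) needs Perl 5.10 or later

# An outer "(" ... ")" whose entire content is one balanced parenthesised group.
my $double = qr/
    \(                      # outer opening parenthesis
    (?<inner>               # the single inner group:
        \(                  #   its opening parenthesis
        (?: [^()]++         #   a run of non-parenthesis characters
          | (?&inner)       #   or a nested balanced group
        )*
        \)                  #   its closing parenthesis
    )
    \)                      # outer closing parenthesis
/x;

for my $s ('((FOO))', '(())', '((3 + 2) + (2 * foo))',
           '((3 * bar) + ((foo)))', '(((1 + 2) * (3 +4)))') {
    printf "%-24s %s\n", $s, $s =~ /($double)/ ? "match: $1" : 'no match';
}
```

Under that reading, '((3 + 2) + (2 * foo))' is rejected, '((3 * bar) + ((foo)))' matches only its '((foo))' part, and '(((1 + 2) * (3 +4)))' matches as a whole.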

I agree with Eric: Write a proper grammar and use that to parse your
expressions. If you've ever heard of BNF, using Parse::Yapp or
Parse::RecDescent shouldn't be too hard (I prefer the former, although
the docs assume that you are already familiar with yacc).

Alternatively, you can just walk through your string character by
character and note the start and end of each pair of parentheses (you
only need the last one at each level). If you've found two pairs where
start and end only differ by one character, you have a match.
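
The character-by-character walk described above could be sketched roughly as follows. The function name double_paren_spans and the choice to silently skip unbalanced closing parentheses are assumptions made for this example. Only the most recently closed pair is remembered, since a pair ending one character before the current ")" must be the last one closed:

```perl
use strict;
use warnings;

# Return [start, end] offsets of every "double" pair: an outer pair whose
# content is exactly one inner pair. Unbalanced ")" characters are skipped.
sub double_paren_spans {
    my ($text) = @_;
    my (@open, @last, @found);
    for my $i (0 .. length($text) - 1) {
        my $ch = substr $text, $i, 1;
        if ($ch eq '(') {
            push @open, $i;                 # remember where this pair starts
        }
        elsif ($ch eq ')' && @open) {
            my $start = pop @open;          # start of the pair closing here
            # Double pair: the previously closed pair sits one character inside.
            push @found, [$start, $i]
                if @last && $last[0] == $start + 1 && $last[1] == $i - 1;
            @last = ($start, $i);
        }
    }
    return @found;
}

my $input = '((3 * bar) + ((foo))bar)';
for my $span (double_paren_spans($input)) {
    my ($s, $e) = @$span;
    print substr($input, $s, $e - $s + 1), "\n";    # ((foo))
}
```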

hp
