From: Nathan Harmston on 2 Apr 2010 19:22 Thanks everyone, the invRegexInf is perfect. Thanks again, Nathan On 1 April 2010 10:17, Gabriel Genellina <gagsl-py2(a)yahoo.com.ar> wrote: > En Wed, 31 Mar 2010 12:23:48 -0300, Paul McGuire <ptmcg(a)austin.rr.com> > escribió: >> >> On Mar 31, 5:49 am, Nathan Harmston <iwanttobeabad...(a)googlemail.com> >> wrote: >>> >>> I have a slightly complicated/medium sized regular expression and I >>> want to generate all possible words that it can match (to compare >>> performance of regex against an acora based matcher). >> >> The pyparsing wiki Examples page includes this regex inverter: >> http://pyparsing.wikispaces.com/file/view/invRegex.py >> >>> From the module header: >> >> # Supports: >> # - {n} and {m,n} repetition, but not unbounded + or * repetition >> # - ? optional elements >> # - [] character ranges >> # - () grouping >> # - | alternation > > I took the liberty of modifying your invRegex.py example, adding support > for infinite repeaters. It depends on two other modules: > > mergeinf.py (from http://code.activestate.com/recipes/577041) provides the > infinite merge operation. > > enumre.py provides the basic functions (merge, prod, repeat, closure) > necessary to enumerate the language generated by a given regular > expression, even if it contains unbounded repeaters like *,+. The key is > to generate shorter strings before longer ones, so in 'a*|b*' it doesn't > get stuck generating infinite a's before any b. > > By example, "(a|bc)*d" corresponds to this code: > > prod( > closure( > merge( > 'a', > prod('b','c'))), > 'd') > > which returns an infinite generator starting with: > > d > ad > aad > bcd > aaad > abcd > bcad > aaaad > aabcd > abcad > bcaad > bcbcd > aaaaad > aaabcd > aabcad > ... > > > I got the idea from > http://userweb.cs.utexas.edu/users/misra/Notes.dir/RegExp.pdf > > Finally, invRegexInf.py is based on your original regex parser. I only > modified the generation part, taking advantage of the above > infrastructure; the parser itself remains almost the same. It essentially > saves oneself the very tedious work of converting a regular expression > into the equivalent sequence of function calls as shown above. (I hope I > got it right: I like pyparsing a lot and use it whenever I feel it's > appropriate, but not as often as to remember the details...) > > -- > Gabriel Genellina > -- > http://mail.python.org/mailman/listinfo/python-list > >
First
|
Prev
|
Pages: 1 2 Prev: No editbox () functionality in python-dialog module? Next: Video is very funny..hhhhhhhhhhhh |