From: Zhang Weiwu on 5 Aug 2010 22:00

On 2010-08-04 00:57, Bob McGowan wrote:
> I would suspect
> the regex engine is still honoring '. (dot) does not match newline'
> convention but is OK with literals, if present.

It may be a bug in the grep implementation. If your theory held true, the following should match, but it doesn't:

    $ printf "a\nb" | grep -z 'a[^a]*b'
    $

The reasoning: if dot does not match newline, as in Java (verified in the Java-based jEdit editor), then dot is equivalent to [^\n], while a negated bracket expression such as [^a] should still be able to match the newline. In that case an easy workaround would be to replace dot with [^[:something-horribly-non-existent:]].

P.S. An interesting design might be this: in an RE implementation where dot does not match newline, the same implementation could allow [^] to mean "really matches anything". But the Java people didn't do that. They introduced a "DOTALL" mode: in this mode dot matches anything, otherwise it means [^\n]. To my understanding, the extra mode only makes things more complicated.

--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
Archive: http://lists.debian.org/4C5B6C85.7080102(a)realss.com

From: Bob McGowan on 6 Aug 2010 18:50

On 08/05/2010 06:49 PM, Zhang Weiwu wrote:
> On 2010-08-04 04:55, Bob McGowan wrote:
>> In fact, the LC_ names all seem to be specific to things
>> that would not necessarily impact the regex operation.
>>
> It is not totally true. The encoding part might. If it is UTF-8, in
> theory, [:digit:] should match more than 0-9. It might, for example,
> match 一-十 (Chinese digits).
>

My point is that changing only the LANG environment variable changed the way 'grep' dealt with the newline character.

Admittedly, I did not use a very strict interpretation or understanding of the LC_ variables; you could say I arbitrarily decided to go top-down in trying changes. Either way, I got lucky: the first choice changed the behavior. The others may also change things, but it didn't seem relevant to try every one, as changing one was enough to prove the point.

--
Bob McGowan
Archive: http://lists.debian.org/4C5C8F90.50703(a)symantec.com

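The experiment described above (same pattern, different LANG) can be reproduced with a small loop. This is only a sketch: the test pattern is borrowed from Zhang Weiwu's earlier message, the function name compare_locales and the two locale names are assumptions (pick ones that `locale -a` lists on your system), and the counts you see will depend on your grep version, so no expected output is shown.

```shell
#!/bin/sh
# Run the same multi-line test under several locales and report the match count.
# compare_locales is a hypothetical helper name; the locale list is an assumption.
compare_locales() {
    for loc in C en_US.UTF-8; do
        # -z treats the whole input as one NUL-terminated record,
        # -c prints how many records matched (0 or 1 here).
        # LC_ALL is emptied so that it does not override LANG; any
        # individual LC_* variables already set still take precedence.
        count=$(printf 'a\nb' | LANG="$loc" LC_ALL= grep -zc 'a[^a]*b' || true)
        printf 'LANG=%s matches=%s\n' "$loc" "$count"
    done
}

compare_locales
```

If the two lines report different counts, the locale is what changes how the newline is handled on that system; if they agree, the installed grep may simply behave differently from the 2010 version discussed here.
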
From: Alan Greenberger on 13 Aug 2010 14:10

On 2010-08-02, Zhang Weiwu <zhangweiwu(a)realss.com> wrote:
>
> however, it does not match the following:
>
> select * from mytable where id=1
> and name='foo'";
>

How about

    sed -n '{/id=1/N;/name=.foo/p;d;}'

Archive: http://lists.debian.org/slrni6b05b.ada.alanjg(a)archduke.router

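To see what the sed one-liner above does, it can be run on a few sample lines. The sketch below wraps it in a function named find_split_query (a hypothetical name), and the input lines other than the quoted two-line query are made up for illustration.

```shell
#!/bin/sh
# find_split_query is a hypothetical wrapper around the one-liner from the post.
#   /id=1/N       if the line contains id=1, append the next line to the pattern space
#   /name=.foo/p  print the pattern space if it contains name=<any char>foo
#   d             discard the pattern space and start the next cycle
find_split_query() {
    sed -n '{/id=1/N;/name=.foo/p;d;}'
}

# Sample input (assumed): an unrelated query, the two-line query, a trailing line.
printf '%s\n' \
    "select * from users where id=2" \
    "select * from mytable where id=1" \
    "and name='foo'\";" \
    "select 1" | find_split_query
```

On this input the function prints the two lines of the split query and nothing else. Two caveats: the pattern id=1 is unanchored, so it would also match id=10, and only one following line is joined, so a query spread over three or more lines would not be caught.
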