From: Harry on 11 Apr 2010 02:49

Hi,

My intent here is to be able to look for fully upper-cased words only, and filter out the rest.
I was, thus, expecting the following egrep to fail.

$ echo "Abby" | egrep '^[A-Z]+$'
Abby

But for some reason, egrep is able to match the extended regex for the mixed-case input.

Does anyone know what I'm missing here?

Many thanks in advance,
/HS

PS:
I'm using Gnu bash 4.0.23 and Gnu grep 2.5.3 on Fedora 11.

From: Sidney Lambe on 11 Apr 2010 03:08

On comp.unix.shell, Harry <simonsharry(a)gmail.com> wrote:
> Hi,
>
> My intent here is to be able to look for fully upper-cased words only,
> and filter out the rest.
> I was, thus, expecting the following egrep to fail.
>
> $ echo "Abby" | egrep '^[A-Z]+$'
> Abby
>
> But for some reason, egrep is able to match the extended regex for the
> mixed-case input.
>
> Does anyone know what I'm missing here?
>
> Many thanks in advance,
> /HS
>
> PS:
> I'm using Gnu bash 4.0.23 and Gnu grep 2.5.3 on Fedora 11.

$ echo Abby | egrep -v '[a-z]+'
$ echo ABBY | egrep -v '[a-z]+'
ABBY

Sid

From: Huibert Bol on 11 Apr 2010 03:28

Harry wrote:
> My intent here is to be able to look for fully upper-cased words only,
> and filter out the rest.
> I was, thus, expecting the following egrep to fail.
>
> $ echo "Abby" | egrep '^[A-Z]+$'
> Abby
>
> But for some reason, egrep is able to match the extended regex for the
> mixed-case input.

Ranges are only meaningful in the "POSIX" locale; use the character classes instead:

grep -E '^[[:upper:]]+$'

--
Huibert
"Okay... really not something I needed to see." --Raven

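A minimal sketch of the suggestion above, assuming a grep that supports `-E` (equivalent to egrep; without it, `+` is not treated as a repetition operator). The helper name `only_upper` is hypothetical and used only for illustration. In locales such as en_US.UTF-8, the collation order may interleave upper and lower case, so a range like `[A-Z]` can end up covering most lower-case letters, whereas `[[:upper:]]` does not depend on collation order.

```shell
# Hypothetical helper: pass through only lines that consist
# entirely of upper-case letters, using a character class
# instead of a locale-dependent range.
only_upper() {
    grep -E '^[[:upper:]]+$'
}

# Of the three sample words, only the fully upper-case one is kept.
printf '%s\n' Abby ABBY 123 | only_upper
```

With this input the pipeline should print only ABBY.
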
From: Harry on 11 Apr 2010 03:55

On Apr 11, 12:28 pm, Huibert Bol <huibert....(a)quicknet.nl> wrote:
> Ranges are only meaningful in the "POSIX" locale; use the character
> classes instead:
>
> grep -E '^[[:upper:]]+$'

Didn't know that, thanks!
(Also wondering, btw, how come I never ran into this issue before!)

Here's what my locale is:

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

I tried the easier way, didn't work:

$ LC_ALL=POSIX echo "Abby" | egrep '^[A-Z]+$'
Abby

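One likely reason the attempt above still matched: a variable assignment placed in front of a command applies only to that command, so here `LC_ALL=POSIX` reaches `echo` while `egrep` keeps running in en_US.UTF-8. A sketch of the same test with the assignment moved onto grep, assuming the C locale (an alias for POSIX); the helper name `is_upper_word` is hypothetical.

```shell
# Hypothetical helper: succeed only if the argument consists
# entirely of ASCII upper-case letters.
is_upper_word() {
    # LC_ALL=C is set for grep itself, the command that
    # actually interprets the [A-Z] range.
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Z]+$'
}

for word in Abby ABBY; do
    if is_upper_word "$word"; then
        echo "$word: match"
    else
        echo "$word: no match"
    fi
done
```

Exporting `LC_ALL` for the whole shell session would have the same effect on grep, at the cost of changing the locale for every other command as well.
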
From: Harry on 11 Apr 2010 04:00

On Apr 11, 12:08 pm, Sidney Lambe <sidneyla...(a)nospam.invalid> wrote:
> $ echo Abby | egrep -v '[a-z]+'
> $ echo ABBY | egrep -v '[a-z]+'
> ABBY

Both fail to match on my system. I get an exit code of 1.

Secondly, the above regex would match pure numbers also!
As I said, I was looking to match only (and only) upper-case words.

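The objection about numbers can be checked directly. This is a sketch only, assuming the C locale so that `[a-z]` and `[A-Z]` mean exactly the ASCII letters; the helper names `no_lowercase` and `all_uppercase` are hypothetical.

```shell
# Hypothetical helper: drop any line containing a lower-case letter.
# Lines with no letters at all (such as digits) still pass.
no_lowercase() {
    LC_ALL=C grep -Ev '[a-z]+'
}

# Hypothetical helper: keep only lines made up entirely of
# upper-case letters.
all_uppercase() {
    LC_ALL=C grep -E '^[A-Z]+$'
}

printf '%s\n' Abby ABBY 123 | no_lowercase
echo "---"
printf '%s\n' Abby ABBY 123 | all_uppercase
```

The first pipeline lets both ABBY and 123 through, while the anchored pattern keeps only ABBY.
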