From: andrew cooke on

This is a bit embarrassing, but I seem to be misunderstanding how \b
works in regexps.

Please can someone explain why the following fails:

from re import compile

p = compile(r'\bword\b')
m = p.match(' word ')
assert m

My understanding is that \b matches a space at the start or end of a
word, and that "word" is a word - http://docs.python.org/library/re.html

What am I missing here? I suspect I am doing something very stupid.

Thanks,
Andrew
From: Duncan Booth on
andrew cooke <andrew(a)acooke.org> wrote:

> Please can someone explain why the following fails:
>
> from re import compile
>
> p = compile(r'\bword\b')
> m = p.match(' word ')
> assert m
>
> My understanding is that \b matches a space at the start or end of a
> word, and that "word" is a word - http://docs.python.org/library/re.html
>
> What am I missing here? I suspect I am doing something very stupid.
>

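You misunderstand what \b does: it doesn't match a space, it matches a 0
length string on a boundary between a non-word and a word. Your match()
call fails because match() only tries position 0, and there is no such
boundary between the start of the string and the leading space.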

Try:

p.match(' word ', 1).group(0)

and you'll see that you are only matching the word, not the surrounding
spaces.
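
For comparison, here is a small sketch of the three calls side by side. The
variable names (text, m_start, m_pos, m_search) are just illustrative; the
only assumption is the standard re module:

```python
import re

p = re.compile(r'\bword\b')
text = ' word '

# match() only tries position 0. The start of the string is followed by a
# space, so there is no word boundary there and the match fails.
m_start = p.match(text)

# Passing pos=1 starts the attempt at the 'w', where \b does hold.
m_pos = p.match(text, 1)

# search() scans forward and finds the same place on its own.
m_search = p.search(text)

print(m_start)                # None
print(repr(m_pos.group(0)))   # 'word'
print(m_search.span())        # (1, 5)
```

Note that the span is (1, 5): the \b on either side consumed nothing, so the
spaces are not part of the match.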
From: andrew cooke on
On May 29, 11:24 am, Duncan Booth <duncan.bo...(a)invalid.invalid>
wrote:
> andrew cooke <and...(a)acooke.org> wrote:
> > Please can someone explain why the following fails:
>
> >         from re import compile
>
> >         p = compile(r'\bword\b')
> >         m = p.match(' word ')
> >         assert m
[...]
> You misunderstand what \b does: it doesn't match a space, it matches a 0
> length string on a boundary between a non-word and a word.
[...]

That's what I thought it did... Then I read the docs and confused
"empty string" with "space"(!) and convinced myself otherwise. I
think I am going senile.

Thanks very much!
Andrew
From: John Machin on
On May 30, 1:30 am, andrew cooke <and...(a)acooke.org> wrote:

>
> That's what I thought it did...  Then I read the docs and confused
> "empty string" with "space"(!) and convinced myself otherwise.  I
> think I am going senile.

Not necessarily. Conflating concepts like "string containing
whitespace", "string containing space(s)", "empty aka 0-length
string", None, (ASCII) NUL, and (SQL) NULL appears to be an age-
independent problem :-)
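
A rough sketch of how a few of these differ in Python (the variable names are
made up for illustration, and SQL NULL is left out since it lives on the
database side):

```python
empty = ""            # 0-length string
space = " "           # one space character
whitespace = " \t\n"  # whitespace, but not only spaces
nul = "\x00"          # one (ASCII) NUL character
nothing = None        # no string at all

assert len(empty) == 0 and not empty          # empty string is falsy
assert len(space) == 1 and space.isspace()    # a space is a real character
assert whitespace.isspace() and whitespace != space
assert len(nul) == 1 and not nul.isspace()    # NUL is neither empty nor whitespace
assert nothing is None and nothing != empty   # None is not the empty string
```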