From: dasacc22 on 9 May 2010 00:24
On May 8, 2:46 pm, Steven D'Aprano <st...(a)REMOVE-THIS-cybersource.com.au> wrote:
> On Sat, 08 May 2010 12:15:22 -0700, Wolfram Hinderer wrote:
> > On 8 May, 20:46, Steven D'Aprano <st...(a)REMOVE-THIS-cybersource.com.au>
> > wrote:
>
> >> def get_leading_whitespace(s):
> >>     t = s.lstrip()
> >>     return s[:len(s)-len(t)]
>
> >> >>> c = get_leading_whitespace(a)
> >> >>> assert c == leading_whitespace
>
> >> Unless your strings are very large, this is likely to be faster than
> >> any other pure-Python solution you can come up with.
>
> > Returning s[:-1 - len(t)] is faster.
>
> I'm sure it is. Unfortunately, it's also incorrect.
>
> >>> z = "*****abcde"
> >>> z[:-1-5]
> '****'
> >>> z[:len(z)-5]
> '*****'
>
> However, s[:-len(t)] should be both faster and correct.
>
> --
> Steven

This is without a doubt faster and simpler than any solution thus far. Thank you for this.
From: Steven D'Aprano on 9 May 2010 01:13
On Sat, 08 May 2010 13:46:59 -0700, Mark Dickinson wrote:

>> However, s[:-len(t)] should be both faster and correct.
>
> Unless len(t) == 0, surely?

Doh! The hazards of insufficient testing. Thanks for catching that.

--
Steven
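
[Editor's note] A minimal sketch of the edge case Mark Dickinson points out, comparing the original slice with the negative-index shortcut. The name get_leading_whitespace comes from earlier in the thread; get_leading_whitespace_negative_index is a hypothetical name used only for this illustration.

```python
def get_leading_whitespace(s):
    """Return the run of whitespace at the start of s."""
    t = s.lstrip()
    # len(s) - len(t) is never negative, so this also works when t is empty.
    return s[:len(s) - len(t)]


def get_leading_whitespace_negative_index(s):
    """The s[:-len(t)] shortcut (hypothetical name, for comparison only)."""
    t = s.lstrip()
    # When len(t) == 0, -len(t) is 0 and s[:0] is the empty string,
    # so a string made up entirely of whitespace is reported as having none.
    return s[:-len(t)]


if __name__ == "__main__":
    for sample in ["  abc", "abc", "   ", ""]:
        print(repr(sample),
              repr(get_leading_whitespace(sample)),
              repr(get_leading_whitespace_negative_index(sample)))
```

The two versions agree whenever the string contains at least one non-whitespace character; they differ only when lstrip() leaves nothing behind.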
From: Steven D'Aprano on 9 May 2010 02:25
On Sat, 08 May 2010 14:27:32 -0700, dasacc22 wrote:

> U presume entirely to much. I have a preprocessor that normalizes
> documents while performing other more complex operations. Theres
> nothing buggy about what im doing

I didn't *presume* anything, I took your example code and ran it and discovered that it didn't do what you said it was doing.

--
Steven
From: Wolfram Hinderer on 9 May 2010 08:58
On 8 May, 21:46, Steven D'Aprano <st...(a)REMOVE-THIS-cybersource.com.au> wrote:
> On Sat, 08 May 2010 12:15:22 -0700, Wolfram Hinderer wrote:
> > Returning s[:-1 - len(t)] is faster.
>
> I'm sure it is. Unfortunately, it's also incorrect.

> However, s[:-len(t)] should be both faster and correct.

Ouch. Thanks for correcting me. No, I'll never tell how that -1 crept in...
From: John Machin on 9 May 2010 09:28
dasacc22 <dasacc22 <at> gmail.com> writes:

>
> U presume entirely to much. I have a preprocessor that normalizes
> documents while performing other more complex operations. Theres
> nothing buggy about what im doing

Are you sure?

Your "solution" calculates (the number of leading whitespace characters) + (the number of TRAILING whitespace characters).

Problem 1: including TRAILING whitespace.
Example: "content" + 3 * " " + "\n" has 4 leading spaces according to your reckoning; should be 0.
Fix: use lstrip() instead of strip()

Problem 2: assuming all whitespace characters have *effective* width the same as " ".
Examples: TAB has width 4 or 8 or whatever you want it to be. There are quite a number of whitespace characters, even when you stick to ASCII. When you look at Unicode, there are heaps more. Here's a list of BMP characters such that character.isspace() is True, showing the Unicode codepoint, the Python repr(), and the name of the character (other than for control characters):

U+0009 u'\t' ?
U+000A u'\n' ?
U+000B u'\x0b' ?
U+000C u'\x0c' ?
U+000D u'\r' ?
U+001C u'\x1c' ?
U+001D u'\x1d' ?
U+001E u'\x1e' ?
U+001F u'\x1f' ?
U+0020 u' ' SPACE
U+0085 u'\x85' ?
U+00A0 u'\xa0' NO-BREAK SPACE
U+1680 u'\u1680' OGHAM SPACE MARK
U+2000 u'\u2000' EN QUAD
U+2001 u'\u2001' EM QUAD
U+2002 u'\u2002' EN SPACE
U+2003 u'\u2003' EM SPACE
U+2004 u'\u2004' THREE-PER-EM SPACE
U+2005 u'\u2005' FOUR-PER-EM SPACE
U+2006 u'\u2006' SIX-PER-EM SPACE
U+2007 u'\u2007' FIGURE SPACE
U+2008 u'\u2008' PUNCTUATION SPACE
U+2009 u'\u2009' THIN SPACE
U+200A u'\u200a' HAIR SPACE
U+200B u'\u200b' ZERO WIDTH SPACE
U+2028 u'\u2028' LINE SEPARATOR
U+2029 u'\u2029' PARAGRAPH SEPARATOR
U+202F u'\u202f' NARROW NO-BREAK SPACE
U+205F u'\u205f' MEDIUM MATHEMATICAL SPACE
U+3000 u'\u3000' IDEOGRAPHIC SPACE

Hmmm, looks like all kinds of widths, from zero upwards.
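
[Editor's note] A rough Python 3 sketch of the two problems described above, plus one way such a list of whitespace characters could be generated. It assumes the disputed count was computed as len(s) - len(s.strip()), which the thread implies but does not show; bmp_whitespace is a hypothetical helper name. The exact set of characters for which isspace() is True depends on the Python version and its Unicode database, so the output may not match the 2010 list entry for entry.

```python
import unicodedata

# Problem 1: strip() removes whitespace from both ends, lstrip() only from the left.
line = "content" + 3 * " " + "\n"
count_with_strip = len(line) - len(line.strip())    # trailing whitespace is counted too
count_with_lstrip = len(line) - len(line.lstrip())  # leading whitespace only

# Problem 2: a TAB is one character, but its displayed width depends on the tab stop.
width_at_4 = len("\t".expandtabs(4))
width_at_8 = len("\t".expandtabs(8))


def bmp_whitespace():
    """Yield (codepoint, char, name) for each BMP character where isspace() is True.

    Characters without a Unicode name (control characters) get "?" as the name.
    """
    for cp in range(0x10000):
        ch = chr(cp)
        if ch.isspace():
            yield cp, ch, unicodedata.name(ch, "?")


if __name__ == "__main__":
    print(count_with_strip, count_with_lstrip, width_at_4, width_at_8)
    for cp, ch, name in bmp_whitespace():
        print("U+%04X %r %s" % (cp, ch, name))
```

Counting characters only gives an indentation width if every leading character is known to occupy one column, which is the assumption Problem 2 calls into question.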