From: Johny on 23 Mar 2010 06:48
I have a text and would like to split it into smaller parts of, say, 100 characters each. But if the 100th character is not a blank (i.e. it falls inside a word), the part must be shorter than 100 characters, because the word itself cannot be split. The smaller parts must contain only whole (unsplit) words. I was thinking about a regex but do not know how to find the correct regular expression. Can anyone help?
Thanks
L.
From: Tim Golden on 23 Mar 2010 06:54
On 23/03/2010 10:48, Johny wrote:
> I have a text and would like to split it into smaller parts of,
> say, 100 characters each. But if the 100th character is not a
> blank (i.e. it falls inside a word), the part must be shorter than
> 100 characters, because the word itself cannot be split.
> The smaller parts must contain only whole (unsplit) words.
> I was thinking about a regex but do not know how to find the correct
> regular expression.
> Can anyone help?
> Thanks
> L.

Have a look at the textwrap module.

TJG
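To make the textwrap suggestion concrete, here is a minimal sketch. The sample text is invented for illustration. Passing break_long_words=False and break_on_hyphens=False is an assumption about what the original poster wants (never split a word); with those settings a single word longer than the width ends up on an over-long line of its own instead of being cut.

```python
import textwrap

# Sample text made up for this example; any string works.
text = "The quick brown fox jumps over the lazy dog. " * 8

# width=100 caps each chunk at 100 characters.
# break_long_words=False / break_on_hyphens=False keep every word whole
# (an assumption about the requirement, not something the thread states).
chunks = textwrap.wrap(
    text,
    width=100,
    break_long_words=False,
    break_on_hyphens=False,
)

for chunk in chunks:
    print("%i: [%s]" % (len(chunk), chunk))
```

Note that textwrap.wrap also drops the whitespace at the chunk boundaries, so the pieces can be rejoined with single spaces.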
From: Tim Chase on 23 Mar 2010 13:31
Johny wrote:
> I have a text and would like to split it into smaller parts of,
> say, 100 characters each. But if the 100th character is not a
> blank (i.e. it falls inside a word), the part must be shorter than
> 100 characters, because the word itself cannot be split.
> The smaller parts must contain only whole (unsplit) words.
> I was thinking about a regex but do not know how to find the correct
> regular expression.

While I suspect you can come close with a regular expression:

    import re, random

    size = 100
    r = re.compile(r'.{1,%i}\b' % size)
    # generate a random text string with a mix of word-lengths
    words = ['a', 'an', 'the', 'four', 'fives', 'sixsix']
    data = ' '.join(random.choice(words) for _ in range(200))
    # for each chunk of 100 characters (or fewer
    # if on a word-boundary), do something
    for bit in r.finditer(data):
        chunk = bit.group(0)
        print("%i: [%s]" % (len(chunk), chunk))

it may have an EOF fencepost error, so you might have to clean up the last item. My simple test seemed to show it worked without cleanup though.

-tkc
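One case where the end-of-text concern does show up is text that ends in punctuation: \b does not match after a final ".", so the last character is dropped. The sketch below shows that and one possible variant that anchors chunks on whitespace instead of \b. The helper name split_whole_words is hypothetical and not from the thread, and the variant assumes size >= 2 and that no single word is longer than size (an over-long word would be skipped, not split).

```python
import re

def split_whole_words(text, size=100):
    """Hypothetical helper: split text into chunks of at most `size`
    characters that start and end on non-whitespace characters.

    Assumes size >= 2 and that no word is longer than `size`;
    an over-long word is skipped rather than split.
    """
    pattern = re.compile(
        # start at the beginning of a word, take up to `size` characters,
        # and stop only where whitespace or the end of the text follows
        r'(?<!\S)\S(?:.{0,%d}\S)?(?=\s|$)' % (size - 2),
        re.DOTALL,
    )
    return pattern.findall(text)

sample = "Hello world."

# The \b-based pattern loses the trailing "." because there is no
# word boundary between "." and the end of the string.
print(re.findall(r'.{1,10}\b', sample))   # ['Hello ', 'world']

# The whitespace-anchored variant keeps it.
print(split_whole_words(sample, 10))      # ['Hello', 'world.']
```

Unlike the \b version, the chunks carry no trailing blanks, so for single-spaced text they can be rejoined with ' '.join(...).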