From: Johny on
I have a text that I would like to split into smaller parts of,
say, 100 characters each. If the 100th character falls inside a word
rather than on a blank, the part must be shorter than 100 characters,
because a word itself must not be split.
Each part must contain only whole (unsplit) words.
I was thinking about a regular expression but do not know how to write
the correct one.
Can anyone help?
Thanks
L.
From: Tim Golden on
On 23/03/2010 10:48, Johny wrote:
> I have a text and would like to split the text into smaller parts,
> say into 100 characters each. But if the 100th character is not a
> blank ( but word) this must be less than 100 character.That means the
> word itself can not be split.
> These smaller parts must contains only whole( not split) words.
> I was thinking about RegEx but do not know how to find the correct
> Regular Expression.
> Can anyone help?
> Thanks
> L.

Have a look at the textwrap module
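
For illustration, here is a minimal sketch of how textwrap could be applied to the question above. The helper name split_text and the sample sentence are made up for this example. textwrap.wrap breaks over-long words by default, so the sketch passes break_long_words=False and break_on_hyphens=False on the assumption that a word must never be split, even if that leaves a single over-long word on a line of its own.

```python
import textwrap


def split_text(text, size=100):
    """Hypothetical helper: split text into lines of at most `size`
    characters, breaking only at whitespace."""
    return textwrap.wrap(
        text,
        width=size,
        break_long_words=False,   # never cut a word that exceeds `size`
        break_on_hyphens=False,   # treat hyphenated words as one word
    )


sample = ("I have a text and would like to split the text into smaller "
          "parts without splitting any word")
for part in split_text(sample, size=40):
    print("%d: [%s]" % (len(part), part))
```

Note that textwrap drops the whitespace at each break, so the parts have to be rejoined with a single space rather than concatenated directly.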

TJG
From: Tim Chase on
Johny wrote:
> I have a text and would like to split the text into smaller parts,
> say into 100 characters each. But if the 100th character is not a
> blank ( but word) this must be less than 100 character.That means the
> word itself can not be split.
> These smaller parts must contains only whole( not split) words.
> I was thinking about RegEx but do not know how to find the correct
> Regular Expression.

While I suspect you can come close with a regular expression:

import random
import re

size = 100
# up to `size` characters, backtracking until the chunk ends on a word boundary
r = re.compile(r'.{1,%i}\b' % size)
# generate a random text string with a mix of word-lengths
words = ['a', 'an', 'the', 'four', 'fives', 'sixsix']
data = ' '.join(random.choice(words) for _ in range(200))
# for each chunk of 100 characters (or fewer
# if on a word-boundary), do something
for bit in r.finditer(data):
    chunk = bit.group(0)
    print("%i: [%s]" % (len(chunk), chunk))

it may have an EOF fencepost error, so you might have to clean up
the last item. My simple test seemed to show it worked without
cleanup though.
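
One case where the fencepost concern does seem to bite is text that ends in punctuation: there is no \b after a final ".", so that character is left out of the last chunk. The sketch below shows this and one possible alternative that anchors chunks on whitespace instead of \b. The helper name split_on_whitespace is made up for this example, and it assumes size is at least 2 and that no single word is longer than size (a longer word would not be kept whole).

```python
import re


def split_on_whitespace(text, size=100):
    """Hypothetical helper: chunks of at most `size` characters that
    start and end on a non-whitespace character."""
    # \S             first character of the chunk
    # (?:.{0,n}\S)?  optional remainder, ending on non-whitespace
    # (?=\s|\Z)      chunk must be followed by whitespace or end of text
    pattern = re.compile(r'\S(?:.{0,%d}\S)?(?=\s|\Z)' % (size - 2))
    return [m.group(0) for m in pattern.finditer(text)]


text = "Hello world."

# the \b-based pattern loses the trailing "."
print([m.group(0) for m in re.finditer(r'.{1,100}\b', text)])
# prints ['Hello world']

# the whitespace-based sketch keeps it
print(split_on_whitespace(text, 100))
# prints ['Hello world.']
```

Unlike the \b version, this drops the whitespace between chunks, so the chunks cannot simply be concatenated to get the original text back.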

-tkc


