From: james_027 on 28 Apr 2010 09:02

Hi,

Any idea how I can replace words in an HTML file? I mean that only the content should be replaced, while the HTML tags, JavaScript, and CSS remain untouched.

Thanks,
James
From: Daniel Fetchinson on 28 Apr 2010 16:03

> Any idea how I can replace words in an HTML file? I mean that only the
> content should be replaced, while the HTML tags, JavaScript, and CSS
> remain untouched.

I'm not sure what you have tried and what you haven't, but as a first trial you might want to

<untested>

f = open( 'new.html', 'w' )
f.write( open( 'index.html' ).read( ).replace( 'replace-this', 'with-that' ) )
f.close( )

</untested>

HTH,
Daniel

--
Psss, psss, put it down! - http://www.cafepress.com/putitdown
From: Luap777 on 28 Apr 2010 17:03

On Apr 28, 8:02 am, james_027 <cai.hai...(a)gmail.com> wrote:
> Hi,
>
> Any idea how I can replace words in an HTML file? I mean that only the
> content should be replaced, while the HTML tags, JavaScript, and CSS
> remain untouched.
>
> Thanks,
> James

You might try cleaning the HTML with uTidy (http://utidylib.berlios.de/) to produce XHTML, then using Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/documentation.html) to process it.

If the number of files isn't that large and it's a one-time thing, you might do just as well using search and replace on the directory and previewing each replacement as you go.
From: Cameron Simpson on 28 Apr 2010 17:31

On 28Apr2010 22:03, Daniel Fetchinson <fetchinson(a)googlemail.com> wrote:
| > Any idea how I can replace words in an HTML file? I mean that only the
| > content should be replaced, while the HTML tags, JavaScript, and CSS
| > remain untouched.
|
| I'm not sure what you have tried and what you haven't, but as a first trial
| you might want to
|
| <untested>
|
| f = open( 'new.html', 'w' )
| f.write( open( 'index.html' ).read( ).replace( 'replace-this', 'with-that' ) )
| f.close( )
|
| </untested>

If 'replace-this' occurs inside the JavaScript etc., or happens to be an HTML tag name, it will get mangled. The OP didn't want that.

The only way to get this right is to parse the file, then walk the document tree, editing only the text parts.

The BeautifulSoup module (3rd party, but a single .py file and trivial to fetch and use, though it has some dependencies) does a good job of this, coping even with typical not-quite-right HTML. It gives you a parse tree you can easily walk, and you can modify it in place and write it straight back out.

Cheers,
--
Cameron Simpson <cs(a)zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

The Web site you seek
cannot be located but
endless others exist
- Haiku Error Messages http://www.salonmagazine.com/21st/chal/1998/02/10chal2.html
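
The "parse, then edit only the text parts" idea can be sketched roughly as follows. Note that this is not BeautifulSoup: to keep the example self-contained it swaps in the standard library's html.parser (Python 3), which streams through the document instead of building a tree. The names TextReplacer and replace_in_text are made up for illustration, and the sketch assumes the word to replace is not split across an entity or a tag boundary.

```python
from html.parser import HTMLParser


class TextReplacer(HTMLParser):
    """Re-emit an HTML document, replacing a word only in page text.

    Hypothetical helper for illustration; not part of any library.
    """

    # Elements whose contents are code rather than page text.
    SKIP = {"script", "style"}

    def __init__(self, old, new):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.old, self.new = old, new
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        # Emit the tag exactly as it appeared, attributes untouched.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are passed through verbatim.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        # The parser lowercases tag names, so </P> comes out as </p>.
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Only text outside <script> and <style> is modified.
        if not self.skip_depth:
            data = data.replace(self.old, self.new)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.out.append("<?%s>" % data)


def replace_in_text(html, old, new):
    """Return html with old replaced by new in text content only."""
    parser = TextReplacer(old, new)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling replace_in_text(page, 'replace-this', 'with-that') on the contents of index.html would leave tag names, attribute values, scripts, and stylesheets alone. It does not attempt to repair broken markup the way a tidy pass or BeautifulSoup would, and constructs such as CDATA sections are not handled.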
From: Daniel Fetchinson on 29 Apr 2010 05:38

> | > Any idea how I can replace words in an HTML file? I mean that only the
> | > content should be replaced, while the HTML tags, JavaScript, and CSS
> | > remain untouched.
> |
> | I'm not sure what you have tried and what you haven't, but as a first trial
> | you might want to
> |
> | <untested>
> |
> | f = open( 'new.html', 'w' )
> | f.write( open( 'index.html' ).read( ).replace( 'replace-this', 'with-that' ) )
> | f.close( )
> |
> | </untested>
>
> If 'replace-this' occurs inside the JavaScript etc., or happens to be an
> HTML tag name, it will get mangled. The OP didn't want that.

Correct, that is why I started with "I'm not sure what you have tried and what you haven't, but as a first trial you might".

For instance, if the OP wants to replace words that he knows do not appear in the JavaScript or CSS, and he knows that these words also do not appear in HTML attribute names or values, etc., then the above approach would work, in which case BeautifulSoup is gigantic overkill.

The OP needs to specify more clearly what he wants before really useful advice can be given.

Cheers,
Daniel

> The only way to get this right is to parse the file, then walk the
> document tree, editing only the text parts.
>
> The BeautifulSoup module (3rd party, but a single .py file and trivial to
> fetch and use, though it has some dependencies) does a good job of this,
> coping even with typical not-quite-right HTML. It gives you a parse
> tree you can easily walk, and you can modify it in place and write it
> straight back out.

--
Psss, psss, put it down! - http://www.cafepress.com/putitdown