From: Dr. Benjamin David Clarke on 25 Mar 2010 03:41

Does anyone know of a way to save a loaded web page to a file after opening it with a webbrowser.open() call?

Specifically, what I want to do is get the raw HTML from a web page. This web page uses JavaScript, and I need the resulting HTML after the JavaScript has been run. I've seen a lot about trying to get Python to run JavaScript, but there doesn't seem to be any promising solution. I can get the raw HTML that I want by manually saving the page after it has been loaded via the webbrowser.open() call. Is there any way to automate this? Does anyone have any ideas for better approaches to this problem? I don't need it to be pretty or anything.
From: Irmen de Jong on 25 Mar 2010 04:13

On 3/25/10 8:41 AM, Dr. Benjamin David Clarke wrote:
> Does anyone know of a way to save a loaded web page to a file after
> opening it with a webbrowser.open() call?
>
> Specifically, what I want to do is get the raw HTML from a web page.
> This web page uses JavaScript. I need the resulting HTML after the
> JavaScript has been run. I've seen a lot about trying to get Python to
> run JavaScript but there doesn't seem to be any promising solution. I
> can get the raw HTML that I want by saving the page after it has been
> loaded via the webbrowser.open() call. Is there any way to automate
> this? Does anyone have any ideas for better approaches to this
> problem? I don't need it to be pretty or anything.

I think I would use an appropriate GUI automation library to simulate user interaction with the web browser that you just started, e.g. to select the File > Save page as > HTML only menu option in the browser.

If the JavaScript heavily modifies the DOM, however, that might not work, because the saved file may contain the original source rather than the modified document. You might then need additional tooling, such as the Web Developer Toolbar for Firefox, where you can use View Source > View Generated Source.

irmen
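For readers looking for a scripted alternative to driving the browser's menus, here is a minimal sketch of one possible approach. It is not something proposed in the thread: it assumes the third-party `selenium` package and a Firefox driver are installed, and the function names `fetch_rendered_html` and `save_html`, along with the fixed wait time, are hypothetical choices made for illustration. The idea is to let a real browser load the page, run its scripts, and then ask it for the current DOM serialized as HTML.

```python
import time
from pathlib import Path


def fetch_rendered_html(url, wait_seconds=5.0):
    """Return the page's HTML as the browser sees it after scripts have run.

    Assumption: the third-party `selenium` package and a Firefox driver
    are installed. The import is done lazily so the rest of this module
    can be used without them.
    """
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Crude fixed delay to give the page's scripts time to modify the DOM.
        # A real script would probably wait for a specific element instead.
        time.sleep(wait_seconds)
        # page_source reflects the browser's current document.
        return driver.page_source
    finally:
        driver.quit()


def save_html(html, path):
    """Write the HTML text to `path` as UTF-8 and return the path."""
    path = Path(path)
    path.write_text(html, encoding="utf-8")
    return path
```

Under those assumptions, usage might look like `save_html(fetch_rendered_html("http://example.com/"), "page.html")`. Whether the result matches what View Generated Source shows depends on the browser and on when the page's scripts finish, so the fixed delay is only a placeholder.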