From: galileo228 on 11 Feb 2010 14:11
Hey All,

Been teaching myself Python for a few weeks, and am trying to write a program that will go to a url, enter a string in one of the search fields, submit the search, and return the contents of the search result.

I'm using httplib2.

My two particular questions:

1) When I set my 'body' var, (i.e. 'body = {'query':'search_term'}), how do I know what the particular key should be? In other words, how do I tell python which form on the web page I'm visiting I'd like to fill in? Do I simply go to the webpage itself and look at the html source? But if that's the case, which tag tells me the name of the key?

2) Even once python fills in the form properly, how can I tell it to 'submit' the search?

Thanks all!

Matt
From: Ken Seehart on 11 Feb 2010 14:58
"Use tamperdata to view and modify HTTP/HTTPS headers and post parameters... "
https://addons.mozilla.org/en-US/firefox/addon/966

Enjoy,
Ken
From: Terry Reedy on 11 Feb 2010 18:20
This http://groups.csail.mit.edu/uid/sikuli/ *might* help you.
From: Javier Collado on 12 Feb 2010 02:59
Hello,

I haven't used httplib2, but you can certainly use any other alternative to send HTTP requests:
- urllib/urllib2
- mechanize

As for how to find the form you're looking for, you may:
- create the HTTP request on your own with urllib2. To find out which variables you need to post, you can use the tamperdata Firefox addon as suggested (I haven't used that one) or httpfox (I have, and it works great).
- use mechanize to locate the form for you, fill in the data and click on the submit button.

Additionally, you may want to scrape some data that may be useful for your requests. For that, BeautifulSoup is a good solution (with some Firebug help to visually locate what you're looking for).

Best regards,
Javier

P.S. Some examples here:
http://www.packtpub.com/article/web-scraping-with-python
http://www.packtpub.com/article/web-scraping-with-python-part-2
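To illustrate the "look at the HTML source" route in code, here is a minimal sketch that lists each form on a page together with the names of its fields. It is only one possible approach: it uses Python 3's standard html.parser module rather than mechanize or BeautifulSoup (a substitution made to keep the example dependency-free), and the class name FormFieldLister and the SAMPLE_HTML page are made up for illustration. The point it tries to show is that the key sent to the server is the `name` attribute of each input, and that the form's `action` and `method` say where and how the data is submitted.

```python
from html.parser import HTMLParser


class FormFieldLister(HTMLParser):
    """Collect each form's action/method and the names of its fields."""

    def __init__(self):
        super().__init__()
        self.forms = []        # one dict per <form> found
        self._current = None   # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:  # only controls with a name are sent to the server
                self._current["fields"].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# SAMPLE_HTML is a made-up page, not copied from any real site.
SAMPLE_HTML = """
<form action="/search" method="get">
  <input name="q" title="Search" value="">
  <input type="submit" name="btn" value="Go">
</form>
"""

parser = FormFieldLister()
parser.feed(SAMPLE_HTML)
for form in parser.forms:
    print(form["method"], form["action"], form["fields"])
```

For a real page, the HTML string would come from the response body of a plain GET request instead of SAMPLE_HTML.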
From: galileo228 on 13 Feb 2010 15:03
Thank you all for your responses, and Javier, thank you for your longer response. I've just downloaded mechanize and beautifulsoup and will start to play around.

From a pure learning standpoint, however, I'd really like to learn how to use the python post method (without mechanize) to go to a webpage, fill in a form, click 'submit', follow the redirect to the results page, and download content.

For example, if I go to google.com, use firebug and click on the search bar, the following HTML is highlighted:

<input value="" title="Google Search" class="lst" size="55" name="q" maxlength="2048" onblur="google&&google.fade&&google.fade()" autocomplete="off">

So if I were to use the 'post' method, how can I tell from the code above what the ID of the search bar is? Is it 'value', 'name', or neither?

Assuming that the ID is 'name', then to search google for the term 'olympics' would the proper code be:

    import httplib2
    import urllib
    data = {'q':'olympics'}
    body = urllib.urlencode(data)
    h = httplib2.Http()
    resp, content = h.request("http://www.google.com", method="POST", body=body)
    print content;

Does content return the content of the 'search results' page? And if not, how do I tell python to do that?

Finally, must I transmit headers, or are they optional?

Thanks all for your continued help!

Matt
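On the POST questions above, a hedged sketch may help. In an HTML form, the key sent for a field is its `name` attribute ('q' in the snippet quoted above), while `value` is only the pre-filled content. "Submitting" amounts to sending the encoded fields to the URL in the form's `action` attribute, using the form's `method`. The code below is written for Python 3's urllib (the thread's code is Python 2 with httplib2), the helper name build_form_request is invented for illustration, and example.com with a "/search" action is a placeholder rather than a real form. It only builds the request objects and does not contact any server.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request


def build_form_request(page_url, action, method, fields):
    """Build (but do not send) the request a browser would make on submit."""
    target = urljoin(page_url, action)   # the action may be a relative URL
    encoded = urlencode(fields)          # e.g. "q=olympics"
    if method.lower() == "post":
        # POST: the encoded fields travel in the request body.
        return Request(
            target,
            data=encoded.encode("ascii"),
            headers={"Content-Type": "application/x-www-form-urlencoded"},
            method="POST",
        )
    # GET: the encoded fields become the query string.
    return Request(target + "?" + encoded, method="GET")


# example.com and the "/search" action are placeholders, not a real form.
get_req = build_form_request("http://example.com/", "/search", "get", {"q": "olympics"})
post_req = build_form_request("http://example.com/", "/search", "post", {"q": "olympics"})

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Passing either object to urllib.request.urlopen() would send it, and urlopen follows redirects by default, so the body read from the response would be that of the final page. Whether a site accepts POST for a given form depends on that form's `method` attribute, so it is worth checking the form tag before assuming POST. As for headers, a form-encoded POST body is normally accompanied by the Content-Type header shown above; other headers are often optional, though some sites may reject requests without a User-Agent they recognise.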