From: davidgp on 19 Jun 2010 06:23

Hello, I'm new to this group, and quite new to Python!

I'm trying to scrape some address data from bundes-telefonbuch.de, but I've run into a problem. The link looks like this:

    http://www.bundes-telefonbuch.de/cgi-btbneu/chtml/chtml?WA=20

and it is basically the same for every search query, so I need to submit POST data to the web server. I'm trying to do it like this:

    import urllib
    import urllib2

    # Send a browser-like User-Agent with every request
    opener = urllib2.build_opener()
    opener.addheaders = [('User-Agent', 'Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.4 (like Gecko)')]
    urllib2.install_opener(opener)

    # Form fields for the search
    data = urllib.urlencode({'F0': 'mySearchKeyword', 'B': 'T', 'F8': 'A || G', 'W': '1', 'Z': '0', 'HA': '10', 'SAS_static_0_treffer_treffer': 'Suche starten', 'S': '1', 'translationtemplate': 'checkstrasse'})

    url = 'http://www.bundes-telefonbuch.de/cgi-btbneu/chtml/chtml?WA=20'
    response = urllib2.urlopen(url, data)  # passing data makes this a POST

This returns a page saying I have to re-enter my search terms. What's going wrong here?

Thanks!!
From: Rebelo on 19 Jun 2010 08:16

On 19 Jun, 12:23, davidgp <davidvanijzendo...(a)gmail.com> wrote:
> Hello, I'm new to this group, and quite new to Python!
> I'm trying to scrape some address data from bundes-telefonbuch.de, but I've
> run into a problem. The link looks like this:
> http://www.bundes-telefonbuch.de/cgi-btbneu/chtml/chtml?WA=20
> and it is basically the same for every search query, so I need to submit
> POST data to the web server. I'm trying to do it like this:
>
> opener = urllib2.build_opener()
> opener.addheaders = [('User-Agent', 'Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.4 (like Gecko)')]
> urllib2.install_opener(opener)
>
> data = urllib.urlencode({'F0': 'mySearchKeyword', 'B': 'T', 'F8': 'A || G', 'W': '1', 'Z': '0', 'HA': '10', 'SAS_static_0_treffer_treffer': 'Suche starten', 'S': '1', 'translationtemplate': 'checkstrasse'})
>
> url = 'http://www.bundes-telefonbuch.de/cgi-btbneu/chtml/chtml?WA=20'
> response = urllib2.urlopen(url, data)
>
> This returns a page saying I have to re-enter my search terms.
> What's going wrong here?
>
> Thanks!!

Try mechanize: http://wwwsearch.sourceforge.net/mechanize/

    import mechanize

    # Load the page that contains the search form and parse its forms
    response = mechanize.urlopen("http://www.bundes-telefonbuch.de/")
    forms = mechanize.ParseResponse(response, backwards_compat=False)
    form = forms[0]        # assumes the search form is the first form on the page
    form["F0"] = "query"   # enter your query here

    # Submit the form and save the result page
    html = mechanize.urlopen(form.click()).read()
    f = open("tmp.html", "w")
    f.write(html)
    f.close()

Alternatively, you can try to parse the response yourself, but I think their HTML is not valid.
From: Michael Torrie on 19 Jun 2010 11:02

On 06/19/2010 04:23 AM, davidgp wrote:
> opener = urllib2.build_opener()
> opener.addheaders = [('User-Agent', 'Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.4 (like Gecko)')]
> urllib2.install_opener(opener)
>
> data = urllib.urlencode({'F0': 'mySearchKeyword', 'B': 'T', 'F8': 'A || G', 'W': '1', 'Z': '0', 'HA': '10', 'SAS_static_0_treffer_treffer': 'Suche starten', 'S': '1', 'translationtemplate': 'checkstrasse'})
>
> url = 'http://www.bundes-telefonbuch.de/cgi-btbneu/chtml/chtml?WA=20'
> response = urllib2.urlopen(url, data)
>
> this returns a page saying i have to reenter my search terms..
> what's going wrong here?

Most likely you need a cookie. You'll probably have to set up a cookie store for use with urllib2, then request the page that the search form is on so that the cookie is generated, and then make your post with your search terms.
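
The cookie approach described above could be sketched roughly as follows. This is only an illustration, not something tested against the site: it is written for Python 3, where urllib2, urllib.urlencode and cookielib have become urllib.request, urllib.parse.urlencode and http.cookiejar. The helper names (build_opener_with_cookies, encode_form, search) are made up for this example, and whether the server really relies on a cookie is still a guess.

```python
# Python 3 sketch; in Python 2 the same pieces live in urllib2, urllib and cookielib.
import http.cookiejar
import urllib.parse
import urllib.request

USER_AGENT = "Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.4 (like Gecko)"


def build_opener_with_cookies():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", USER_AGENT)]
    return opener, jar


def encode_form(fields):
    """URL-encode a dict of form fields into the bytes body needed for a POST."""
    return urllib.parse.urlencode(fields).encode("ascii")


def search(opener, form_url, post_url, fields):
    """Fetch the form page first so any cookie gets set, then POST the query."""
    # First request: gives the server a chance to set its cookie in the jar.
    opener.open(form_url).close()
    # Second request: same opener, so the stored cookie is sent with the POST.
    with opener.open(post_url, encode_form(fields)) as response:
        return response.read()
```

To use it, one would call build_opener_with_cookies() once and then search(opener, form_url, post_url, fields), with post_url set to the chtml?WA=20 address and fields set to the dictionary from the original post. Using http://www.bundes-telefonbuch.de/ as form_url is an assumption borrowed from the mechanize example; the page that actually carries the search form may be a different one.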