From: Adam W. on 24 Sep 2009 17:18

I'm trying to scrape some historical data from NOAA's website, but I
can't seem to feed it the right form values to get the data out of
it. Here's the code:

import urllib
import urllib2

## The source page http://www.erh.noaa.gov/bgm/climate/bgm.shtml
url = 'http://www.erh.noaa.gov/bgm/climate/pick.php'
values = {'month' : 'July',
          'year' : '1988'}

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = { 'User-Agent' : user_agent }

data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()
print the_page

From: Jon Clements on 24 Sep 2009 17:46

On 24 Sep, 22:18, "Adam W." <awasile...(a)gmail.com> wrote:
> I'm trying to scrape some historical data from NOAA's website, but I
> can't seem to feed it the right form values to get the data out of
> it. Here's the code:
>
> import urllib
> import urllib2
>
> ## The source page http://www.erh.noaa.gov/bgm/climate/bgm.shtml
> url = 'http://www.erh.noaa.gov/bgm/climate/pick.php'
> values = {'month' : 'July',
>           'year' : '1988'}
>
> user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
> headers = { 'User-Agent' : user_agent }
>
> data = urllib.urlencode(values)
> req = urllib2.Request(url, data, headers)
> response = urllib2.urlopen(req)
> the_page = response.read()
> print the_page

Hint:

<select name="month">
<option value="/jan">January</option>
<option value="/feb">February</option>
<option value="/mar">March</option>
<option value="/apr">April</option>
<option value="/may">May</option>
<option value="/jun">June</option>
<option value="/jul">July</option>
<option value="/aug">August</option>
<option value="/sep">September</option>
<option value="/oct">October</option>
<option value="/nov">November</option>
<option value="/dec">December</option>
</select>

Jon.
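
To spell the hint out: a browser submits the selected option's value attribute, not its visible label, so the 'month' field should carry '/jul' rather than 'July'. Below is a minimal sketch of that change, written for Python 3 (where urlencode lives in urllib.parse and urllib2 became urllib.request). The names MONTH_VALUES, build_form_data and fetch_page are made up for illustration. It has not been checked against the live site, and it assumes the 'year' field accepts the plain year string as in the original post.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

URL = 'http://www.erh.noaa.gov/bgm/climate/pick.php'

# Option labels mapped to the values copied from the
# <select name="month"> element quoted in the hint.
MONTH_VALUES = {
    'January': '/jan', 'February': '/feb', 'March': '/mar',
    'April': '/apr', 'May': '/may', 'June': '/jun',
    'July': '/jul', 'August': '/aug', 'September': '/sep',
    'October': '/oct', 'November': '/nov', 'December': '/dec',
}


def build_form_data(month_name, year):
    """Encode the POST body using the option value, not the label."""
    values = {'month': MONTH_VALUES[month_name], 'year': str(year)}
    # Request() in Python 3 expects the body as bytes.
    return urlencode(values).encode('ascii')


def fetch_page(month_name, year):
    """POST the form and return the raw response body (hypothetical helper)."""
    headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
    req = Request(URL, build_form_data(month_name, year), headers)
    with urlopen(req) as response:
        return response.read()
```

Calling build_form_data('July', 1988) gives the body month=%2Fjul&year=1988, since urlencode percent-encodes the slash. fetch_page('July', 1988) would then send it, assuming the page still exists and behaves as it did in the thread.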