From: Derek Cannon on 18 Apr 2010 23:15

> doc = Nokogiri::HTML(open(url))
> table = []
> doc.css("tr").each do |row|
>   cells = row.css("td").map {|cell| cell.text.strip }
>   next unless cells.size == 4
>   next unless cells[1] == "TBA"
>   cells.insert(2, "")
>   table << cells
> end

This is interesting. So nil is returned for elements that activate the
unless statement?

> # Assuming you're using Ruby 1.9
> course_info = []
> trs = doc.css('tr')
> trs.each.with_index{ |row,i|
>   tds = row.css('td')
>   title = ...
>   prof = ...
>   days = ...
>   times = ...
>   desc = ...
>   next_row = trs[i+1]
>   if next_row && next_row.is_a_continuation?
>     # Add content from next_row to description
>     # If needed, invalidate next_row so it will be skipped
>   elsif title && prof && days # If you have all the information you need
>     course_info << Course.new( title, prof, days )
>   end
> }

I like this code a lot. However, is it considered less efficient to look
at the next row to see if it is a lab for the previous row, rather than
checking each row to see if it's a lab and, if it is, adding it to the
last row instead? That way, every row won't need to check with the next
row.

--
Posted via http://www.ruby-forum.com/.
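For reference, the look-back variant described in the question could be sketched roughly as follows. This is only an illustration under stated assumptions: the Course struct, the sample rows, and the rule "a one-cell row continues the previous course" are all hypothetical stand-ins, not taken from the page being scraped. Either variant walks the rows once, so the choice is probably more about readability than speed.

```ruby
# Hypothetical stand-in for the Course class used in the quoted code.
Course = Struct.new(:title, :prof, :days, :extra)

# Hypothetical pre-parsed rows (stripped cell text). A one-cell row is
# ASSUMED to be a continuation (for example a lab) of the row before it.
rows = [
  ["Calculus I", "Smith", "MWF"],
  ["Lab meets Thursday"],
  ["Composition", "Jones", "TR"]
]

course_info = []
rows.each do |cells|
  if cells.size == 1 && course_info.last
    # Continuation row: attach it to the course added most recently.
    course_info.last.extra << cells[0]
  elsif cells.size == 3
    # Complete row: start a new course with no continuation lines yet.
    course_info << Course.new(cells[0], cells[1], cells[2], [])
  end
end

course_info.each { |c| puts "#{c.title}: #{c.extra.inspect}" }
```

Because each continuation is appended to course_info.last, no row ever needs to inspect or invalidate the row after it.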
From: Phrogz on 19 Apr 2010 22:10
On Apr 18, 9:15 pm, Derek Cannon <novelltermina...(a)gmail.com> wrote:
> > doc = Nokogiri::HTML(open(url))
> > table = []
> > doc.css("tr").each do |row|
> >   cells = row.css("td").map {|cell| cell.text.strip }
> >   next unless cells.size == 4
> >   next unless cells[1] == "TBA"
> >   cells.insert(2, "")
> >   table << cells
> > end
>
> This is interesting. So nil is returned for elements that activate the
> unless statement?

No. Calling next inside a block is like an early return from that block.
The Ruby code:

  do_this unless something

is the same as:

  unless something
    do_this
  end

is the same as:

  if !something
    do_this
  end

So what's happening in the above is that you're creating an array
(table), and then looping through the list of rows returned from
Nokogiri. For each row, if certain criteria are met (there aren't exactly
four cells, or the second cell isn't "TBA") you immediately stop
processing that row and move on to the next. ("Next!" shouts the
government worker, dismissing you and moving on.)

If you didn't bail out early, however, you inject an extra entry into the
array of strings and then shove that whole array as a new entry onto the
end of the table array.

Because you're using the pattern of creating an array and conditionally
populating it during a traversal, there are no embedded nils to clean up
later. For comparison, here's similar code that WOULD leave you with a
table with some nil entries (that you'd probably want to #compact
afterwards):

  table = doc.css("tr").map do |row|
    cells = row.css("td").map {|cell| cell.text.strip }
    if cells.size==4 && cells[1]=="TBA"
      cells.insert(2, "")
      cells
    end
  end

In the code immediately above, the value of an if statement that matches
is the value of the last expression in its body, while the value of an if
statement that does not match (and has no else) is nil.
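The two patterns can be compared without Nokogiri in a small self-contained sketch. The rows array below is made-up sample data standing in for the stripped cell text of each table row; it is an assumption for illustration, not the real page.

```ruby
# Hypothetical sample rows standing in for the stripped <td> text of each <tr>.
rows = [
  ["MATH 101", "TBA", "MWF", "Smith"],
  ["Lab"],
  ["ENGL 200", "10:00", "TR", "Jones"],
  ["HIST 150", "TBA", "MWF", "Lee"]
]

# each + next: a row that fails a test is skipped, so nothing is appended for it.
table = []
rows.each do |cells|
  next unless cells.size == 4
  next unless cells[1] == "TBA"
  table << cells.dup.insert(2, "")
end

# map + if: every row produces a value, and an if that does not match yields nil.
mapped = rows.map do |cells|
  cells.dup.insert(2, "") if cells.size == 4 && cells[1] == "TBA"
end

p table.size              # => 2
p mapped.count(nil)       # => 2
p mapped.compact == table # => true
```

The dup calls are there only so the sample data is not modified in place; the original code inserts directly into the freshly built cells array.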
Here's a simpler example showing the difference between map and
each-with-injecting:

  irb(main):001:0> digits = (1..9).to_a
  => [1, 2, 3, 4, 5, 6, 7, 8, 9]
  irb(main):002:0> odds = []
  => []
  irb(main):003:0> digits.each{ |n| if n%2==1 then odds<<n end }
  => [1, 2, 3, 4, 5, 6, 7, 8, 9]
  irb(main):004:0> odds
  => [1, 3, 5, 7, 9]
  irb(main):005:0> evens = digits.map{ |n| if n%2==0 then n end }
  => [nil, 2, nil, 4, nil, 6, nil, 8, nil]
  irb(main):006:0> evens.compact
  => [2, 4, 6, 8]
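As a possible alternative to map followed by compact, the same results can be had with Enumerable methods that never produce the nils in the first place. This is a sketch of one option, not the approach used in the thread; select and each_with_object are standard Ruby 1.9 methods.

```ruby
digits = (1..9).to_a

# select keeps only the elements for which the block is truthy, so no nils appear.
evens = digits.select { |n| n % 2 == 0 }

# When the kept values also need transforming, map over the selected elements.
even_squares = digits.select { |n| n.even? }.map { |n| n * n }

# each_with_object expresses "create an array and conditionally populate it"
# as a single expression that returns the accumulator.
odds = digits.each_with_object([]) { |n, acc| acc << n if n.odd? }

p evens         # => [2, 4, 6, 8]
p even_squares  # => [4, 16, 36, 64]
p odds          # => [1, 3, 5, 7, 9]
```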